Building Digital Libraries


Course description

This course is an introduction to digital libraries. We look at commonly used concepts of digital libraries. We emphasize the concept of a repository as a core component of a digital library. Finally, we look at digital library services through the use of examples.

The course is almost completely non-technical. Students will be introduced to technological concepts and will learn what they are, but they will not be expected to use them directly. Instead, all interaction with the information system will be mediated through web interfaces. If a non-web interaction is required, the instructor will perform it for the students.

Course objectives

The course contents will allow students to develop collection and local service strategies based on the use of off-the-shelf technologies. It will also enable them to more competently interact with staff maintaining such technologies.

After taking this course students:


Students should also be able to use a web browser. Everything that goes beyond this is explained in class or by personal tuition from the instructor.

A limited amount of work is conducted on computers. If students want to use lab computers, they are required to have basic skills on Microsoft machines. These include, for example, clicking on an icon to run a program, cutting and pasting between applications, and copying files from one location to another. Students may also bring their own laptops to class and use the wireless network.

Students are provided with free web space where they can design their own digital library. This web space continues to be available after the course ends. In order to operate that web space students only need a computer with an Internet connection.


Class structure

Classes are held in PC1 at Bobst Library on Tuesdays between 18:20 and 20:10. Between classes, support via the class mailing list will be available.

Each class has a presentation by the instructor. Students are expected to do some of the work on their digital libraries at home.

Class details

Here are some class details:

2011–09–13 18:30 to 20:20 introduction: from physical to digital libraries
2011–09–20 18:30 to 20:20 origins of digital libraries and origins of repositories
2011–09–27 18:30 to 20:20 repository planning
2011–10–04 18:30 to 20:30 getting to work with Omeka: installation and system overview
2011–10–11 18:30 to 20:30 Omeka themes and example plugin
2011–10–18 18:30 to 20:30 Omeka metadata and tables
2011–10–25 18:30 to 20:30 presentation of project plan
2011–11–01 18:30 to 20:30 digital imaging
2011–11–08 18:30 to 20:30 presentations on plugins
2011–11–15 18:30 to 20:30 copyright basics I
2011–11–22 18:30 to 20:30 copyright basics II
2011–11–29 18:30 to 20:30 software solutions for repositories and repository interoperability
2011–12–06 18:30 to 20:30 cross-repository services

To print the slides in Microsoft PowerPoint, press control-p to print, then under "Print what" choose "Handouts", and under "Color/grayscale" choose "Pure Black and White". You can also use OpenOffice to print the slides. The slides posted here are drafts until the commencement of the class.


If students want to consult a textbook, Reese and Banerjee (2008) is a reasonable book. Consult the instructor about access options before buying it.

A simple, dated, but well-written book is Arms (2005).

Web sites used

The National Information Standards Organization has a Framework of Guidance for Building Good Digital Collections.

The Library of Congress maintains the Metadata Encoding & Transmission Standard: Official Site and the Metadata Object Description Schema: Official Site.

Cornell University brings us the Digital Imaging Tutorial.

Omeka is a project of the Roy Rosenzweig Center for History and New Media at George Mason University. Look at the Omeka Showcase. Simple examples of what is doable for a class site are Bridget Harmon's site about the Middle Village Lutheran Cemetery and Julianna Monjeau's site about The Kosher Meat Boycott of 1902. There is an official list of sites using Omeka.

DuraSpace maintain DSpace.

Long Island Memories provide sample simple digital collections we intend to reproduce.

For images, there are useful web sites: Aware Systems provides information about TIFF, and Greg Roelofs provides information about PNG.

Background readings

Digital libraries are built by building them, not by reading about them.

On early history, Bush (1945) is a standard reference, but it is overrated. Licklider (1965) is a much better work in the same direction. You may also enjoy Licklider (1960), Licklider (1968), and Hauben (2005). On more modern approaches to such early systems, see Schatz (1997). On repository history, you may consult the historic papers of Van de Sompel et al. (2000) and Davis and Lagoze (2000).

Harter (1996) is an early paper on digital libraries. Borgman (1999) has more about different digital library concepts. Lynch (2005) demonstrates good thinking about directions in digital libraries; it is worth discussing whether they have become reality.

On copyright, the main reference is Hirtle et al. (2009). We have other useful material in Library of Congress Copyright Office (2010), Scott (1999) and Templeton (2008).

On XML, see the lecture notes of Thomas Krichel's lis650 course, as well as the wonderfully short piece of Taylor (2004a). For a more theoretical perspective, the musings of Janish (2002) and Pitti (2002) are available.

For metadata, Taylor (2004b) gets it right again. Foulonneau and Reiley (2008) provide a general introduction. On METS we have a good background paper by McDonough (2006).

On the topic of Omeka, we mainly use its web site, but the favorable review by Kuzma, Reiss and Sidman (2010) may provide inspiration.

When we turn to digital repositories, we will mainly use Lynch (2003) and Brown and Griffiths (2007) for background, and DuraSpace (2011) for how to get things done.

To see how complicated persistence and digital preservation can be, take a peek at Consultative Committee for Space Data Systems (2002).

On identifiers, see Kunze (2008), Lynch (2001), Berners-Lee (1998), and Tonkin (2008).

Last, but by no means least, there is a bunch of home-grown resources.

Mailing list

There is a mailing list for the course. All students are encouraged to subscribe. As a rule, answers to email sent to the instructor are copied to the list. There are exceptions to this rule:


Students have a quiz every class bar the first one, and bar the two sessions where the presentations are due.

The first presentation is on the project plan on 2011–10–11. The second introduces a plugin of the student's choice, different for each student, on 2011–11–01. Students upload their presentation materials to their web site (as opposed to their Omeka sites). For this, a roster is available where each student's plugin is recorded.

The four worst performances will be discounted. The average of all quizzes accounts for 10/30 of the final grade. The quizzes relate to all Palmer School objectives relevant for the course.

Students prepare a small image-based digital library of 10 to 20 images. This has to be finished by week 10. It will account for 10/30.

Finally, there is the assessment of an institutional repository. Each student will be given one institutional repository URL and a set of metadata records from that repository. The student assesses the contents and the description of the contents. The paper can be quite descriptive. It should not exceed three printed pages or 1000 words. It counts for 8/30. It has to be handed in at the second before last class meeting.


