Building Digital Libraries


If you are reading a printed copy of this page, you are reading an incomplete version. Please print the version for US letter paper or the version for A4 paper.

Course description

This course introduces students to the still-evolving concepts of digital libraries. The core subject matter is repository building. Each student builds a public-access repository on the web, which will store born-digital or digitized materials. The course covers related technical material and organizational issues. Potential topics include digitization, the representation of text, file formats for media such as images and sound, identification strategies and technologies, metadata, markup languages, databases, content delivery technologies, and repository implementation software.

The course is mostly non-technical. Students will be introduced to technological concepts and will learn what they are. Most of the interaction with the systems will be through web interfaces. A small amount of work requires use of the ssh protocol, which will be covered separately.

Course objectives

After taking this course students:

The Palmer School objective covered by this course is 2.E. "Students will ... build information systems and/or records used in such systems". Students in this course will reach the developing level with respect to this objective.


Students should also be able to use a web browser. Everything that goes beyond this is explained in class or by personal tuition from the instructor.

A limited amount of work is conducted on computers. If students want to use lab computers, they are required to have basic skills on computers running a Microsoft operating system, because this is the type of machine used in the lab. These skills include, for example, clicking on an icon to run a program, cutting and pasting between applications, and copying files from one location to another.

Students are provided with free http/ssh accessible space where they can design their own digital library. This space continues to be available after the course ends. In order to operate that web space students only need a computer with an Internet connection and http/ssh client software.


Thomas Krichel
Palmer School of Library and Information Science
C.W. Post Campus of Long Island University
720 Northern Boulevard
Brookville, NY 11548–1300
work phone: +1–(516)299–2843
skype: thomaskrichel

Class structure

Classes are held in PC1 at Bobst Library on Tuesdays between 13:00 and 18:00. Between classes, support via the class mailing list will be available.

Each class has a presentation by the instructor. Students are expected to do some of the work on their digital libraries at home.

Class details

Here are some class details:

2012–01–22 12:30 to 17:30 introduction to digital libraries, the ssh protocol
2012–01–29 13:00 to 18:00 early digital library history, omeka installation and internals
2012–02–05 13:00 to 18:00 representation of images, the CSV plugin and project presentations
2012–02–12 no class
2012–02–19 no class
2012–02–26 13:00 to 18:00 representation of text, copyright I, and plugin presentations I
2012–03–04 13:00 to 18:00 copyright II, plugin presentations II
2012–03–11 no class
2012–03–18 no class
2012–03–25 13:00 to 18:00 repository interoperability

To print the slides in Microsoft PowerPoint, press control-p to print, then under "Print what" choose "Handouts", and under "Color/grayscale" choose "Pure Black and White". You can also use OpenOffice to print the slides. The slides posted here are drafts until the commencement of the class.


If students want to consult a textbook, there are two, but neither covers the topic in sufficient technical depth. A simple, dated, but well-written book is Arms (2005). The content is simple but not completely trivial.

Reese and Banerjee (2008) gives an overview of the class topic. Unfortunately the content is mostly trivial and the book is expensive. Students are advised to contact the instructor about access options before buying the book.

Web sites used

Omeka is a project of the Roy Rosenzweig Center for History and New Media at George Mason University. There is an official list of sites using Omeka. Students should also consult the previous edition of this course, as available on the edition listing page, to get an idea of minimal standards. The Omeka Showcase may also provide inspiration for wider usage.

The National Information Standards Organization has a Framework of Guidance for Building Good Digital Collections. Cornell University provides the Digital Imaging Tutorial.

For images, there are two useful web sites: Aware Systems provides information about TIFF, and Greg Roelofs provides information about PNG.

Background readings

Digital libraries are built by building them, not by reading about them. Nevertheless, we have to note some background readings in the syllabus.

On early history, Bush (1945) is a standard reference but it is overrated. Licklider (1965) is a much better work in the same direction. You may also enjoy Licklider (1960), Licklider (1968), and Hauben (2005). On more modern approaches to such early systems, see Schatz (1997). On repository history, you may consult the historic papers of Van de Sompel et al. (2000) and Davis and Lagoze (2000).

Harter (1996) is an early paper on digital libraries. Borgman (1999) has more about different digital library concepts. Lynch (2005) demonstrates good thinking about directions in digital libraries; it is worth discussing whether they have become reality.

On copyright, the main reference is Hirtle et al. (2009). We have other useful material in Library of Congress Copyright Office (2010), Scott (1999) and Templeton (2008).

On XML, see the lecture notes of Thomas Krichel's lis650 course, as well as the wonderfully short piece of Taylor (2004a). For a more theoretical perspective, the musings of Janish (2002) and Pitti (2002) are available.

For metadata, Taylor (2004b) gets it right again. Foulonneau and Reiley (2008) provide a general introduction. On METS we have a good background paper by McDonough (2006).

On the topic of Omeka, we mainly use its web site, but a favorable review by Kuzma, Reiss and Sidman (2010) may provide inspiration.

When we turn to digital repositories, we will mainly use Lynch (2003) and Brown and Griffiths (2007) for background, and DuraSpace (2011) for how to get things done.

To see how complicated persistence and digital preservation can be, take a peek at Consultative Committee for Space Data Systems (2002).

On identifiers, see Kunze (2008), Lynch (2001), Berners-Lee (1998), and Tonkin (2008).

Last, but by no means least, there is a bunch of home-grown resources.

Mailing list

There is a mailing list for the course. All students are encouraged to subscribe. As a rule, answers to email sent to the instructor are copied to the list. There are exceptions to this rule:


Students have a quiz every class bar the first one. The worst quiz performances will be discounted. The average of all quizzes accounts for 5/13 of the final grade.

Students will give two 10 to 15 minute presentations. Students upload their presentation materials to their web site (as opposed to their Omeka sites). The first presentation is on the project plan, on 2012–02–05. This presentation will count for 1/13 of the final grade. The second presentation introduces an Omeka plugin of the student's choice, but different for each student, on 2012–02–26. For this a roster is available, where the plugin per student is recorded on a first-come-first-served basis. Supporting material for this presentation should be uploaded as user_id_project.ext, where user_id is the user id of the student on wotan and ext is a conventional file extension. This presentation will count for 1/13 of the final grade.

For the remaining 6/13 of the final grade, students prepare a small image-based digital library of 20 to 100 images. The material must be of some cultural value. A purely personal collection is not admissible, even though it was in earlier editions of the course. The site has to be finished by 2012–03–04.
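To make the weighting concrete, here is a minimal sketch in Python of how the four components could be combined. The weights (5/13, 1/13, 1/13, 6/13) are the ones stated above. Everything else is an assumption for illustration only: the function names, the 0 to 100 scale for each component, and the `dropped` parameter, since this syllabus does not say how many of the worst quizzes are discounted.

```python
from fractions import Fraction

# Weights as stated in the syllabus, in thirteenths of the final grade.
WEIGHTS = {
    "quizzes": Fraction(5, 13),
    "plan_presentation": Fraction(1, 13),
    "plugin_presentation": Fraction(1, 13),
    "digital_library": Fraction(6, 13),
}


def quiz_average(scores, dropped=1):
    """Average the quiz scores after discarding the `dropped` lowest ones.

    ASSUMPTION: `dropped` is hypothetical. The syllabus only says that the
    worst quiz performances are discounted, not how many.
    """
    kept = sorted(scores)[dropped:]
    if not kept:
        raise ValueError("no quiz scores left after dropping the worst ones")
    return sum(kept) / len(kept)


def final_grade(quiz_scores, plan, plugin, library, dropped=1):
    """Weighted sum of the four components (ASSUMED to be on a 0-100 scale)."""
    parts = {
        "quizzes": quiz_average(quiz_scores, dropped),
        "plan_presentation": plan,
        "plugin_presentation": plugin,
        "digital_library": library,
    }
    return float(sum(WEIGHTS[name] * parts[name] for name in WEIGHTS))
```

For example, `final_grade([50, 80, 90], plan=100, plugin=100, library=100)` drops the quiz score of 50, averages the remaining two, and applies the weights. The weights sum to exactly 13/13, which is why `Fraction` is used here rather than rounded decimals.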

Wendy Ball-Attipoe web omeka plan plugin
Mikhail Livshits web omeka plan plugin
Melissa Oblitas web omeka plan plugin
Davide NM Roman web omeka plan plugin
Anne Hadley Young web omeka plan
Victoria Pilato web omeka plan plugin
Tania Silva web omeka plan plugin
Manija Mayel web omeka plan plugin
Christina Bell web omeka plan plugin
Keriann Barry web omeka plan
Kathleen Comeau web omeka plan plugin
