Strike Up the Score: Deriving Searchable and Playable Digital Formats from Sheet Music

Choudhury, GS, DiLauro, T, Droettboom, M, Fujinaga, I, and MacMillan, K (2001).

D-Lib Magazine 7(2).

The Lester S. Levy Collection of Sheet Music is one of the largest collections of sheet music available online. The Collection, part of the Special Collections of the Milton S. Eisenhower Library (MSEL) at Johns Hopkins University, comprises nearly 30,000 pieces of music, corresponding to nearly 130,000 sheets of music and associated cover art. In the early 1990s, the MSEL considered how to preserve the Collection while maintaining access to it. Accordingly, the MSEL evaluated two ways of enhancing access while reducing handling of the physical collection: microfilming and digitization. With funding from the National Endowment for the Humanities (NEH) in 1994, the MSEL began digitizing the Levy Collection. While there is now a reasonable amount of experience with digitization of library collections, this was not the case in 1994. Not only is the Levy Collection a relatively large online collection, it is also one of the first major digitization efforts by an academic research library.

The mission of the second phase of the Levy project ("Levy II") can be summarized as follows: reduce costs for large collection ingestion by creating a suite of open-source processes, tools, and interfaces for workflow management; increase access capabilities by providing a suite of research tools; and demonstrate the utility of these tools and processes with a subset of the online Levy Collection. The cornerstones of the workflow management system are optical music recognition (OMR) software, which generates a logical representation of the score for sound generation, musical searching, and musicological research, and an automated name authority control system, which disambiguates names (e.g., by recognizing that the authors Mark Twain and Samuel Clemens are the same individual).

The research tools focus on enhanced searching capabilities through the development and application of a fast, disk-based search engine for lyrics and music, and the incorporation of an XML structure for metadata. Though this paper focuses on the OMR component of our work, a companion paper to be published in a future issue of D-Lib will describe more fully the other tools (e.g., the automated name authority control system and the disk-based search engine), the overall workflow management system, and the project management process.
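
To make the idea of name authority control concrete, here is a minimal Python sketch of how variant name forms might be resolved to a single authorized heading, using the Mark Twain and Samuel Clemens example mentioned above. It is an illustration only and not the Levy II implementation: the authority table, the heading format, and the functions normalize, build_index, and resolve are all hypothetical, and a real system would need far more than exact matching on normalized strings.

```python
import unicodedata

# Hypothetical authority file: authorized heading -> known variant forms.
# The entries are illustrative and are not taken from the Levy Collection.
AUTHORITY = {
    "Twain, Mark, 1835-1910": [
        "Mark Twain",
        "Samuel Clemens",
        "Samuel L. Clemens",
        "Samuel Langhorne Clemens",
    ],
}


def normalize(name):
    """Reduce a name to a case-, punctuation-, and order-insensitive key."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Keep letters, digits, and spaces; this drops punctuation and accents.
    kept = "".join(c for c in decomposed if c.isalnum() or c.isspace())
    # Sort the tokens so "Clemens, Samuel" and "Samuel Clemens" agree.
    return " ".join(sorted(kept.lower().split()))


def build_index(authority):
    """Map every normalized variant form to its authorized heading."""
    index = {}
    for heading, variants in authority.items():
        for variant in variants:
            index[normalize(variant)] = heading
    return index


INDEX = build_index(AUTHORITY)


def resolve(name):
    """Return the authorized heading for a name, or None if it is unknown."""
    return INDEX.get(normalize(name))
```

With this table, resolve("Mark Twain") and resolve("Clemens, Samuel") return the same heading, so a search on either form could retrieve the same set of records. Names with no match return None and would presumably be left for manual review.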

