For those of you who may not have seen the news, I’m pleased to announce that last week I successfully defended my dissertation proposal, titled “Principle Violations: Revisiting the Dublin Core 1:1 Principle.” Here’s a quick elevator rundown:
Libraries, archives and museums have adopted Dublin Core for the exchange of metadata using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Several studies suggest that these records routinely “violate” something known as the 1:1 Principle. However, each study has developed its own “high level assumptions” about what counts as a violation and how to identify violating records (e.g. Hutt & Riley, 2005).
Interviews with metadata creators (Park & Childress, 2009) find that they are frequently confused about exactly what the 1:1 Principle is and how they should apply it in their local environments. While you may think you know what it’s about, I’ve found that there are at least two senses of the Principle that seem to get confused:
1) a pretty straightforward relationship between a resource and a description. This is essentially what the DCMI Abstract Model says: each description describes one, and only one, resource.
2) a more complex set of relationships between metadata records about particular kinds of resources, e.g. things like “originals” and things like “reproductions.”
For those of us setting out to create new metadata, I think the Abstract Model is spot on. Unfortunately, it may not help us detect violations in OAI-PMH metadata records, which lack the semantic infrastructure to indicate whether a record contains one description or several, let alone whether those descriptions are about more than one resource. The dissertation will provide a conceptual definition that harmonizes these two senses of the Principle and will formalize rules and techniques for identifying obvious cases in which OAI-PMH records describe more than one resource. I’ll be using the IMLS DCC Opening History aggregation as a source of OAI-PMH records.
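To make the detection problem a little more concrete, here is a rough Python sketch of the kind of surface heuristic one could run over a simple Dublin Core (oai_dc) record. It is only an illustration, not the rules the dissertation will formalize: the keyword lists, the function names, the signal labels and the sample record are all invented for this example, and a flagged record is at most a candidate for closer inspection.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core elements used in oai_dc records.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical keyword lists, chosen only for this illustration.
DIGITAL_HINTS = ("image/", "jpeg", "tiff", "digital")
PHYSICAL_HINTS = ("oil on", "canvas", "gelatin silver", "paper")


def element_values(record, name):
    """Return the non-empty text values of every dc:<name> element."""
    return [
        el.text.strip()
        for el in record.iter(DC + name)
        if el.text and el.text.strip()
    ]


def possible_violation_signals(record_xml):
    """Return labels for surface signals that a record may describe
    more than one resource (e.g. an original and its reproduction)."""
    record = ET.fromstring(record_xml)
    signals = []

    # Signal 1: format values that hint at both a physical and a digital object.
    formats = [value.lower() for value in element_values(record, "format")]
    has_digital = any(hint in f for f in formats for hint in DIGITAL_HINTS)
    has_physical = any(hint in f for f in formats for hint in PHYSICAL_HINTS)
    if has_digital and has_physical:
        signals.append("mixed-format")

    # Signal 2: more than one distinct date value in the same record.
    if len(set(element_values(record, "date"))) > 1:
        signals.append("multiple-dates")

    return signals


# Invented sample record: one description that mixes a painting and its scan.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Portrait of a woman (invented example)</dc:title>
  <dc:format>oil on panel</dc:format>
  <dc:format>image/jpeg</dc:format>
  <dc:date>1506</dc:date>
  <dc:date>2008</dc:date>
</oai_dc:dc>"""

if __name__ == "__main__":
    print(possible_violation_signals(SAMPLE))
```

Keyword matching like this can only catch the obvious cases. A record with a single format and a single date may still blur an original and its reproduction in its title or description, which is exactly why a clearer conceptual definition has to come first.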
The more interesting portion of the conceptual analysis will explore the latter set of relationships, in part because “the problem of defining reproductions in relationship to originals has proven elusive through all the cataloging codes of the twentieth century” (Knowlton, 2009). I think the Using Dublin Core account of the 1:1 Principle, in using the Mona Lisa as an example, conflates some deeper issues. Few of us in the cultural heritage sector believe that a jpeg reproduces a unique work of art; instead the digital image stands in some other kind of relationship with the artwork it represents (for example, a derivative or descriptive relationship). These “other” categories of relationships deserve some additional attention as we adopt models of the bibliographic universe, like FRBR, for our metadata. The 1:1 Principle, then, becomes a gateway to explore some difficult questions. I expect this work will not merely provide commentary on the known confusions about the principle or add to the fingers pointing at bad metadata. Rather, I expect it will help us better understand this rich set of relationships and the challenges faced by knowledge organization systems trying to represent them.
I will be attending the last day of DC 2010 and will also be presenting my work at the ASIS&T poster session and at MCN 2010, if you are interested in talking more about it.