Since the #lodlam conference, I haven’t had much chance to play around with my shipyard LOD — the dissertation calls. Plus I’m spending about half my time this summer as part of the team working on the CLIR/DLF/IMLS DCC Beta Sprint for the Digital Public Library of America (DPLA).
What follows is a bit of skunkworkery that I’m doing for self-edification and to help suggest ways we can make IMLS DCC data more LOD-friendly.** Currently people can browse the site at http://imlsdcc.grainger.illinois.edu/history, or harvest the collection-level and item-level metadata as XML via OAI-PMH. As part of the Collection/Item Metadata Working Group (CIMR), I helped build an RDF testbed that was oriented towards our research problems.
Using some of the stylesheets developed for CIMR, I’ve generated LOD representations for the currently available collection-level records. When the rubber hits the road like this, there are lots of design choices to make: which encodings, which vocabularies, and so on. Here is a sample set of records and the XSLT used to generate them from OAI-PMH ListRecords responses in the imlsdcc metadata format. Some questions:
- This looks rather complicated. Maybe that’s OK, as it seems to represent much of the information currently shared publicly by the project. I’d welcome any suggestions for simplifications or better approaches to representing this as LOD.
- Are there best practices for representing organizations as organizations? FOAF/vCard seem very oriented towards people (who have associations with an organization). I also picked up the Organization Ontology from “Describing Libraries, Their Collections and Services in RDF.”
- Many of the URIs here are just made up for demonstration purposes.
- There are lots of organizations we have minimal information for. It would be nice to reconcile our URIs with other published URIs for these institutions. What would be the most authoritative source for that LOD?
- Many organizations aren’t publishing their own “authorized” graphs for themselves. Is this something a project like IMLS DCC should consider? I added a stub description of IMLS DCC to this file to demonstrate the relationships between the project and the aggregated collections.
- Right now this RDF mostly contains the strings found in the original XML. I would like to reconcile controlled terms where possible to existing LOD vocabularies (like id.loc.gov, language terms, formats, etc.). I think that would make this data more “linked.”
- In theory the XSLT above should still work with the SIMILE OAI-PMH RDFizer.
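
To make the reconciliation question a bit more concrete, here is a minimal Python sketch of the general idea: take one OAI-PMH record, emit N-Triples, and swap a string for a URI when a controlled term can be matched. This is not the XSLT mentioned above. The sample record uses plain Dublin Core rather than the imlsdcc schema, and the names `SAMPLE_RECORD`, `LANGUAGE_URIS`, `record_to_ntriples`, and the `http://example.org/collection/` base URI are all made up for illustration. The id.loc.gov language URIs follow the pattern I believe that service uses, but treat them as an assumption to check.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

OAI = "http://www.openarchives.org/OAI/2.0/"
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

# Illustrative lookup table (an assumption); a real one would be built
# from the published vocabulary rather than typed in by hand.
LANGUAGE_URIS = {
    "english": "http://id.loc.gov/vocabulary/iso639-2/eng",
    "french": "http://id.loc.gov/vocabulary/iso639-2/fre",
}

# Hypothetical record, simplified to oai_dc for the sake of the example.
SAMPLE_RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <header><identifier>oai:example.org:collection/1</identifier></header>
  <metadata>
    <oai_dc:dc>
      <dc:title>Example Shipyard Photographs</dc:title>
      <dc:language>English</dc:language>
      <dc:language>Latin</dc:language>
    </oai_dc:dc>
  </metadata>
</record>"""


def literal(text):
    """Quote a string as an N-Triples literal (basic escapes only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def record_to_ntriples(record_xml, base_uri="http://example.org/collection/"):
    """Turn one OAI-PMH record into a list of N-Triples lines."""
    root = ET.fromstring(record_xml)
    oai_id = root.findtext(f"{{{OAI}}}header/{{{OAI}}}identifier")
    # Made-up subject URI: the base plus the percent-encoded OAI identifier.
    subject = f"<{base_uri}{quote(oai_id, safe='')}>"
    triples = []
    for elem in root.iter():
        value = (elem.text or "").strip()
        if not value:
            continue
        if elem.tag == f"{{{DC}}}title":
            triples.append(f"{subject} <{DCTERMS}title> {literal(value)} .")
        elif elem.tag == f"{{{DC}}}language":
            uri = LANGUAGE_URIS.get(value.lower())
            if uri:
                # Reconciled: link to the controlled term.
                triples.append(f"{subject} <{DCTERMS}language> <{uri}> .")
            else:
                # Not reconciled: keep the original string.
                triples.append(f"{subject} <{DC}language> {literal(value)} .")
    return triples


if __name__ == "__main__":
    print("\n".join(record_to_ntriples(SAMPLE_RECORD)))
```

Strings that can’t be matched stay as literals, so nothing from the source XML is lost while the reconciliation table grows.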
Thanks if you have a chance to take a look and offer comments on this. And do let me know if you’d like to see more of this kind of data!
** Disclaimer: this is some work I’m doing on the side, on my own. Neither the RDF nor the XSLT should be considered an “official” release by the project. Any mistakes here are mine.