Inherent Vice

Exploring the world of digital cultural heritage.

Tag Archives: OAI-PMH

07/10/11

OAIpen Sesame!

I’ve been working to transform some OAI Dublin Core records I have into RDF and load them into a local Sesame repository. This has been much easier thanks to Jeni Tennison’s Getting Started with RDF and SPARQL Using Sesame and Python.

I acquired the records using another helpful tool, the SIMILE OAI-RDFizer. You can modify the XSLT transformers built into the tool pretty easily to produce whatever kind of output you’d like, but the default transformers take the OAI_DC XML and output RDF/XML.

One of the challenges of working with OAI (well, any linked data, I guess) is that you don’t always know how much is there when you start. To handle the open-ended nature of OAI-PMH, oai2rdf stores all the ListRecords responses it gets back in a hashed directory structure. Jeni’s Python example assumed that you’d have a single file of RDF, not hundreds (or thousands!). I added an os.walk loop to iterate through all the oai2rdf directories looking for the RDF files.
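As a rough illustration of that os.walk loop, here is a minimal sketch rather than the code from the post. The function name find_rdf_files and the assumption that the harvested files end in .rdf are mine; adjust the suffix to whatever your oai2rdf run actually wrote.

```python
import os


def find_rdf_files(root, suffix=".rdf"):
    """Yield the path of every file under `root` whose name ends with `suffix`.

    Assumption: the harvested files use a .rdf extension. The nested
    directory names do not matter here; os.walk visits all of them.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            if name.lower().endswith(suffix):
                yield os.path.join(dirpath, name)
```

Calling list(find_rdf_files("path/to/oai2rdf/output")) then gives one flat list of files to load, however deep the directory tree turned out to be.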

Because Jeni’s example expected to load just a single file, it used the Sesame PUT method. After I got the os.walk loop working, I realized that each PUT replaces whatever was previously loaded, so only the last file would end up in the store. I changed this to use the POST method so each file is appended to the store.
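The PUT versus POST difference comes down to one argument on the request to the repository’s statements endpoint. The sketch below uses Python 3’s urllib.request rather than whatever HTTP library the original script used, and the base URL, the repository ID "oai", and the helper name build_load_request are assumptions for illustration.

```python
import urllib.request

# Assumed location of a local Sesame 2 server; change it to match your install.
SESAME_URL = "http://localhost:8080/openrdf-sesame"


def build_load_request(rdf_bytes, repository, append=True, base_url=SESAME_URL):
    """Build (but do not send) a request that loads RDF/XML into a repository.

    POST adds the statements to what is already stored; PUT replaces the
    repository's existing statements with the ones in this request.
    """
    url = "%s/repositories/%s/statements" % (base_url, repository)
    return urllib.request.Request(
        url,
        data=rdf_bytes,
        headers={"Content-Type": "application/rdf+xml;charset=UTF-8"},
        method="POST" if append else "PUT",
    )


# To actually send it (needs a running Sesame server and an existing repository):
# urllib.request.urlopen(build_load_request(data, "oai"))
```

Keeping the request construction separate from the sending makes it easy to check which method will be used before pointing the script at a real store.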

To keep Sesame from crashing, I did need to add a short delay at the end of the loop. I’m still playing around with a localhost version of Sesame and haven’t quite figured out how much I can throw at it on my laptop before it blows up. The localhost version has been fine for testing things out, but I’m thinking I’ll move this to an AWS instance once I iron out where this is going.
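A throttled loading loop along these lines might look like the following sketch. The one-second default is an arbitrary placeholder, not a tuned value, and load_with_delay and its send callback are hypothetical names; send stands in for whatever function posts a file’s bytes to Sesame.

```python
import time


def load_with_delay(paths, send, delay_seconds=1.0, sleep=time.sleep):
    """Send each file's contents via `send`, pausing after every file.

    `send` is any callable that takes the file's bytes (for example, one
    that POSTs them to the store). `sleep` is injectable so the loop can
    be exercised without actually waiting. Returns the number of files sent.
    """
    loaded = 0
    for path in paths:
        with open(path, "rb") as handle:
            send(handle.read())
        loaded += 1
        sleep(delay_seconds)  # give the server a moment before the next file
    return loaded
```

If the server still struggles, raising delay_seconds is the simplest knob to turn before trying anything more elaborate.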

Full code after the jump.

Posted in Linked Open Data, metadata | Tagged OAI-PMH, Python, rdf, Sesame

