05/2/11

American Clyde


The concentration of shipyards that stretched along the Delaware River (from Wilmington north through Philadelphia, PA, and Camden and Trenton, NJ) earned it the nickname “American Clyde,” after the River Clyde in Scotland. (Read a contemporary account, “American Clyde,” from Harper’s, 1878.)

Early shipbuilding was the work of a few well-known shipwrights, but as the industry became more capital-intensive it gave rise to companies and corporations that could organize finances, labor forces and materials for larger efforts.

I’m already noticing that my RDF creation will be an iterative process: I can’t associate a person with a company until I’ve figured out what that company’s URI will be. Here’s where working with a system would come in handy, though I’m not sure how such systems solve the chicken-and-egg problem of referring to an entity before its URI is minted. Having a clear convention for URIs may be one way around that problem. At the moment I’m just minting hash URIs based on names. Personal (firstname_lastname) URIs are a little easier to predict, so I can link corporate records to people records that don’t exist yet.
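To make the convention concrete, here is a minimal sketch of minting name-based hash URIs. The base URL, the document names and the `mint_hash_uri` helper are all assumptions made up for illustration, not the URIs I actually use. Because the URI is derived only from the name, it can be computed before the record it points to has been written.

```python
import re
import unicodedata

# Hypothetical base URL; substitute wherever the RDF files actually live.
BASE = "http://example.org/shipbuilders"


def slugify(name: str) -> str:
    """Turn a display name into a lowercase, underscore-separated fragment."""
    # Drop accents and any other non-ASCII characters.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Spell out ampersands so "A & B" and "A and B" mint the same fragment.
    ascii_name = ascii_name.replace("&", " and ")
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^A-Za-z0-9]+", "_", ascii_name).strip("_").lower()


def mint_hash_uri(name: str, document: str) -> str:
    """Build a hash URI of the form BASE/document#fragment."""
    return f"{BASE}/{document}#{slugify(name)}"


# A person URI can be referenced from a company record right away,
# even though the people file has no entry for that person yet.
print(mint_hash_uri("Jane Doe", "people"))
print(mint_hash_uri("Pusey & Jones Company", "companies"))
```

The catch is that two different people with the same name would collide, which is another argument for moving to opaque URIs eventually.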

URIs based on names obviously create a problem for companies that merge, divide, incorporate, etc. This problem seems to lend credence to creating opaque URIs at some point. For now, I’ll stick with the convention of names in my URIs, but will try to link my records to existing Freebase URIs where they exist.

How to do that well may be a bit of a problem. This example graph for Pusey and Jones from Freebase exhibits many of the problems outlined in Halpin and Hayes (2010), “When owl:sameAs isn’t the Same: An Analysis of Identity Links on the Semantic Web.” While there is continuity from the earlier Pusey and Jones Company to the later Pusey and Jones Corporation, these two entities are separated in time and legal status. Chasing down these differences is one of the fun parts of archival research (says the masochist in me). While you might see the difference only in a preferred form of a name, a change of name may also come with a change of location, business type, etc. I’ve taken the easy route here: minor changes of name and leadership have been left as a single entity (using the previous names property to record changes), while major acquisitions get a new record (with the two entities linked). Later I’d like to come back to this question through the eyes of EAC-CPF, which may be better tuned for these kinds of subtle changes. Also, how complicated you want to make this probably depends on what you’d like to do with the RDF. Freebase/Wikipedia/DBpedia take a pretty high-level approach, which may limit their usefulness for certain kinds of analysis.
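The “easy route” above can be sketched as a small data model. This is only an illustration of the rule, not my actual records: the `Organization` class, its field names and the example company are hypothetical, and the `predecessor`/`successor` fields stand in for whatever linking properties end up being used.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Organization:
    """One record per legal/conceptual entity (field names are illustrative)."""
    uri: str
    name: str
    previous_names: List[str] = field(default_factory=list)
    predecessor: Optional[str] = None  # URI of the earlier entity, if any
    successor: Optional[str] = None    # URI of the later entity, if any


def rename(org: Organization, new_name: str) -> None:
    """Minor change: keep the same entity and remember the old name."""
    org.previous_names.append(org.name)
    org.name = new_name


def succeed(old: Organization, new_uri: str, new_name: str) -> Organization:
    """Major change: create a new record and link the two entities both ways."""
    new = Organization(uri=new_uri, name=new_name, predecessor=old.uri)
    old.successor = new.uri
    return new


# Placeholder company, not a real shipyard.
yard = Organization("http://example.org/companies#example_yard", "Example Yard")
rename(yard, "Example Shipyard Company")
corp = succeed(
    yard,
    "http://example.org/companies#example_shipyard_corporation",
    "Example Shipyard Corporation",
)
```

Note that the link between the two records is deliberately weaker than owl:sameAs, since the whole point is that they are not the same thing.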

Despite these problems, Freebase still seems like a place to start, since it has properties that link legal/conceptual entities together. Many of the available properties are listed in Freebase, but they haven’t been filled in for Pusey & Jones. I played around a little with Freebase editing and even added a few values, but in the end created my own RDF graphs for most of the major Wilmington shipyards. These are pretty simple stubs, with names, start and end dates, and references to individual founders (who still need to be added to the people file).
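For a sense of how little is in one of these stubs, here is a rough sketch that writes one out as N-Triples using only the Python standard library. The `ex:` vocabulary, the property names (`startDate`, `endDate`, `founder`) and the sample values are placeholders I made up for the example; they are not Freebase properties and not real dates. Only `rdfs:label` is a real, standard property.

```python
from typing import List

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/vocab#"  # hypothetical vocabulary namespace


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def stub_to_ntriples(
    uri: str, name: str, start: str, end: str, founder_uris: List[str]
) -> str:
    """Serialize a company stub: name, start/end dates and founder links."""
    triples = [
        (uri, RDFS_LABEL, literal(name)),
        (uri, EX + "startDate", literal(start)),
        (uri, EX + "endDate", literal(end)),
    ]
    # Founders are URI references, so they can point at people records
    # that have not been written yet.
    triples += [(uri, EX + "founder", f"<{f}>") for f in founder_uris]
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)


# Placeholder values only.
print(
    stub_to_ntriples(
        "http://example.org/companies#example_shipyard",
        "Example Shipyard",
        "1850",
        "1900",
        ["http://example.org/people#jane_doe"],
    )
)
```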

Oh, one last thing.  Validate, validate, validate.  I caught a few minor errors by running this through the W3C RDF Validation service.
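If the graphs are serialized as RDF/XML (an assumption here), a quick local pre-check can catch the dumbest mistakes before a trip to the validator. This sketch only tests that the file is well-formed XML with an rdf:RDF root; it is not a substitute for the W3C service, which actually parses the triples. The `precheck_rdfxml` name is made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import List

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def precheck_rdfxml(text: str) -> List[str]:
    """Return a list of problems found; an empty list means the pre-check passed."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        # Unclosed tags, bad entities and similar XML-level errors land here.
        return [f"not well-formed XML: {err}"]
    problems = []
    if root.tag != f"{{{RDF_NS}}}RDF":
        problems.append(f"root element is {root.tag}, expected rdf:RDF")
    return problems


good = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"></rdf:RDF>'
broken = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
print(precheck_rdfxml(good))
print(precheck_rdfxml(broken))
```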