This is a two-part blog post from Joy Nelson and Nick Clemens, who recently attended the Koha Hackfest in Marseille, France.
I gave a brief presentation on linked data and the benefits and potential challenges inherent in moving towards a linked data system. It was based loosely on the presentation given at KohaCon 2016 which can be viewed here: https://www.slideshare.net/talljoy/kohacon2016
We discussed some of the challenges in linked data. One of those is the choice of linked data vocabulary. In the U.S., the Library of Congress is developing BIBFRAME as a MARC replacement. Ideally, the system created would be ‘vocabulary agnostic’: the Koha linked data system should allow users to work with any of several vocabularies, or *any* vocabulary they wish (BIBFRAME, BF Lite, schema.org, Dublin Core, etc.).
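As a rough illustration of what ‘vocabulary agnostic’ means in practice, the same title statement can be expressed with predicates from several vocabularies (the `example.org` URI below is hypothetical; the namespace URIs are the published ones):

```turtle
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix schema: <http://schema.org/> .
@prefix bf:     <http://id.loc.gov/ontologies/bibframe/> .

# The same fact in three vocabularies -- a vocabulary-agnostic store
# should be able to hold and serve any of them.
<http://example.org/bib/1> dct:title   "Moby Dick" .
<http://example.org/bib/1> schema:name "Moby Dick" .
<http://example.org/bib/1> bf:title    [ a bf:Title ; bf:mainTitle "Moby Dick" ] .
```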
Identifying the purpose of Linked Data is another area that needs discussion and some consensus. How do libraries envision themselves using this kind of data/system?
– Do we create triplestores converted from MARC and allow those stores to be crawled, thereby making a collection visible in web search results?
– Do we concentrate on creating a system that allows libraries to connect their data to other libraries and to data stores, external or internal? For example, is the goal to link our authors and subject tags to URIs on the web that we can pull into Koha to display more information to the user?
– Do we create linked data URIs for special libraries (museums, academic institutions) to provide to other institutions and allow others to link into their system to gather data to display to their users?
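The second scenario above might look roughly like this in the data (all URIs here are illustrative, including the authority identifiers):

```turtle
@prefix schema: <http://schema.org/> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .

# A hypothetical local record whose author is linked out to published
# authority URIs; a front end could dereference these to pull in
# extra information for display.
<http://example.org/bib/1> schema:author <http://example.org/author/123> .
<http://example.org/author/123>
    owl:sameAs <http://id.loc.gov/authorities/names/n00000000> ;
    owl:sameAs <http://viaf.org/viaf/00000000> .
```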
Linked Data is a fundamental shift in how we think about our data and how we will use that data. It is a much more cooperative venture in cataloging that will provide benefits to both staff and patrons.
I presented the basic work I have done on implementing a triplestore backend for Koha, which can be found on GitHub here:
The code is very basic so far and does a few things:
1 – Implements Koha::RDF::Store as a base on which all individual backend implementations will be built
2 – Adds an initial RDF::Trine-based implementation to allow storage and retrieval of RDF triples
3 – Adds a very basic form on the intranet details page to enter a predicate and an object, with the URL of the details page assumed as the subject
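The actual code is Perl built on RDF::Trine, but the core idea of point 2 can be sketched in a language-neutral way. The class and method names below are illustrative only, not Koha's actual API:

```python
# Minimal in-memory triple store sketch (illustrative, not Koha's API).

class TripleStore:
    def __init__(self):
        # Each triple is a (subject, predicate, object) tuple.
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def remove(self, s, p, o):
        self.triples.discard((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

store = TripleStore()
store.add("http://example.org/bib/1", "http://purl.org/dc/terms/title", "Moby Dick")
store.add("http://example.org/bib/1", "http://purl.org/dc/terms/creator", "Melville, Herman")

# Retrieve every statement about one subject:
for s, p, o in store.match(s="http://example.org/bib/1"):
    print(p, o)
```

A backend-agnostic design would keep this small pattern-matching interface constant while swapping the storage underneath (in-memory, RDF::Trine, an external triplestore, and so on).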
The code is a start and begins to raise a few concerns, which I outline below along with what we need to do to continue. I am very interested in developing Koha/RDF in a way that is backend agnostic and allows for differing front end implementations.
The next steps I see are:
1 – We must agree on a common set of subroutines that any backend will support. These will need to cover basic CRUD operations plus options for retrieving triples in various formats
2 – We will need to develop endpoints: a SPARQL endpoint, REST endpoints for CRUD, and OAI endpoints
3 – We need to decide how we will mint and serve URIs. We could continue to use the current details URLs and add a format parameter to supply triples or XML; however, for things like works and authors we will need some method of linking. URLs with parameters are not as ‘cool’ as named or numbered URIs, and proper URIs give us the ability to link our data out to other resources, make it crawlable, and link internally
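To make the trade-off in point 3 concrete, here is a small sketch of the two URI styles. The paths and the `format` parameter name are illustrative assumptions, not Koha's actual routes:

```python
# Sketch of two URI styles for serving triples (hostname, paths, and
# parameter names are illustrative, not Koha's actual configuration).

BASE = "https://koha.example.org"

def details_uri(biblionumber, fmt=None):
    """Current-style details URL, with an optional format parameter
    (e.g. 'ttl' or 'rdfxml') bolted on to request triples."""
    uri = f"{BASE}/cgi-bin/koha/opac-detail.pl?biblionumber={biblionumber}"
    if fmt:
        uri += f"&format={fmt}"
    return uri

def cool_uri(entity_type, identifier):
    """'Cool', parameter-free URI for works, authors, etc. --
    easier to crawl and to link to from outside."""
    return f"{BASE}/{entity_type}/{identifier}"

print(details_uri(42, fmt="ttl"))
# vs.
print(cool_uri("work", 42))
```

The parameterized form works today with no new routing, while the second form is what external systems would more naturally link to and crawl.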
Once we can agree on building a base together, then we can all proceed to develop in different directions as we see fit and provide various options that can all be pursued without conflict or duplicated effort.