Some time ago (within the Organic.Edunet project, which ended in late 2010) I implemented a mapping of IEEE LOM to the Dublin Core Abstract Model (thanks to my colleague Mikael Nilsson, who developed the mapping) and applied it to some larger data sources such as the repository of the ARIADNE foundation. This post is basically a write-up of what I produced and of the ways it can be used and exploited by others. It is about public data (namely metadata describing educational resources) and a contribution to the LOD cloud.
The software used for harvesting, converting, and exposing the data is the SCAM framework together with the Confolio frontend (I’ll post links to these applications as soon as we have some usable documentation online). Organic.Edunet uses the SCAM/Confolio combination natively, so no conversion was necessary for this project, unlike in the ARIADNE case.
Harvested repositories exported to RDF files
One RDF file per harvested repository is exported every Monday morning and can be downloaded from: http://knowone.csc.kth.se/rdf/.
Some quick facts about the exports:
- This is a direct, metadata-only export from SCAM, which means that only educational metadata is included. I assume that the ACL, provenance, and similar information from SCAM is not of general interest, so I do not include it in the export.
- SCAM supports contextualized annotation of resources, which is why all metadata is stored in separate named graphs, each identified by a URI. This means that the same resource (identified by a URI which you can use to download, refer to, or view the resource) can be described by different graphs, e.g. in different educational contexts, or simply with some generic metadata and educational metadata on top of it.
- The format is TriG, mainly because it has the needed support for including information about Named Graphs (and because it is a nice format).
- Of course you can import the RDF data into your own triple store and provide/use your own SPARQL endpoint (or use it in a completely different way). If you want to play around first, you can also access the public SPARQL endpoint described below.
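To illustrate the named-graph structure, here is a tiny hand-written TriG fragment (all URIs and values are hypothetical, not taken from the actual exports): two graphs describing the same resource, one with generic metadata and one with educational metadata on top of it.

```trig
@prefix dcterms: <http://purl.org/dc/terms/> .

# Generic metadata about the resource, in its own named graph
<http://example.org/graphs/generic> {
  <http://example.org/resource/1> dcterms:title "Organic farming basics"@en .
}

# Educational metadata about the same resource, in a second graph
<http://example.org/graphs/educational> {
  <http://example.org/resource/1>
      dcterms:audience <http://ltsc.ieee.org/rdf/lomv1p0/vocabulary#IntendedEndUserRole-learner> .
}
```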
SPARQL endpoint to query educational metadata
Using Sesame and Snorql I created a public SPARQL endpoint with a simple interface which you can use to play around. If you intend to use it more heavily in your own applications, you are kindly asked to set up your own endpoint to avoid hitting the server too hard. The endpoint currently supports SPARQL 1.0.
- SPARQL endpoint (for machines): http://knowone.csc.kth.se/sparql/ariadne-big
- A simple interface to it (for humans): http://knowone.csc.kth.se/snorql/
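For use from code rather than the web interface, the endpoint can be queried over plain HTTP following the standard SPARQL protocol (a GET request with a URL-encoded query parameter). A minimal sketch using only the Python standard library; the Accept header value assumes the endpoint can return SPARQL JSON results, which Sesame normally supports:

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://knowone.csc.kth.se/sparql/ariadne-big"

def build_query_url(endpoint, sparql):
    """URL-encode the query as a GET parameter, per the SPARQL protocol."""
    return endpoint + "?" + urllib.parse.urlencode({"query": sparql})

def run_query(sparql):
    """Send the query and return the parsed SPARQL JSON results."""
    req = urllib.request.Request(
        build_query_url(ENDPOINT, sparql),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires network access to the endpoint):
# res = run_query("SELECT DISTINCT ?g WHERE { GRAPH ?g { ?s ?p ?o } } LIMIT 5")
# for row in res["results"]["bindings"]:
#     print(row["g"]["value"])
```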
The endpoint currently contains only quadruples generated from the ARIADNE repository. I am thinking about providing separate endpoints for other repositories as well; I just have to find a way of maintaining this without too much effort. The graph URIs you can request in SPARQL (using the GRAPH keyword, see some examples below) can also be queried using plain HTTP. With content negotiation (either through the HTTP Accept header or through a URL parameter) you can request different RDF serializations, JSON, and LOM/XML.
Examples for such URIs for a GET request look like this:
- http://knowone.csc.kth.se/scam/9/cached-external-metadata/41081 (the default returns RDF/XML and is the same as requesting it with “?format=application/rdf+xml”)
- http://knowone.csc.kth.se/scam/9/cached-external-metadata/41081?format=application/json
- http://knowone.csc.kth.se/scam/9/cached-external-metadata/41081?format=application/lom+xml
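The same requests can be issued from code. A small sketch with the Python standard library; it supports both negotiation variants mentioned above (the Accept header and the format URL parameter), and the default with no MIME type corresponds to RDF/XML:

```python
import urllib.request

RECORD = "http://knowone.csc.kth.se/scam/9/cached-external-metadata/41081"

def make_request(url, mime=None, use_header=True):
    """Build a GET request, selecting the serialization either via the
    HTTP Accept header or via the format query parameter."""
    if mime is None:
        return urllib.request.Request(url)  # default: RDF/XML
    if use_header:
        return urllib.request.Request(url, headers={"Accept": mime})
    return urllib.request.Request(url + "?format=" + mime)

def fetch(url, mime=None, use_header=True):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(make_request(url, mime, use_header)) as resp:
        return resp.read().decode("utf-8")

# fetch(RECORD)                                # RDF/XML (the default)
# fetch(RECORD, "application/json")            # JSON via the Accept header
# fetch(RECORD, "application/lom+xml", False)  # LOM/XML via ?format=...
```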
The response to these requests is a full metadata instance describing a resource; the root of the graph is the resource URI. You can get some administrative information by replacing “cached-external-metadata” with “entry”. If you had a username/password and access rights to the SCAM installation, you could also perform a PUT in order to update the metadata.
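Such an update could look roughly like the following sketch. Note the assumptions: the authentication scheme (HTTP Basic is shown here) and the expected payload type (RDF/XML) are guesses, so check them against your SCAM installation before relying on this:

```python
import base64
import urllib.request

def build_put(url, rdfxml, username, password):
    """Build an authenticated PUT request carrying an RDF/XML payload.
    HTTP Basic auth is an assumption, not confirmed for SCAM."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=rdfxml.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/rdf+xml",
            "Authorization": "Basic " + token,
        },
    )

# Sending it (requires valid credentials and write access):
# with urllib.request.urlopen(build_put(url, payload, "user", "pass")) as resp:
#     print(resp.status)
```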
Enough about this, let’s get the hands dirty with some…
SPARQL query examples
Because it is always easier to build upon and modify existing things, have a look at the following queries, which have been shown to work. Each query is preceded by a short description of what it does. Depending on the query you will also need some of the following prefixes. They are predefined in the web interface, but you have to provide them yourself if you send the SPARQL query using a different client:
Prefixes
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX lom: <http://ltsc.ieee.org/rdf/lomv1p0/lom#>
PREFIX lomvoc: <http://ltsc.ieee.org/rdf/lomv1p0/vocabulary#>
PREFIX lomterms: <http://ltsc.ieee.org/rdf/lomv1p0/terms#>
PREFIX lre: <http://organic-edunet.eu/LOM/rdf/voc#>
PREFIX oe: <http://organic-edunet.eu/LOM/rdf/>
Find all contexts describing a resource
SELECT DISTINCT ?g
WHERE {
  GRAPH ?g { <http://www.orgfoodfed.com/> ?p ?o }
}
Find all resources from FAO in English
SELECT DISTINCT ?uri ?title ?g
WHERE {
  GRAPH ?g {
    ?uri dcterms:title ?title ;
         dcterms:language ?b .
    ?b rdf:value "en"^^dcterms:RFC4646 .
    FILTER(regex(str(?uri), "^http://www.fao.org/", "i"))
  }
}
LIMIT 10
Get all available types from the repository
SELECT DISTINCT ?types
WHERE { ?s rdf:type ?types }
ORDER BY ?types
Search including various LOM elements, language, and regexp
SELECT ?context ?uri ?title ?desc
WHERE {
  GRAPH ?context {
    ?uri dcterms:title ?title ;
         dcterms:audience lomvoc:IntendedEndUserRole-learner ;
         lom:copyrightAndOtherRestrictions "false"^^xsd:boolean ;
         lom:status lomvoc:Status-final ;
         dcterms:description ?b .
    ?b rdf:value ?desc .
    FILTER( langMatches(lang(?title), "en")
         && langMatches(lang(?desc), "en")
         && regex(str(?title), "^organic", "i") )
  }
}
I’ll probably post some more examples soon; the examples above are only meant to get interested readers started. I also hope to be able to upgrade to SPARQL 1.1 soon (the Sesame developers are working on it; looking forward to the next release!), which provides a richer query syntax.
