Sponger ontology mappers perform the task of generating RDF instance data from extracted (non-RDF) metadata, using ontologies associated with a given data source type. They are typically XSLT based (using GRDDL or an in-built Virtuoso mapping scheme) or Virtuoso PL based. Virtuoso comes preconfigured with a large range of ontology mappers contained in one or more Sponger cartridges. Nevertheless, you are free to create and add your own cartridges, ontology mappers, or metadata extractors.
Figure 9: Sponger architecture
Below is an extract from the stylesheet /DAV/VAD/rdf_cartridges/xslt/flickr2rdf.xsl, used for extracting metadata from Flickr images. Here, the template combines RDF metadata extraction and ontology mapping based on the FOAF and Dublin Core ontologies.
<xsl:template match="owner">
<rdf:Description rdf:nodeID="person">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person" />
<xsl:if test="@realname != ''">
<foaf:name><xsl:value-of select="@realname"/></foaf:name>
</xsl:if>
<foaf:nick><xsl:value-of select="@username"/></foaf:nick>
</rdf:Description>
</xsl:template>
<xsl:template match="photo">
<rdf:Description rdf:about="{$baseUri}">
<rdf:type rdf:resource="http://www.w3.org/2003/12/exif/ns/IFD"/>
<xsl:variable name="lic" select="@license"/>
<dc:creator rdf:nodeID="person" />
...
Once a Sponger cartridge has been developed, it must be plugged into the SPARQL engine by registering it in the Cartridge Registry, i.e. by adding a record to the table DB.DBA.SYS_RDF_MAPPERS. This can be done manually via DML or, more easily, through Conductor (Virtuoso's browser-based administration console), which provides a UI for adding your own cartridges. Sponger configuration using Conductor is described in detail later. For the moment, we'll focus on outlining the broad architecture of the Sponger.
The SYS_RDF_MAPPERS table definition is as follows:
create table DB.DBA.SYS_RDF_MAPPERS (
RM_ID integer identity, -- cartridge ID, designates order of execution
RM_PATTERN varchar, -- a REGEX pattern to match URL or MIME type
RM_TYPE varchar default 'MIME', -- which property of the current resource to match: MIME or URL
RM_HOOK varchar, -- fully qualified PL function name e.g. DB.DBA.MY_CARTRIDGE_FUNCTION
RM_KEY long varchar, -- API specific key to use
RM_DESCRIPTION long varchar, -- Cartridge description (free text)
RM_ENABLED integer default 1, -- 0 or 1 integer flag to include or exclude the given cartridge from Sponger processing chain
RM_OPTIONS any, -- cartridge specific options
RM_PID integer identity, -- for internal use only
primary key (RM_HOOK)
);
The Virtuoso SPARQL processor supports IRI dereferencing via the Sponger. Thus, if the SPARQL query contains references to non-default graph URIs, the Sponger goes out (via HTTP) to fetch the RDF data exposed at those URIs and places it into local storage (as Default or Named Graphs, depending on the SPARQL query). Since SPARQL is RDF based, it can only process RDF-based structured data, serialized using the RDF/XML, Turtle or N3 formats. Consequently, when the SPARQL processor encounters a non-RDF data source, a call to the Sponger is triggered. The Sponger then locates the appropriate cartridge for the data source type in question, which produces SPARQL-palatable RDF instance data. If none of the registered cartridges is capable of handling the received content type, the Sponger will attempt to obtain RDF instance data via the in-built WebDAV metadata extractor.
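The cartridge lookup described above can be modelled in a few lines of Python. This is only an illustrative sketch, not Virtuoso's implementation: it assumes that registry rows are tried in ascending RM_ID order, that disabled rows are skipped, and that the first row whose RM_PATTERN matches is chosen. The class, the function, the hook names and the patterns are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Cartridge:
    """Simplified stand-in for one row of DB.DBA.SYS_RDF_MAPPERS."""
    rm_id: int            # order of execution
    rm_pattern: str       # regex matched against the URL or the MIME type
    rm_type: str          # "MIME" or "URL"
    rm_hook: str          # name of the PL function to call
    rm_enabled: int = 1   # 0 excludes the cartridge from processing


def select_cartridge(registry: Iterable[Cartridge], url: str,
                     mime_type: str) -> Optional[Cartridge]:
    """Return the first enabled cartridge whose pattern matches, else None.

    Assumption: first match wins. The real Sponger may behave differently.
    """
    for cartridge in sorted(registry, key=lambda c: c.rm_id):
        if not cartridge.rm_enabled:
            continue
        # RM_TYPE decides which property of the resource is matched.
        subject = mime_type if cartridge.rm_type == "MIME" else url
        if re.search(cartridge.rm_pattern, subject):
            return cartridge
    return None


# Hypothetical registry contents, for illustration only.
registry = [
    Cartridge(2, r"text/html", "MIME", "DB.DBA.HYPOTHETICAL_HTML_HOOK"),
    Cartridge(1, r"^https?://(www\.)?flickr\.com/photos/", "URL",
              "DB.DBA.HYPOTHETICAL_FLICKR_HOOK"),
    Cartridge(3, r"image/.*", "MIME", "DB.DBA.HYPOTHETICAL_IMAGE_HOOK",
              rm_enabled=0),
]
```

In this model, a `None` result corresponds to the case where no registered cartridge can handle the content, at which point the Sponger falls back to the in-built WebDAV metadata extractor.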
Sponger cartridges are invoked during the aforementioned pipeline as follows:
When the SPARQL processor dereferences a URI, it plays the role of an HTTP user agent (client) that makes a content type specific request to an HTTP server via the HTTP request's Accept headers. The following then occurs:
Figure 10: Sponger cartridge invocation flowchart
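The request side of this flow, in which the SPARQL processor acts as an HTTP client that states its preferred content types, might look roughly like the sketch below. It builds the request without sending it, and classifies a returned Content-Type as RDF or non-RDF. The list of media types, the q-value and the function names are assumptions made for illustration; the headers Virtuoso actually sends are not specified here.

```python
from urllib.request import Request

# Assumed media types for the RDF/XML, Turtle and N3 serializations.
RDF_MEDIA_TYPES = ("application/rdf+xml", "text/turtle", "text/n3")


def build_request(url: str) -> Request:
    """Build (but do not send) a request that prefers RDF serializations."""
    # Anything else is accepted with low priority so non-RDF sources still respond.
    accept = ", ".join(RDF_MEDIA_TYPES) + ", */*;q=0.1"
    return Request(url, headers={"Accept": accept})


def is_rdf(content_type: str) -> bool:
    """True if the Content-Type header value names one of the RDF media types."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in RDF_MEDIA_TYPES
```

Under these assumptions, a response for which `is_rdf` returns `False` is the situation in which cartridges would be invoked to produce RDF from the non-RDF content.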