Wednesday, January 12, 2005

On View Definitions in SeRQL

Leigh Dodds sketches a need for view definitions in triplestores and asks which APIs and triplestores might support them:
The scenario I'm imagining is this: I've obtained a reference to a resource in an RDF graph, perhaps via a query. From this node in the graph I want to pull out a subset of the data accessible by navigating from this resource. I can see two possible ways to create that subset, and these may be used in conjunction. The first is to apply a "window" on the graph, and only extract the data that's within a certain distance of my origin.
Using a SeRQL construct query one can specify exactly such a subgraph. For example, assume we have a resource 'my:x' and wish to retrieve all connected nodes and edges (with a distance of 1):

    construct * from {i} p {my:x} q {j}

Fairly straightforward, and quite flexible (in fact, the construct clause allows not just subgraph selection but also graph transformation). Sesame has API methods that facilitate creating an on-the-spot graph object from the result of such a query and manipulating it further.

Leigh also mentions querying on namespace. Currently this is not explicitly supported in SeRQL, but a string comparison operator, 'like', can be used to match specific namespaces in property names (or in any URI, for that matter):

    construct * from {i} p {my:x} q {j} where p like "foo://mynamespace/*"

An extension of SeRQL that has been on my wish list for a while is a specific namespace() function. Not only does that look nicer, it also makes the implementation of namespace-based constraints much more efficient, since it eliminates the need for expensive string comparisons.

So, the building blocks are definitely there, but SeRQL currently offers no specific support for view definitions in the database sense. That would require a naming scheme for queries (so that other queries can refer to particular views) and a dynamic mapping between the intensional view and the actual source. However, it is possible to write a convenience class MyView that, on access, fires a SeRQL construct query and uses the retrieved subgraph to provide a view. This comes quite close to a view definition scenario.
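
The MyView idea can be sketched without committing to any particular Sesame API. The Python below is only an illustration under stated assumptions: triples are plain tuples held in memory, window() approximates the distance-1 selection by keeping every triple that touches the origin, and the namespace filter is a simple prefix test standing in for 'like'. The names window, MyView and source, as well as the sample data, are hypothetical and not part of Sesame or SeRQL.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def window(triples: Iterable[Triple], origin: str) -> Set[Triple]:
    """Keep every triple in which `origin` is the subject or the object."""
    return {t for t in triples if t[0] == origin or t[2] == origin}


class MyView:
    """Hypothetical view: the definition is re-evaluated on every access."""

    def __init__(self, source: Callable[[], Iterable[Triple]], origin: str,
                 namespace: Optional[str] = None) -> None:
        self._source = source        # assumed stand-in for "fire the query"
        self._origin = origin
        self._namespace = namespace  # optional predicate namespace prefix

    def triples(self) -> Set[Triple]:
        # Recompute the subgraph from the current state of the source.
        result = window(self._source(), self._origin)
        if self._namespace is not None:
            # Prefix test on the predicate, standing in for 'like'.
            result = {t for t in result if t[1].startswith(self._namespace)}
        return result


# Made-up sample data.
store = [
    ("my:a", "foo://mynamespace/knows", "my:x"),
    ("my:x", "foo://mynamespace/name", "X"),
    ("my:x", "bar://other/age", "42"),
    ("my:a", "foo://mynamespace/name", "A"),
]

view = MyView(lambda: store, "my:x", namespace="foo://mynamespace/")
```

Because triples() goes back to the source each time, a statement added to the store later shows up in the view on the next access. That gives the dynamic mapping a view needs, although the view still has no name that other queries could refer to.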

Tuesday, January 04, 2005

Automated metadata extraction

Aduna's Metadata Server has just been released. This is a Sesame-based server tool that enables automatic metadata extraction and storage (in RDF) from information sources such as file systems and websites. It supports a host of file formats, including HTML, MS Office, Open Office and PDF. The Metadata Server is the server-side accompaniment to client- and web-based frontends such as Autofocus and Spectacle, but the stored data is exposed through Sesame's access APIs, and thus in principle any tool can talk to the server. At the moment, the metadata schema is fairly simple: it introduces about 9 properties, such as author, title and keyword. It does not reuse Dublin Core directly but rather creates subproperties of dc properties, because the use of these properties in the Metadata Server is more restrictive than in Dublin Core.

Monday, January 03, 2005

T minus two weeks

Happy new year, everyone. It's funny, I've spent just about the best holiday week in years (with my girlfriend), yet I've spent most of it... working. My PhD thesis is due in two weeks - or actually, it was due last week, officially. But it's almost there now... "Storage, Querying and Inferencing for Semantic Web Languages". Nifty title, no? All but chapter 8 and the introduction and conclusion are done, so I'm getting ready for the final final spurt now... After I've gone and gotten me some tea.