RDF in Mozilla FAQ

General

Where do I start?

RDF serves two primary purposes in Mozilla. First, it is a simple, cross-platform database for small data stores. Second, and more importantly, the RDF model is used along with XUL templates as an abstract "API" for displaying information. RDF in Fifty Words or Less is a quick, high-level description of what RDF does in Mozilla. The RDF Back-end Architecture document describes in more detail how Mozilla's RDF implementation works, and gives a quick overview of the interfaces that are involved.

Where do I find information on Open Directory ("dmoz")?

Unfortunately, not here! Well, here's a little... Start at http://www.dmoz.org/ for more information on the Open Directory. The Open Directory dataset is available as a (huge) RDF/XML dump. It describes thousands of Web sites using a mix of the Dublin Core metadata vocabulary and the DMoz taxonomy. See their RDF pages for more information.

If you have problems with the DMoz and ChefMoz data, it's best to contact those projects directly. But if you do something interesting with this content (for example, having Mozilla load chunks of it into a XUL UI from a remote site), don't forget to let the mozilla-rdf and RDF Interest Group lists know. Those lists would probably also be interested in tools for cleaning, re-processing, or storing the DMoz data. See the sites using ODP data pages for some directories based on the ODP RDF dumps.

What is a datasource?

RDF can generally be viewed in two ways: either as a graph with nodes and arcs, or as a "soup" of logical statements. A datasource is a subgraph (or collection of statements, depending on your viewpoint) that is for some reason collected together. Some examples of datasources that exist today are "browser bookmarks", "browser global history", "IMAP mail accounts", "NNTP news servers", and "RDF/XML files".

In Mozilla, datasources can be composed together using the composite data source. This is like overlaying graphs, or merging collections of statements ("microtheories") together. Statements about the same RDF resource can then be intermingled: for example, the "last visit date" of a particular website comes from the "browser global history" datasource, and the "shortcut keyword" that you can type to reach that website comes from the "browser bookmarks" datasource. Both datasources refer to "website" by URL: this is the "key" that allows the datasources to be "merged" effectively.
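
The merging idea can be illustrated with a toy model in plain JavaScript. This is only a sketch: the arrays below stand in for datasources, each statement is a [subject, property, value] triple, and the names describe, lastVisitDate, and shortcutKeyword are made up for illustration. None of it is Mozilla API.

```javascript
// Toy stand-ins for two datasources. Both refer to the website by URL,
// which is the "key" that lets their statements be merged.
var historyStatements = [
  ["http://www.mozilla.org/", "lastVisitDate", "2002-07-02"]
];
var bookmarkStatements = [
  ["http://www.mozilla.org/", "shortcutKeyword", "moz"]
];

// Overlay several statement collections and return everything that is
// known about one subject, as a property -> value map.
function describe(subject, collections) {
  var result = {};
  collections.forEach(function (statements) {
    statements.forEach(function (statement) {
      if (statement[0] === subject) {
        result[statement[1]] = statement[2];
      }
    });
  });
  return result;
}

var merged = describe("http://www.mozilla.org/",
                      [historyStatements, bookmarkStatements]);
// merged now holds both the lastVisitDate and the shortcutKeyword
```

A real composite data source does more than this (for example, it handles negative assertions and notifies observers), but the overlay principle is the same.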

For a more detailed account of how to write a datasource, refer to the RDF Datasource How-To.

How does Mozilla manage datasources?

The RDF service manages a table of all loaded datasources. The table is keyed by the datasource's "URI", which is either the URL of an RDF/XML file, or a "special" URI starting with rdf: that refers to a built-in datasource.

Datasources may be loaded via the RDF service using the GetDataSource() method. If the URI argument refers to an RDF/XML file's URL, then the RDF service will create an RDF/XML datasource and asynchronously parse it. The datasource will remain "cached" until the last reference to the datasource is released.

If the URI argument refers to a built-in datasource, the RDF service will use the XPCOM Component Manager to load a component whose ContractID is constructed by appending the name from the "special" URI to the well-known prefix @mozilla.org/rdf/datasource;1?name=

For example,

rdf:foo

would load the component with the ContractID:

@mozilla.org/rdf/datasource;1?name=foo

As with RDF/XML datasources, a datasource that is retrieved this way is "cached" by the RDF service until the last reference is dropped.
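
The mapping from a "special" URI to a ContractID described above is simple string manipulation. The following sketch mirrors it in plain JavaScript; the function name contractIdForSpecialUri is hypothetical and is not part of the RDF service, which performs this step internally.

```javascript
// The well-known ContractID prefix for built-in datasources.
var DATASOURCE_CONTRACTID_PREFIX = "@mozilla.org/rdf/datasource;1?name=";

// Hypothetical helper: returns the ContractID that an rdf: URI maps to,
// or null if the URI is not a "special" URI (e.g., an RDF/XML file's URL).
function contractIdForSpecialUri(uri) {
  var scheme = "rdf:";
  if (uri.indexOf(scheme) !== 0) {
    return null;
  }
  return DATASOURCE_CONTRACTID_PREFIX + uri.substring(scheme.length);
}

var contractId = contractIdForSpecialUri("rdf:foo");
// contractId is "@mozilla.org/rdf/datasource;1?name=foo"
```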

How do I create a datasource from an RDF/XML file?

You can either create an RDF/XML datasource using the RDF service's GetDataSource() method:

// Get the RDF service
var RDF =
  Components
  .classes["@mozilla.org/rdf/rdf-service;1"]
  .getService(Components.interfaces.nsIRDFService);
// ...and from it, get the datasource. Make sure that your web server
// dishes it up as text/xml (recommended) or text/rdf!
var ds = RDF.GetDataSource("http://www.mozilla.org/some-rdf-file.rdf");
// Note that ds will load asynchronously, so assertions will not
// be immediately available

Alternatively, you can create one directly using the XPCOM Component Manager, as the following code fragment illustrates:

// Create an RDF/XML datasource using the XPCOM Component Manager
var ds =
  Components
  .classes["@mozilla.org/rdf/datasource;1?name=xml-datasource"]
  .createInstance(Components.interfaces.nsIRDFDataSource);
// The nsIRDFRemoteDataSource interface has the methods
// that we need to set up the datasource.
var remote =
   ds.QueryInterface(Components.interfaces.nsIRDFRemoteDataSource);
// Be sure that your web server will deliver this as text/xml (recommended) or text/rdf!
remote.Init("http://www.mozilla.org/some-rdf-file.rdf");
// Make it load! Note that this will happen asynchronously. By setting
// aBlocking to true, we could force it to be synchronous, but this
// is generally a bad idea, because your UI will completely lock up!
remote.Refresh(false);
// Note that ds will load asynchronously, so assertions will not
// be immediately available

You may decide that you need to "manually" create an RDF/XML datasource if you want to force it to load synchronously.

How do I reload an RDF/XML datasource?

You can force an RDF/XML datasource (or any datasource that supports nsIRDFRemoteDataSource) to reload using the Refresh() method of nsIRDFRemoteDataSource. Refresh() takes a single parameter that indicates whether you'd like it to perform its operation synchronously ("blocking") or asynchronously ("non-blocking"). You should never do a synchronous load unless you really know what you're doing: this will freeze the UI until the load completes!

How can I tell if an RDF/XML datasource has loaded?

Using the nsIRDFRemoteDataSource interface, it is possible to immediately query the loaded property to determine if the datasource has loaded or not:

// Get the RDF service
var RDF =
  Components
  .classes["@mozilla.org/rdf/rdf-service;1"]
  .getService(Components.interfaces.nsIRDFService);
// Get the datasource.
var ds = RDF.GetDataSource("http://www.mozilla.org/some-rdf-file.rdf");
// Now see if it's loaded or not...
var remote =
  ds.QueryInterface(Components.interfaces.nsIRDFRemoteDataSource);

if (remote.loaded) {
  alert("the datasource was already loaded!");
}
else {
  alert("the datasource wasn't loaded, but it's loading now!");
}

Say the datasource wasn't loaded, and is loading asynchronously. Using this API and JavaScript's setTimeout(), one could set up a polling loop that would continually check the loaded property. This is kludgy, and even worse, won't detect a failed load, for example, if there wasn't any data at the URL!

For this reason, there is an observer interface that allows you to spy on a datasource's progress. The following code illustrates its usage:

// This is the object that will observe the RDF/XML load's progress
var Observer = {
  onBeginLoad: function(aSink)
    {},

  onInterrupt: function(aSink)
    {},

  onResume: function(aSink)
    {},

  onEndLoad: function(aSink)
    {
      aSink.removeXMLSinkObserver(this);
      alert("done!");
    },

  onError: function(aSink, aStatus, aErrorMsg)
    { alert("error! " + aErrorMsg); }
};
// Get the RDF service
var RDF =
  Components
  .classes["@mozilla.org/rdf/rdf-service;1"]
  .getService(Components.interfaces.nsIRDFService);
// Get the datasource.
var ds = RDF.GetDataSource("http://www.mozilla.org/some-rdf-file.rdf");
// Now see if it's loaded or not...
var remote =
  ds.QueryInterface(Components.interfaces.nsIRDFRemoteDataSource);

if (remote.loaded) {
  alert("the datasource was already loaded!");
}
else {
  alert("the datasource wasn't loaded, but it's loading now!");
  // RDF/XML Datasources are all nsIRDFXMLSinks
  var sink =
    ds.QueryInterface(Components.interfaces.nsIRDFXMLSink);
  // Attach the observer to the datasource-as-sink
  sink.addXMLSinkObserver(Observer);
  // Now Observer's methods will be called-back as
  // the load progresses.
}

Note that the observer will remain attached to the RDF/XML datasource unless removeXMLSinkObserver is called.

How do I access the information in a datasource?

The nsIRDFDataSource interface is the means by which you access and manipulate the assertions in a datasource.

  • boolean HasAssertion(aSource, aProperty, aTarget, aTruthValue).
    This tests the datasource to see if it has the specified tuple.
  • nsIRDFNode GetTarget(aSource, aProperty, aTruthValue).
  • nsISimpleEnumerator GetTargets(aSource, aProperty, aTruthValue).
  • nsIRDFResource GetSource(aProperty, aTarget, aTruthValue).
  • nsISimpleEnumerator GetSources(aProperty, aTarget, aTruthValue).
  • nsISimpleEnumerator ArcLabelsIn(aTarget).
  • nsISimpleEnumerator ArcLabelsOut(aSource).

You can also use the RDF container interfaces to access information contained in RDF containers.

How do I change information in the datasource?

Use Assert() to add an assertion and Unassert() to remove one. Refer to the Mozilla RDF Back-end Architecture document for details.

ds.Assert(homepage, FV_quality, value, true);
// Unlike Assert(), Unassert() takes no truth value
ds.Unassert(homepage, FV_quality, value);
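
A common task is to make a property hold exactly one value. Here is a minimal sketch, assuming that GetTarget() returns null when no matching assertion exists and using the Change() method of nsIRDFDataSource to swap an old target for a new one. The helper name setSingleValue is hypothetical.

```javascript
// Hypothetical helper: make (source, property) point at newTarget,
// replacing the existing target if there is one.
function setSingleValue(ds, source, property, newTarget) {
  var oldTarget = ds.GetTarget(source, property, true);
  if (oldTarget) {
    // Replace the old assertion with the new one in a single step
    ds.Change(source, property, oldTarget, newTarget);
  } else {
    // Nothing asserted yet, so just add the new assertion
    ds.Assert(source, property, newTarget, true);
  }
}
```

Using Change() rather than an Unassert() followed by an Assert() lets observers of the datasource see one modification instead of a removal and an addition.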

How do I save back changes to an RDF/XML datasource?

An RDF/XML datasource can be QueryInterface()'d to nsIRDFRemoteDataSource. This interface has a Flush() method, which will attempt to write the contents of the datasource back to the URL from which they were loaded, using a protocol specific mechanism (e.g., a file: URL just writes the file; an http: URL might do an HTTP-POST). Flush() only writes the datasource if its contents have changed.
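
Because Flush() only attempts to write the data back, it is worth guarding the call. The following sketch wraps it in a try/catch; the helper name saveDatasource is hypothetical, and the interface object is passed in as a parameter (in privileged Mozilla code it would be Components.interfaces.nsIRDFRemoteDataSource).

```javascript
// Hypothetical helper: try to write ds back to where it was loaded from.
// Returns true if Flush() completed, false if anything threw.
function saveDatasource(ds, remoteInterface) {
  try {
    var remote = ds.QueryInterface(remoteInterface);
    remote.Flush();
    return true;
  } catch (ex) {
    // Either ds is not a remote datasource, or the write failed
    return false;
  }
}
```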

How do I merge two or more datasources to view them as one?

Use nsIRDFCompositeDataSource. This interface is derived from nsIRDFDataSource. An implementation of this interface will typically combine the statements from several datasources together as a collective. Because the nsIRDFCompositeDataSource interface is derived from nsIRDFDataSource, it can be queried and modified just like an individual data source.
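
Here is a minimal sketch of building a composite from datasources you already hold. It assumes the composite implementation is registered under the ContractID @mozilla.org/rdf/datasource;1?name=composite-datasource, which is not stated in this FAQ. The helper name makeCompositeDataSource is hypothetical, and the Components object is passed in as a parameter.

```javascript
// Hypothetical helper: create a composite datasource and add each of
// the given datasources to it. "components" is the global Components
// object in privileged Mozilla code.
function makeCompositeDataSource(components, datasources) {
  var composite = components
    // Assumed ContractID for the composite datasource implementation
    .classes["@mozilla.org/rdf/datasource;1?name=composite-datasource"]
    .createInstance(components.interfaces.nsIRDFCompositeDataSource);
  for (var i = 0; i < datasources.length; i++) {
    composite.AddDataSource(datasources[i]);
  }
  return composite;
}
```

The result can then be queried with HasAssertion(), GetTarget(), and the other nsIRDFDataSource methods, and the answers reflect the statements of all the member datasources.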

How do I access "built-in" datasources?

A built-in datasource is a locally-installed component that implements nsIRDFDataSource; the bookmarks service is one example. First, check here to make sure you will be allowed to access a built-in datasource. There are several security restrictions on accessing built-in datasources from "untrusted" XUL and JS.

Since a built-in datasource is just an XPCOM component, you can instantiate it directly using the XPCOM Component Manager.

// Use the component manager to get the bookmarks service
var bookmarks =
  Components.
  classes["@mozilla.org/rdf/datasource;1?name=bookmarks"].
  getService(Components.interfaces.nsIRDFDataSource);

// Now do something interesting with it. RDF is the RDF service,
// obtained via getService() as in the earlier examples.
if (bookmarks.HasAssertion(
     RDF.GetResource("http://home.netscape.com/NC-rdf#BookmarksRoot"),
     RDF.GetResource("http://home.netscape.com/NC-rdf#child"),
     RDF.GetResource("http://home.netscape.com/NC-rdf#PersonalToolbarFolder"),
     true)) {
  // ...
}

Alternatively, some datasources have "special" RDF-friendly ContractIDs that make it easy to instantiate the datasource using either the nsIRDFService's GetDataSource() method or the datasources attribute on a XUL template. These ContractIDs are of the form

@mozilla.org/rdf/datasource;1?name=name

These are accessible via GetDataSource() and the datasources attribute using the shorthand rdf:name. For example, the following XUL fragment illustrates how to add the bookmarks service as a datasource into a XUL template.

<tree datasources="rdf:bookmarks">
  ...
</tree>

How do I manipulate RDF "containers"?

To manipulate an RDF "container" (an <rdf:Seq>, for example), you can use nsIRDFContainerUtils which can be instantiated as a service with the following ContractID:

@mozilla.org/rdf/container-utils;1

You can use this service to detect if something is an RDF container using IsSeq(), IsBag(), and IsAlt(). You can "make a resource into a container" if it isn't one already using MakeSeq(), MakeBag(), or MakeAlt(). These methods return an nsIRDFContainer which allows you to do container-like operations without getting your hands too dirty.

Alternatively, if your datasource already has an object that is an RDF container, you can instantiate an nsIRDFContainer object with the

@mozilla.org/rdf/container;1

ContractID and Init() it with the datasource and resource as parameters. Note that this will fail if the resource isn't already a container.
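
The two approaches can be combined into one helper that appends items to a sequence whether or not the resource is already a container. This is a sketch only: the name appendToSeq is hypothetical, the Components object is passed in as a parameter, and AppendElement() is assumed to be the nsIRDFContainer method for adding to the end of a container.

```javascript
// Hypothetical helper: append each item to the <rdf:Seq> rooted at
// seqResource in ds, turning the resource into a Seq first if needed.
function appendToSeq(components, ds, seqResource, items) {
  var utils = components
    .classes["@mozilla.org/rdf/container-utils;1"]
    .getService(components.interfaces.nsIRDFContainerUtils);

  var container;
  if (utils.IsSeq(ds, seqResource)) {
    // Already a container: wrap it with an nsIRDFContainer
    container = components
      .classes["@mozilla.org/rdf/container;1"]
      .createInstance(components.interfaces.nsIRDFContainer);
    container.Init(ds, seqResource);
  } else {
    // Not a container yet: MakeSeq() returns an nsIRDFContainer
    container = utils.MakeSeq(ds, seqResource);
  }

  items.forEach(function (item) {
    container.AppendElement(item);
  });
  return container;
}
```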

XUL Templates

XUL templates are created by specifying a datasources attribute on an element in a XUL document.

XUL templates may be written in two "forms": the "simple" form, which is currently the most common form in the Mozilla codebase, and the "extended" form, which allows for sophisticated pattern matching against the RDF graph. See the XUL:Template Guide. (These are bizarrely arranged because the eventual intent is to introduce templates using the extended form, which in many ways is conceptually simpler even if more verbose, and then treat the "simple" form as a shorthand for the extended form.)

What can I build with a XUL template?

You can build any kind of content using a XUL template. You may use any kind of tag (including HTML, or arbitrary XML) in the <action> section of a <rule>.

When should I use a XUL template?

One alternative to using RDF and XUL templates is to use the W3C DOM APIs to build up and manipulate a XUL (or HTML) content model. There are times, however, when doing so may be inconvenient:

  1. There are several "views" of the data. For example, Mozilla mail/news reveals the folder hierarchy in the toolbar, the "folder pane", in several menus, and in some of the dialogs. Rather than writing three pieces of JS (or C++) code to construct the DOM trees for each of the <menubutton>, <menu>, and <tree> content models, you write three compact sets of rules, one for each content model.
  2. The data can change. For example, a mail/news user may add or remove IMAP folders. (Note how this requirement complicates the task of building a content model!) The XUL template builder uses the rules to automatically keep all content models in sync with your changes.

In order to take advantage of this functionality, you must of course be able to express your information in terms of the RDF datasource API, either by using the built-in memory datasource, using RDF/XML to store your information, or writing your own implementation (possibly in JavaScript) of nsIRDFDataSource.

What gets loaded when I specify "datasources="

The datasources attribute on the "root" of a template specifies a whitespace-separated list of datasource URIs to load. But what is a "datasource URI"? It is either:

  • An abbreviated ContractID for a locally installed component. By specifying rdf:name, you instruct the template builder to load the XPCOM component with the ContractID @mozilla.org/rdf/datasource;1?name=name.
  • The URL of an RDF/XML file. For example,
    file:///tmp/foo.rdf
    chrome://mycomponent/content/component-data.rdf
    http://www.mysite.com/generate-rdf.cgi
    ftp://ftp.somewhere.org/toc.rdf
    

    The load will be processed asynchronously, and as RDF/XML arrives, the template builder will generate content.

In either case, the datasource will be loaded using the nsIRDFService's GetDataSource() method, so it will be managed similarly to all other datasources loaded this way.
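
To make the two kinds of datasource URI concrete, the sketch below splits a datasources attribute value on whitespace and classifies each entry. It only illustrates the rules described above and is not how the template builder is implemented; the function name parseDatasourcesAttribute and the kind labels are hypothetical.

```javascript
// Hypothetical helper: split a datasources attribute value into its
// URIs and say which kind each one is.
function parseDatasourcesAttribute(value) {
  return value
    .split(/\s+/)
    .filter(function (uri) { return uri.length > 0; })
    .map(function (uri) {
      // rdf:name is shorthand for a locally installed component;
      // anything else is treated as the URL of an RDF/XML file.
      var isSpecial = uri.indexOf("rdf:") === 0;
      return { uri: uri, kind: isSpecial ? "component" : "rdf/xml" };
    });
}

var parsed = parseDatasourcesAttribute("rdf:bookmarks file:///tmp/foo.rdf");
// parsed[0].kind is "component", parsed[1].kind is "rdf/xml"
```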

What is the security model for RDF/XML in XUL?

XUL that is loaded from a "trusted" URL (currently, any chrome: URL) can specify any datasource URI in the datasources attribute of the XUL template.

XUL that is loaded from an "untrusted" URL can only specify an RDF/XML document from the same codebase (in the Java sense of the word) that the XUL document originated from. No "special" (i.e., rdf:) datasources may be loaded from untrusted XUL.

How do I add a datasource to a XUL template?

Although it's possible to create a XUL template with an "implicit" set of datasources by specifying the datasources attribute, there are often times when you won't know the datasource that you want to add until after the XUL is loaded. For example, your XUL may need to compute the datasources that it wants to display in an onload handler. Or, it may need to add a datasource later based on some user action.

Here's a simple example that illustrates how to do this. Let's start with the following XUL.

<window xmlns="http://www.mozilla.org/keymaster/gat...re.is.only.xul">
  ...
  <tree id="my-tree" datasources="rdf:null">
    ...
  </tree>
  ...
</window>

Assuming that we've acquired a datasource somehow (e.g., like this), the following sample code illustrates how to add that datasource to the template, and then force the template to rebuild itself based on the newly added datasource's contents.

var ds = /* assume we got this somehow! */;
// Get the DOM element for 'my-tree'
var tree = document.getElementById('my-tree');
// Add our datasource to it
tree.database.AddDataSource(ds);
// Force the tree to rebuild *now*. You have to do this "manually"!
tree.builder.rebuild();

Any XUL element with a datasources attribute will "get" a database property and a builder property. The database property refers to an nsIRDFCompositeDataSource object, which contains the datasources from which the template is built.

The builder property refers to an nsIXULTemplateBuilder object, which is the "builder" that maintains the state of the template's contents.

Note the rdf:null datasource: this is a special datasource that says, "hey, I don't have a datasource for you yet, but I'm going to add one later, so set yourself up for it!" It causes the database and builder properties to get installed, but leaves the database empty of datasources: you've got to add these in yourself!

Can I manipulate a XUL template using the DOM APIs?

Yes: you can add rules, remove rules, change a query, and change the content that is built by a rule. In fact, you can change anything about a template using the W3C DOM APIs.

The only caveat is that you must call rebuild() before the changes you've made will take effect (just as you must if you add a datasource to the XUL template).

How do I insert plaintext from a template?

To insert plaintext in a template, use the <text> element.

<template>
  <rule>
    <query>...</query>
    <binding>...</binding>
    <action>
      <text value="?some-variable" />
    </action>
  </rule>
</template>

The above template will create a content model that runs a series of text nodes together.

Troubleshooting

Tricks and tips from the field.

My RDF/XML file won't load.

The most common cause for RDF/XML not to load from a web server is incorrect MIME type. Make sure your server is delivering the file as text/xml (recommended) or text/rdf (bogus).

Note that the W3C RDF Core WG are registering application/rdf+xml, though this isn't understood by any Mozilla code yet. (do we have a bug registered to track this? -- danbri)

Another possible problem: for remotely-loaded XUL and RDF, you might need to adjust Mozilla's security restrictions (see below for example code). If XUL isn't loading your RDF, and the MIME type is OK, this could well be your problem.

You can use the rdfcat and rdfpoll utilities to verify that the RDF/XML is actually valid. Both of these programs are built by default on Windows, and on Linux when the configure option --enable-tests is specified.

  • rdfcat url
    Takes as a parameter a URL from which to read an RDF/XML file, and will "cat" the file back to the console. You can use it to verify that the RDF/XML that you've written is being properly parsed by Mozilla.
  • rdfpoll url [interval]
    Takes as a parameter a URL from which to read an RDF/XML file. It also accepts an optional poll interval, where it will re-load the URL. It outputs the assertions that are generated from each load. Note that a polling reload generates a set of differences between the current and previous contents of the RDF/XML file. This is useful for debugging generated RDF/XML that may change over time.

Both these programs are slow to load and run (but they will run, eventually). They initialize XPCOM and bring up Necko to be able to load and process URLs just like Mozilla does.

Nothing happens after I call AddDataSource.

Note that the template builder will not rebuild the contents of a template automatically after AddDataSource or RemoveDataSource have been called on the builder's database. To refresh the template's contents, you must call elt.builder.rebuild() yourself.

Why? This was done to avoid multiple rebuilds when more than one datasource is added to the database.
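
In practice, this means you can add as many datasources as you like and pay for only one rebuild. Here is a minimal sketch; the helper name addDataSourcesAndRebuild is hypothetical, and elt is assumed to be a XUL element with a datasources attribute, so that it has the database and builder properties.

```javascript
// Hypothetical helper: add several datasources to a template's
// database, then rebuild the template exactly once.
function addDataSourcesAndRebuild(elt, datasources) {
  datasources.forEach(function (ds) {
    elt.database.AddDataSource(ds);
  });
  // A single rebuild picks up everything that was just added
  elt.builder.rebuild();
}
```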

Examples

Where can I find some (working) examples?

A few examples are posted here. Some of these are included in signed scripts, and are runnable from HTTP directly.

See also duplicates.rdf (for live RDF feed from Mozilla) alongside duplicates.xul. Note that for these to work, you have to relax Mozilla's security model. To do this, add the following line to your preferences file. (Shut down Mozilla first since it overwrites your preferences file when you quit.)

user_pref("signed.applets.codebase_principal_support", true);

Mozilla will ask you if you want to grant the scripts in duplicates.xul permission to access XPConnect; respond in the affirmative.

Currently, Mozilla does not allow unprivileged access to the RDF interfaces and services; see bug 122846 for details.

Please mail danbri, mozilla-rdf or waterson with URLs if you are aware of other examples to which we ought to link!

Notes

  1. See also W3C RDF and Semantic Web pages for more information on RDF and related technologies.

Contributors

  • Examples section added 2002-07-02 by danbri
  • Thanks to Myk Melez for notes on remote XUL / security policy

Author: Chris Waterson
