Finishing the Component

At this point you have created most of the infrastructure of the component. The component will be recognized by XPCOM and registered with the Category Manager so that it starts up when XPCOM initializes. When the component starts up, it populates a list of URLs read in from a file stored next to the Gecko binary on the local system.

Using Frozen Interfaces

The core functionality of blocking sites is still missing, however. The interfaces needed to block certain URLs from loading are not frozen, and there is still some debate about how exactly this functionality should be exposed to embedders and component developers, so the APIs are not ready to be published. This puts you in the same situation as many developers using Mozilla - you want to use some specific functionality, but the interfaces seem to change on a daily basis.

All of the Mozilla source code is publicly available, and interfaces can be used easily enough. Grab the right headers, use the Component or Service Manager to access the interface you want, and the XPCOM object(s) that implement that interface will do your bidding. With this huge amount of flexibility, however, you lose compatibility. If you use "stuff" that isn't frozen, that stuff is subject to change in future versions of Gecko.

If you want to be protected against changes in Gecko, you must only use interfaces and APIs that are clearly marked as FROZEN. The marking is made in the comments above the interface declaration. For example, take a look at the nsIServiceManager:

/**
 * The nsIServiceManager manager interface provides a means to obtain
 * global services in an application. The service manager depends
 * on the repository to find and instantiate factories to obtain
 * services.
 *
 * Users of the service manager must first obtain a pointer to the
 * global service manager by calling NS_GetServiceManager. After that,
 * they can request specific services by calling GetService.
 * When they are finished they can NS_RELEASE() the service as usual.
 *
 * A user of a service may keep references to particular services
 * indefinitely and only must call Release when it shuts down.
 *
 * @status FROZEN
 */

These frozen interfaces and functions are part of the Gecko SDK. The rule of thumb is that interfaces outside of the SDK are considered "experimental" or unfrozen. See the following sidebar for information about how frozen and unfrozen interfaces can affect your component development, and for technical details about how interface changes beneath your code can cause havoc.

The Danger of Using Unfrozen Interfaces

Suppose that you need to use the interface nsIFoo that isn't frozen. You build your component using this interface, and it works great with the version of Gecko that you have tested against. However, at some point in the future, the nsIFoo interface requires a major change: methods are reordered, some are added, and others are removed. Moreover, since this interface was never supposed to be used by clients other than Gecko or Mozilla, the maintainers of the interface don't know that it's being used, and don't change the IID of the interface. When your component runs in a version of Gecko in which this interface is updated, your method calls will be routed through a different v-table than the one the component expected, most likely resulting in a crash.

Below, the component is compiled against a version of the nsIFoo interface that has three methods. The component calls the method TestA and passes an integer, 10. This works fine in any Gecko installation where the interface has the same signature as the one the component was compiled against. However, when this same component is used in a Gecko installation where this interface has changed, the method TestA does not exist in the nsIFoo interface; the first entry in the v-table is in fact IsPrime(). When this method call is made, the code execution treats the IsPrime method as TestA. Needless to say, this is a bad thing. Furthermore, there is no easy way to detect this error at runtime.

Image:vtable-of-altered-interface.png
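The slot mix-up described in this sidebar can be imitated safely in plain C++ by modeling a v-table as an ordered array of function pointers. This is only a sketch: TestA and IsPrime come from the text, while TestB, TestC, the layouts, and CallFirstSlot are hypothetical names invented for illustration. A real mismatch happens inside compiler-generated tables and usually ends in a crash rather than a tidy wrong answer.

```cpp
#include <string>

// A hand-rolled "v-table": an ordered array of function pointers.
// Every method here has the same signature, so calling the wrong slot is
// well defined; it simply runs the wrong function.
using Slot = std::string (*)(int);

std::string TestA(int)   { return "TestA"; }
std::string TestB(int)   { return "TestB"; }    // hypothetical method
std::string TestC(int)   { return "TestC"; }    // hypothetical method
std::string IsPrime(int) { return "IsPrime"; }

// Layout the component was compiled against: three methods, TestA first.
const Slot kOldLayout[] = { TestA, TestB, TestC };

// Layout in a later Gecko: TestA is gone and IsPrime now sits in slot 0,
// but the IID was never changed.
const Slot kNewLayout[] = { IsPrime, TestC, TestB };

// The compiled component only remembers "call slot 0 with this argument".
std::string CallFirstSlot(const Slot* vtable, int arg)
{
  return vtable[0](arg);
}
```

With the old layout, CallFirstSlot(kOldLayout, 10) reaches TestA as intended. With the new layout the same call reaches IsPrime, and nothing in the calling code reveals the difference.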

Gecko developers could change the interface's IID, and some do. This can prevent many errors like this. But unfrozen interfaces are not supported in any formal way, and relying upon a different IID for any change in the interface is not a good idea either.

When using frozen interfaces, you are guaranteed compatibility with future versions of Gecko. The only trouble occurs when the compiler itself changes its v-table layout, which can happen when the compiler changes its ABI. For example, in 2002 the GNU Compiler Collection (GCC), version 3.2 changed the C++ ABI, and this caused problems between libraries compiled with GCC 3.2 and applications compiled with an earlier version and vice versa. Similar problems occurred with GCC 4.0, which underwent similar ABI changes.

Before attempting to use unfrozen interfaces, you should contact the developers who are responsible for the code you're trying to use (i.e., module owners) and ask them how best to do what you are trying to do. Be as precise as you possibly can. They may be able to suggest a supported alternative, or they may be able to notify you about pending changes.

The interface that we need for this project is something called nsIContentPolicy. At the time this book was written, this interface was under review. An interface reaches this state when a group of module owners and peers are actively engaged in discussion about how best to expose it. Usually there are only minor changes to interfaces marked with such a tag. Even with interfaces marked "under review," however, it's still a good idea to contact the module owners responsible for the interfaces you are interested in using.

Copying Interfaces into Your Build Environment

To get and implement interfaces that are not part of the Gecko SDK in your component, create a new directory in the Gecko SDK named unfrozen. Copy the headers and IDL files that you need from the content/base/public source directory of the Gecko build into this new directory. (For WebLock, all you need are the header for nsIContentPolicy and nsIContentPolicy.idl.) Then, using the same steps you used to create the Weblock.h, create a header from this IDL file using the xpidl compiler. Once you have these interface and header files, you can modify the WebLock class to implement the nsIContentPolicy interface. The WebLock class will then support four interfaces: nsISupports, nsIObserver, nsIContentPolicy, and iWeblock.

Image:weblock-implemented-ifaces.png

WebLock Interfaces

Interface Name    Defined by  Status      Summary
nsISupports       XPCOM       Frozen      Provides interface discovery and object reference counting
nsIObserver       XPCOM       Frozen      Allows message passing between objects
nsIContentPolicy  Content     Not Frozen  Interface for the policy control mechanism
iWeblock          Web Lock    Not Frozen  Enables and disables WebLock; also provides access to the whitelisted URLs

Implementing the nsIContentPolicy Interface

To implement the new interface, you must #include the unfrozen nsIContentPolicy, and you must also make sure the build system can find the file you've brought over. The location of the file and the steps for adding that location to the build system vary depending on how you build this component.

Once you have made sure that your component builds with the new header file, you must derive the WebLock class from the interface nsIContentPolicy, which you can do by adding a public base class when defining the class. At the same time, add the macro NS_DECL_NSICONTENTPOLICY to the class declaration; it declares all of the methods defined in the interface nsIContentPolicy. The updated WebLock class looks as follows:

class WebLock: public nsIObserver, public iWeblock, public nsIContentPolicy
{
  public:
    WebLock();
    virtual ~WebLock();

    NS_DECL_ISUPPORTS
    NS_DECL_NSIOBSERVER
    NS_DECL_IWEBLOCK
    NS_DECL_NSICONTENTPOLICY

  private:
    urlNode* mRootURLNode;
    PRBool   mLocked;
};

Remember to change the nsISupports implementation macro to include nsIContentPolicy. Without this change, QueryInterface will not report nsIContentPolicy, and other parts of Gecko will have no way of knowing that WebLock supports the interface.

NS_IMPL_ISUPPORTS3(WebLock, nsIObserver, iWeblock, nsIContentPolicy);
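To see why the interface list matters, here is a small standalone model of interface discovery. It is a sketch under loud assumptions: ObserverLike, LockLike, ContentPolicyLike, and MiniWebLock are hypothetical stand-ins, interfaces are identified by name strings rather than real IIDs, and reference counting is left out entirely. It does not show what the NS_IMPL_ISUPPORTS3 macro actually expands to, only the idea of one branch per supported interface.

```cpp
#include <cstring>

// Hypothetical stand-ins for nsIObserver, iWeblock, and nsIContentPolicy.
struct ObserverLike      { virtual ~ObserverLike() {} };
struct LockLike          { virtual ~LockLike() {} };
struct ContentPolicyLike {
  virtual ~ContentPolicyLike() {}
  virtual bool ShouldLoad() = 0;
};

class MiniWebLock : public ObserverLike,
                    public LockLike,
                    public ContentPolicyLike
{
public:
  bool ShouldLoad() override { return false; }

  // One branch per advertised interface. An interface without a branch is
  // invisible to callers, even though the class derives from it.
  void* QueryInterface(const char* name)
  {
    if (std::strcmp(name, "ObserverLike") == 0)
      return static_cast<ObserverLike*>(this);
    if (std::strcmp(name, "LockLike") == 0)
      return static_cast<LockLike*>(this);
    if (std::strcmp(name, "ContentPolicyLike") == 0)
      return static_cast<ContentPolicyLike*>(this);
    return nullptr;  // interface not supported
  }
};
```

If the ContentPolicyLike branch were removed, a caller asking for that interface would get a null pointer and would never reach ShouldLoad, which is the situation the updated macro avoids.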

Receiving Notifications

To receive notifications, you must register in a new category. You have already registered in a category to receive startup notification. This time, the category name to use is "content-policy". To add the WebLock component to this category, modify the WebLockRegistration callback function so that it looks like this:

static NS_METHOD WebLockRegistration(nsIComponentManager *aCompMgr,
                                     nsIFile *aPath,
                                     const char *registryLocation,
                                     const char *componentType,
                                     const nsModuleComponentInfo *info)
{
  nsresult rv;
  nsCOMPtr<nsIServiceManager> servman = do_QueryInterface((nsISupports*)aCompMgr, &rv);
  if (NS_FAILED(rv))
    return rv;

  nsCOMPtr<nsICategoryManager> catman;
  // Capture the result so that the NS_FAILED check below tests this call.
  rv = servman->GetServiceByContractID(NS_CATEGORYMANAGER_CONTRACTID,
                                       NS_GET_IID(nsICategoryManager),
                                       getter_AddRefs(catman));
  if (NS_FAILED(rv))
    return rv;

  char* previous = nsnull;
  rv = catman->AddCategoryEntry("xpcom-startup",
                                "WebLock",
                                WebLock_ContractID,
                                PR_TRUE,
                                PR_TRUE,
                                &previous);
  if (previous)
    nsMemory::Free(previous);

  rv = catman->AddCategoryEntry("content-policy",
                                "WebLock",
                                WebLock_ContractID,
                                PR_TRUE,
                                PR_TRUE,
                                &previous);
  if (previous)
    nsMemory::Free(previous);
  return rv;
}

This code adds a new entry to the "content-policy" category, calling AddCategoryEntry in the same way we did in Registering for Notifications. A similar step is required for unregistration.
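The symmetry between registration and unregistration can be sketched with a toy model of a category manager. Everything here is an assumption made for illustration: ToyCategoryManager, its simplified method signatures, RegisterWebLock, and UnregisterWebLock are hypothetical and are not Gecko's nsICategoryManager API. The point is only that every category entry added at registration needs a matching removal.

```cpp
#include <map>
#include <string>

// Toy model: category name -> (entry name -> value). Not the Gecko API.
class ToyCategoryManager
{
public:
  void AddCategoryEntry(const std::string& category,
                        const std::string& entry,
                        const std::string& value)
  {
    mCategories[category][entry] = value;
  }

  void DeleteCategoryEntry(const std::string& category,
                           const std::string& entry)
  {
    auto it = mCategories.find(category);
    if (it == mCategories.end())
      return;
    it->second.erase(entry);
    if (it->second.empty())
      mCategories.erase(it);  // drop categories that became empty
  }

  bool HasEntry(const std::string& category, const std::string& entry) const
  {
    auto it = mCategories.find(category);
    return it != mCategories.end() && it->second.count(entry) > 0;
  }

private:
  std::map<std::string, std::map<std::string, std::string>> mCategories;
};

// The two categories WebLock registers in, as named in the text.
const char* const kWebLockCategories[] = { "xpcom-startup", "content-policy" };

void RegisterWebLock(ToyCategoryManager& catman, const std::string& contractID)
{
  for (const char* category : kWebLockCategories)
    catman.AddCategoryEntry(category, "WebLock", contractID);
}

void UnregisterWebLock(ToyCategoryManager& catman)
{
  for (const char* category : kWebLockCategories)
    catman.DeleteCategoryEntry(category, "WebLock");
}
```

Keeping the category names in one shared list is a design choice that makes it harder to add a registration and forget the matching removal.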

Implementing the nsIContentPolicy

At this point, you can take the WebLock component and install it into a Gecko installation. When the component is loaded, Gecko calls the nsIContentPolicy implementation in WebLock on every page load, and this prevents pages from displaying by returning the proper value when the load method is called.

The web locking policy that we are going to put into place is quite simple: for every load request that comes through, we will ensure that the URI is in the list of "good" URLs on the white list.

If you care to extend this implementation so that the list of URLs is held remotely on a server somewhere - as might be the case when the WebLock component is used in a corporate intranet, for example - there are Networking APIs in Gecko that will support this. Or you could implement the web lock so that instead of blocking any site, the component would simply log all URLs that are loaded. In any case, the process to make the XPCOM component is the same.

The method that handles the check before page loading and the only method we care about in our own implementation of nsIContentPolicy is ShouldLoad(). The other method on the nsIContentPolicy interface is for blocking processing of specific elements in a document, but our policy is more restrictive: if the URL isn't on the white list, the entire page should be blocked. In the WebLock component, the ShouldLoad method looks like this:

NS_IMETHODIMP WebLock::ShouldLoad(PRInt32 contentType,
                                  nsIURI *contentLocation,
                                  nsISupports *ctxt,
                                  nsIDOMWindow *window,
                                  PRBool *_retval)

Uniform Resource Locators

The method passes in an interface pointer of type nsIURI, which is based on the Uniform Resource Identifier, or URI. The World Wide Web Consortium describes a URI as typically consisting of three parts:

  • The naming scheme of the mechanism used to access the resource.
  • The name of the machine hosting the resource.
  • The name of the resource itself, given as a path.

In this context, URIs are the strings used to refer to places or things on the web. This specific form of URI is called a Uniform Resource Locator, or URL. See the intro to the HTML 4 specification for more information about URIs and URLs.

Gecko encapsulates these identifiers into two interfaces, nsIURI and nsIURL. You can QueryInterface between these two interfaces. The networking library, Necko, deals only with these interfaces when handling requests. When you want to download a file using Necko, for example, all you probably have is a string that represents the URI of the file. When you pass that string to Necko, it creates an object that implements at least the nsIURI interface (and perhaps other interfaces as well).

Currently, the WebLock implementation of the ShouldLoad method compares the contentLocation parameter with each string in the white list. But it should do this comparison only for remote URLs, because we don't want to block the application from loading local content that it requires, like files it gets via the resource:// protocol. If URIs of this kind are blocked, then Gecko will not be able to start up, so we'll restrict the content policy to the HTTP, HTTPS, and FTP protocols.

Instead of extracting the string spec out of the nsIURI to do a string comparison, which would require you to do the parsing yourself, you can compare the nsIURI objects with each other, as in the following section. This ensures that the URLs are canonical before they are compared.

Checking the White List

The WebLock implementation of the ShouldLoad method starts by extracting the scheme of the incoming nsIURI. If the scheme isn't "http", "https", or "ftp", it immediately returns true, which continues the loading process unblocked.

These three are the only kinds of URI that WebLock will try to block. When it has one, it walks the linked list and creates a new nsIURI object for each string URL in the list. From each object, ShouldLoad() extracts the host and compares it to the host of the URI being loaded. If they match, the component allows the load to continue by returning true. If no entry in the list matches, the component returns false and blocks the load.
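The decision logic just described can be tried out without Gecko. The following is a minimal sketch that uses std::string in place of nsIURI. UrlNode, SchemeOf, HostOf, and ShouldLoadSketch are hypothetical names, and the two parsing helpers are deliberately naive: they assume absolute URLs of the form scheme://host[:port][/path] and do no canonicalization, which is exactly the work the real component leaves to nsIURI.

```cpp
#include <string>

// Hypothetical stand-in for the component's linked list of URL strings.
struct UrlNode
{
  std::string url;
  UrlNode*    next;
};

// Naive helper: everything before "://", or empty if there is no scheme.
std::string SchemeOf(const std::string& url)
{
  const std::string::size_type sep = url.find("://");
  return sep == std::string::npos ? std::string() : url.substr(0, sep);
}

// Naive helper: the text between "://" and the next ':' or '/'.
std::string HostOf(const std::string& url)
{
  std::string::size_type start = url.find("://");
  if (start == std::string::npos)
    return std::string();
  start += 3;
  const std::string::size_type end = url.find_first_of(":/", start);
  return url.substr(start,
                    end == std::string::npos ? std::string::npos : end - start);
}

// Returns true if the load may continue, false if it should be blocked.
bool ShouldLoadSketch(const std::string& location, const UrlNode* whiteList)
{
  const std::string scheme = SchemeOf(location);
  if (scheme != "http" && scheme != "https" && scheme != "ftp")
    return true;  // not a kind of URI this policy tries to block

  const std::string hostToLoad = HostOf(location);
  for (const UrlNode* node = whiteList; node; node = node->next) {
    if (HostOf(node->url) == hostToLoad)
      return true;  // host is on the white list
  }
  return false;     // no entry matched: block the load
}
```

Note that the comparison is by host only, so a white-list entry for one page of a site allows every page on that host, over any of the three schemes.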

URI Caching

Caching the URI would make this method implementation much faster by avoiding the need to create and destroy so many objects. This points out an important drawback of XPCOM, which is that you cannot create an object on the stack.

Creating this many objects is OK in a tight loop if the buffer of memory that holds the contents of the URLs is guaranteed to be valid for the lifetime of the object. But regardless of how optimized the implementation is with respect to its memory usage, a heap allocation will be made for every XPCOM object created.
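One way to picture the caching idea is to parse each white-list entry once, when the list is loaded, and answer every later load check from the stored hosts. The sketch below is a standalone approximation: HostCache, ParseHost, and ParseCount are hypothetical, and the naive ParseHost stands in for the more expensive step of creating an nsIURI and calling GetHost on it.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical cache: pay the parsing cost once per white-list entry
// instead of once per entry on every load.
class HostCache
{
public:
  explicit HostCache(const std::vector<std::string>& urls)
  {
    for (const std::string& url : urls)
      mHosts.insert(ParseHost(url));
  }

  // Called on every load; performs no parsing and creates no URI objects.
  bool Contains(const std::string& host) const
  {
    return mHosts.count(host) > 0;
  }

  // How many times the expensive step has run (for illustration only).
  int ParseCount() const { return mParseCount; }

private:
  // Naive stand-in for "create an nsIURI and ask it for its host".
  std::string ParseHost(const std::string& url)
  {
    ++mParseCount;
    std::string::size_type start = url.find("://");
    if (start == std::string::npos)
      return std::string();
    start += 3;
    const std::string::size_type end = url.find_first_of(":/", start);
    return url.substr(start,
                      end == std::string::npos ? std::string::npos
                                               : end - start);
  }

  std::set<std::string> mHosts;
  int mParseCount = 0;
};
```

The trade-off is that the cache must be rebuilt or updated whenever the white list changes, for example when a URL is added through the iWeblock interface.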

The string comparison with the URL type "http", "https", and "ftp" looks like this:

nsEmbedCString scheme;
contentLocation->GetScheme(scheme);

if (strcmp("http", scheme.get())  != 0 &&
    strcmp("https", scheme.get()) != 0 &&
    strcmp("ftp", scheme.get())   != 0)
{
  // this isn't a type of URI that we deal with.
  *_retval = PR_TRUE;
  return NS_OK;
}

Creating nsIURI Objects

To create an nsIURI, use nsIIOService. nsIIOService is the part of the networking library ("necko") that's responsible for kicking off network requests, managing protocols such as http, ftp, or file, and creating nsIURIs. Necko offers tremendous network functionality, but all the WebLock component needs is to create the nsIURI object that can be compared with the URIs on the white list.

Use the Service Manager to acquire the nsIIOService. Since this object is going to be used for the life of the component, it can also be cached. A good place to get an nsIIOService is in the component's Observe() method, which already has a pointer to the Service Manager. The code for getting the IO service from the Service Manager looks like this:

// Get a pointer to the IOService
rv = servMan->GetServiceByContractID("@mozilla.org/network/io-service;1",
                                     NS_GET_IID(nsIIOService),
                                     getter_AddRefs(mIOService));

Once you have this interface pointer, you can easily create nsIURI objects from a string, as in the following snippet:

nsCOMPtr<nsIURI> uri;
nsEmbedCString urlString(node->urlString);
mIOService->NewURI(urlString,
                   nsnull,
                   nsnull,
                   getter_AddRefs(uri));

This code wraps a C-string with an nsEmbedCString, which you'll recall is a string class that many of the Gecko APIs require. See String Classes in XPCOM for more information about strings.

Once the URL string is wrapped in an nsEmbedCString, it can be passed to the method NewURI. This method parses the incoming string and creates an object that implements the nsIURI interface. The two nsnull parameters passed to NewURI are used to specify the charset of the string and any base URI to use, respectively. We are assuming here that the charset of the URL string is UTF-8, and also assuming that every URL string is absolute. See the intro to the HTML 4 specification for more information about relative URLs.

Here is the complete implementation of the ShouldLoad() method:

NS_IMETHODIMP
WebLock::ShouldLoad(PRInt32 contentType,
                    nsIURI *contentLocation,
                    nsISupports *ctxt,
                    nsIDOMWindow *window,
                    PRBool *_retval)
{
  if (!contentLocation)
    return NS_ERROR_FAILURE;

  nsEmbedCString scheme;
  contentLocation->GetScheme(scheme);

  if (strcmp("http", scheme.get())  != 0 &&
      strcmp("https", scheme.get()) != 0 &&
      strcmp("ftp",  scheme.get())  != 0)
  {
    // this isn't a type of URI that we deal with
    *_retval = PR_TRUE;
    return NS_OK;
  }

  nsEmbedCString hostToLoad;
  contentLocation->GetHost(hostToLoad);

  // Assume failure.  Do not allow this nsIURI to load.
  *_retval = PR_FALSE;

  nsresult rv;

  nsCOMPtr<nsIServiceManager> servMan;
  rv = NS_GetServiceManager(getter_AddRefs(servMan));
  if (NS_FAILED(rv))
    return rv;

  nsCOMPtr<nsIIOService> mIOService;
  // Get a pointer to the IOService
  rv = servMan->GetServiceByContractID("@mozilla.org/network/io-service;1",
                                       NS_GET_IID(nsIIOService),
                                       getter_AddRefs(mIOService));
  if (NS_FAILED(rv))
    return rv;

  urlNode* node = mRootURLNode;

  while (node)
  {
    nsCOMPtr<nsIURI> uri;
    nsEmbedCString urlString(node->urlString);
    rv = mIOService->NewURI(urlString, nsnull,  nsnull, getter_AddRefs(uri));

    // if anything bad happens, just abort
    if (NS_FAILED(rv))
      return rv;

    nsEmbedCString host;
    uri->GetHost(host);

    if (strcmp(hostToLoad.get(), host.get()) == 0)
    {
      // match found.  Allow this nsIURI to load
      *_retval = PR_TRUE;
      return NS_OK;
    }
    node = node->next;
  }
  return NS_OK;
}

At this point, all of the backend work is complete. You can of course improve this backend in many ways, but this example presents the basic creation of what is commonly referred to as a "browser helper object" like WebLock. The next chapter looks at how to tie this into the front end - specifically, how to use XPConnect to access and control this component from JavaScript in the user interface.

Copyright (c) 2003 by Doug Turner and Ian Oeschger. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.02 or later. Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder. Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright holder.