An Overview of XPCOM

This is a book about XPCOM. The book is presented as a tutorial about creating XPCOM components, but it covers all major aspects, concepts, and terminology of the XPCOM component model along the way.

This chapter provides a quick tour of XPCOM - an introduction to the basic concepts and technologies in XPCOM and component development. The brief sections in this chapter introduce the concepts at a very high level, so that we can discuss and use them with more familiarity in the tutorial itself, which describes the creation of a Mozilla component called WebLock.

The XPCOM Solution

The Cross Platform Component Object Model (XPCOM) is a framework that allows developers to break up monolithic software projects into smaller, modularized pieces. These pieces, known as components, are then assembled back together at runtime.

The goal of XPCOM is to allow different pieces of software to be developed and built independently of one another. In order to allow interoperability between components within an application, XPCOM separates the implementation of a component from the interface, which we discuss in Interfaces. But XPCOM also provides several tools and libraries that enable the loading and manipulation of these components, services that help the developer write modular cross-platform code, and versioning support, so that components can be replaced or upgraded without breaking or having to recreate the application. Using XPCOM, developers create components that can be reused in different applications or that can be replaced to change the functionality of existing applications.

XPCOM not only supports component software development, it also provides much of the functionality that a development platform provides, such as:

  • component management
  • file abstraction
  • object message passing
  • memory management

We will discuss the above items in detail in the coming chapters, but for now, it can be useful to think of XPCOM as a platform for component development, in which features such as those listed above are provided.

Gecko

Although it is in some ways structurally similar to Microsoft COM, XPCOM is designed to be used principally at the application level. The most important use of XPCOM is within Gecko, an open source, standards compliant, embeddable web browser and toolkit for creating web browsers and other applications.

XPCOM is the means of accessing Gecko library functionality and embedding or extending Gecko. This book focuses on the latter - extending Gecko - but the fundamental ideas in the book will be important to developers embedding Gecko as well.

Gecko is used in many internet applications, mostly browsers and most notably Mozilla Firefox.

Components

XPCOM allows you to break up large software projects into smaller pieces known as components. They are usually contained in reusable binary libraries (a DLL on Windows, for example, or a DSO on Unix), which can include one or more components. When two or more related components are grouped together in a binary library, the library is referred to as a module.

Modular, component-based programming makes software easier to develop and maintain and has some well-known advantages:

  • Reuse: Modular code can be reused in other applications and other contexts.
  • Updates: You can update components without having to recompile the whole application.
  • Performance: When code is modularized, modules that are not necessary right away can be "lazy loaded", or not loaded at all, which can improve the performance of your application.
  • Maintenance: Even when you are not updating a component, designing your application in a modular way can make it easier for you to find and maintain the parts of the application that you are interested in.

Mozilla has over four million lines of code, and no single individual understands the entire codebase. The best way to tackle a project of this size is to divide it into smaller, more manageable pieces, to use a component programming model, and to organize related sets of components into modules. The network library, for example, consists of components for each of the protocols (HTTP, FTP, and others), which are bundled together and linked into a single library. This library is the networking module, also known as "necko."

But it's not always a good idea to divide things up. There are some things in the world that just go together, and others that shouldn't be apart. For example, one author's son will not eat a peanut-butter sandwich if there isn't jam on it, because in his world, peanut butter and jam form an indelible union. Some software is the same. In areas of code that are tightly coupled (in classes that are only used internally, for example), the expensive work of dividing things up may not be worth the effort.

The HTTP component in Gecko doesn't expose private classes it uses as separate components. The "stuff" that's internal to the component stays internal, and isn't exposed to XPCOM. In the haste of early Mozilla development, components were created where they were inappropriate, but there's been an ongoing effort to remove XPCOM from places like this.

Interfaces

How do you decide when to break apart your code? The basic idea is to identify the pieces of functionality that are related and understand how they communicate with each other. The communication channels between different components form boundaries between those components, and when those boundaries are formalized they are known as interfaces.

Interfaces aren't a new idea in programming. We've all used interfaces since our first "HelloWorld" program, where the interface was between the code we actually wrote -- the application code -- and the printing code. The application code used an interface from a library, stdio, to print the "hello world" string out to the screen. The difference here is that a "HelloWorld" application in XPCOM finds this screen-printing functionality at runtime and never has to know about stdio when it's compiled.

Interfaces allow developers to encapsulate the implementation and inner workings of their software, and allow clients to ignore how things are made and just use that software.

Interfaces and Programming by Contract

An interface forms a contractual agreement between components and clients. There is no code that enforces these agreements, but ignoring them can be fatal. In component-based programming, a component guarantees that the interfaces it provides will be immutable - that they will provide the same access to the same methods across different versions of the component - establishing a contract with the software clients that use it. In this respect, interface-based programming is often referred to as programming by contract.

Interfaces and Encapsulation

Between component boundaries, abstraction is crucial for software maintainability and reusability. Consider, for example, a class that isn't well encapsulated. Using a freely available public initialization method, as the example below suggests, can cause problems.

SomeClass Class Initialization

class SomeClass
{
  public:
    // Constructor
    SomeClass();

    // Virtual Destructor
    virtual ~SomeClass();

    // init method; returns false if initialization fails
    bool Init();

    void DoSomethingUseful();
};

For this system to work properly, the client programmer must pay close attention to whatever rules the component programmer has established. This is the contractual agreement of this unencapsulated class: a set of rules that define when each method can be called and what it is expected to do. One rule might specify that DoSomethingUseful may only be called after a call to Init(). The DoSomethingUseful method may do some kind of checking to ensure that the condition - that Init has been called - has been satisfied.
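The kind of checking described above can be sketched as follows. This is only an illustration: GuardedClass, its mInitialized flag, and the boolean return value are hypothetical names chosen for this sketch, not part of SomeClass as declared in the listing.

```cpp
#include <cassert>

// Hypothetical variant of SomeClass that enforces the "Init first" rule.
class GuardedClass
{
  public:
    GuardedClass() : mInitialized(false), mCalls(0) {}

    // Must be called before DoSomethingUseful().
    void Init() { mInitialized = true; }

    // Returns false, and does nothing, if Init() has not been called yet.
    bool DoSomethingUseful()
    {
      if (!mInitialized) {
        return false;
      }
      ++mCalls;
      return true;
    }

    int Calls() const { return mCalls; }

  private:
    bool mInitialized;
    int mCalls;
};

// Usage: the call is rejected before Init() and accepted afterwards.
bool GuardedClassDemo()
{
  GuardedClass obj;
  assert(!obj.DoSomethingUseful());   // contract violated: rejected
  obj.Init();
  assert(obj.DoSomethingUseful());    // contract satisfied: accepted
  return obj.Calls() == 1;
}
```

A runtime check like this catches misuse, but the client still has to know the rule exists, which is why hiding Init() altogether is the cleaner option.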

In addition to writing well-commented code that tells the client developer the rules about Init(), the developer can take a couple of steps to make this contract even clearer. First, the construction of an object can be encapsulated, and an abstract class provided that declares the DoSomethingUseful method. In this way, construction and initialization can be completely hidden from clients of the class. In this "semi-encapsulated" situation, the only part of the class that is exposed is a well-defined list of callable methods (i.e., the interface). Once the class is encapsulated, the only interface the client will see is this:

Encapsulation of SomeInterface

class SomeInterface
{
  public:
    virtual void DoSomethingUseful() = 0;
};

The implementation can then derive from this class and implement the virtual method. Clients of this code can then use a factory design pattern to create the object (see Factories) and further encapsulate the implementation. In XPCOM, clients are shielded from the inner workings of components in this way and rely on the interface to provide access to the needed functionality.
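As a minimal sketch of this arrangement, the following shows an implementation deriving from SomeInterface and a client that depends only on the interface. SomeClassImpl, UseIt, and the call counter are hypothetical names for illustration, and the virtual destructor on SomeInterface is an addition that plain C++ code usually wants (XPCOM interfaces rely on reference counting instead).

```cpp
#include <cassert>

// The interface: the only thing clients see.
class SomeInterface
{
  public:
    virtual ~SomeInterface() {}
    virtual void DoSomethingUseful() = 0;
};

// Hypothetical implementation, hidden from clients.
class SomeClassImpl : public SomeInterface
{
  public:
    SomeClassImpl() : mUsefulCalls(0) {}
    virtual void DoSomethingUseful() { ++mUsefulCalls; }
    int UsefulCalls() const { return mUsefulCalls; }

  private:
    int mUsefulCalls;
};

// Client code: written against SomeInterface only.
void UseIt(SomeInterface& iface)
{
  iface.DoSomethingUseful();
}

// Usage: the client drives the implementation through the interface.
int InterfaceDemo()
{
  SomeClassImpl impl;
  assert(impl.UsefulCalls() == 0);
  UseIt(impl);
  UseIt(impl);
  return impl.UsefulCalls();
}
```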

The nsISupports Base Interface

Two fundamental issues in component and interface-based programming are component lifetime, also called object ownership, and interface querying, or being able to identify which interfaces a component supports at runtime. This section introduces nsISupports, the base interface (the mother of all interfaces in XPCOM), which provides solutions to both of these issues for XPCOM developers.

Object Ownership

In XPCOM, since components may implement any number of different interfaces, components must be reference counted. When a component gets created, an integer inside the component tracks the number of clients who hold an interface to the component. This number is known as the reference count. The reference count is incremented automatically when the client instantiates the component; over the course of the component's life, the reference count goes up and down. When all clients lose interest in the component, the reference count hits zero, and the component deletes itself.

When clients use interfaces responsibly, this can be a very straightforward process, and XPCOM has tools to make it even easier, as we describe later. But reference counting can raise some real housekeeping problems when, for example, a client uses an interface and forgets to decrement the reference count. When this happens, the object may never be released, and its memory leaks. The system of reference counting is, like many things in XPCOM, a contract between clients and implementations. It works when people agree to it, but when they don't, things can go wrong. It is the responsibility of the function that creates the interface pointer to add the initial reference, or owning reference, to the count.
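The ownership contract can be illustrated with a small stand-alone sketch. Counted is a stand-in class, not the real nsISupports, and the sLive counter and the CreateCounted function are hypothetical helpers added only so that the effect of balanced AddRef and Release calls can be observed.

```cpp
#include <cassert>

// Stand-in for a reference-counted component; not the real nsISupports.
class Counted
{
  public:
    // Number of Counted objects currently alive, used here to detect leaks.
    static int sLive;

    Counted() : mRefCnt(0) { ++sLive; }

    unsigned long AddRef() { return ++mRefCnt; }

    unsigned long Release()
    {
      if (--mRefCnt == 0) {
        delete this;
        return 0;
      }
      return mRefCnt;
    }

  private:
    ~Counted() { --sLive; }   // only Release() may destroy the object
    unsigned long mRefCnt;
};

int Counted::sLive = 0;

// The creating function adds the initial (owning) reference.
Counted* CreateCounted()
{
  Counted* obj = new Counted();
  obj->AddRef();
  return obj;
}

// Returns the number of objects still alive after balanced use.
int BalancedUse()
{
  Counted* owner = CreateCounted();   // reference count is 1
  Counted* second = owner;            // a second client shares the object
  second->AddRef();                   // reference count is 2
  assert(Counted::sLive == 1);
  second->Release();                  // reference count is 1
  owner->Release();                   // reference count is 0; object deletes itself
  return Counted::sLive;
}
```

If either Release call were omitted, sLive would stay at 1 after BalancedUse returns, which is exactly the leak described above.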

Pointers in XPCOM

In XPCOM, the term pointer usually refers to an interface pointer. The difference is a subtle one, since interface pointers and regular pointers are both just addresses in memory. But an interface pointer is known to implement the nsISupports base interface, and so can be used to call methods such as AddRef, Release, or QueryInterface.

nsISupports supplies the basic functionality for dealing with interface discovery and reference counting. The members of this interface, QueryInterface, AddRef, and Release, provide the basic means for getting the right interface from an object, incrementing the reference count, and releasing objects once they are not being used, respectively. The declaration of a class that implements nsISupports is shown below:

The nsISupports Interface

class Sample: public nsISupports
{
  private:
    nsrefcnt mRefCnt;
  public:
    Sample();
    virtual ~Sample();

    NS_IMETHOD QueryInterface(const nsIID &aIID, void **aResult);
    NS_IMETHOD_(nsrefcnt) AddRef(void);
    NS_IMETHOD_(nsrefcnt) Release(void);
};

The various types used in the interface are described in the XPCOM Types section below. A complete (if spare) implementation of the nsISupports interface is shown below. See A Reference Implementation of QueryInterface for detailed information.

Implementation of nsISupports Interface

// initialize the reference count to 0
Sample::Sample() : mRefCnt(0)
{
}
Sample::~Sample()
{
}

// typical, generic implementation of QI
NS_IMETHODIMP Sample::QueryInterface(const nsIID &aIID,
                                  void **aResult)
{
  if (!aResult) {
    return NS_ERROR_NULL_POINTER;
  }
  *aResult = NULL;
  if (aIID.Equals(kISupportsIID)) {
    *aResult = (void *) this;
  }
  if (!*aResult) {
    return NS_ERROR_NO_INTERFACE;
  }
  // add a reference
  AddRef();
  return NS_OK;
}

NS_IMETHODIMP_(nsrefcnt) Sample::AddRef()
{
  return ++mRefCnt;
}

NS_IMETHODIMP_(nsrefcnt) Sample::Release()
{
  if (--mRefCnt == 0) {
    delete this;
    return 0;
  }
  // optional: return the reference count
  return mRefCnt;
}

Object Interface Discovery

Inheritance is another very important topic in object oriented programming. Inheritance is the means through which one class is derived from another. When a class inherits from another class, the inheriting class may override the default behaviors of the base class without having to copy all of that class's code, in effect creating a more specific class, as in the following example:

Simple Class Inheritance

class Shape
{
  private:
    int m_x;
    int m_y;

  public:
    virtual void Draw() = 0;
    Shape();
    virtual ~Shape();
};

class Circle : public Shape
{
  private:
    int m_radius;
  public:
    virtual void Draw();
    Circle(int x, int y, int radius);
    virtual ~Circle();
};

Circle is a derived class of Shape. A Circle is a Shape, in other words, but a Shape is not necessarily a Circle. In this case, Shape is the base class and Circle is a subclass of Shape.

In XPCOM, all classes derive from the nsISupports interface, so all objects are nsISupports but they are also other, more specific classes, which you need to be able to find out about at runtime. In Simple Class Inheritance, for example, you'd like to be able to ask the Shape if it's a Circle and to be able to use it like a circle if it is. In XPCOM, this is what the QueryInterface feature of the nsISupports interface is for: it allows clients to find and access different interfaces based on their needs.

In C++, you can use a fairly advanced feature known as dynamic_cast<>, which returns a null pointer (or, when casting references, throws an exception) if the Shape object cannot be cast to a Circle. But enabling exceptions and RTTI may not be an option because of performance overhead and compatibility problems on many platforms, so XPCOM does things differently.
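For comparison, here is a small sketch of what dynamic_cast<> does in standard C++: the pointer form yields a null pointer on failure, while the reference form throws std::bad_cast. The ShapeBase, CircleShape, and SquareShape classes are simplified stand-ins for the Shape example, named differently to keep the sketch self-contained.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

class ShapeBase
{
  public:
    virtual ~ShapeBase() {}
};

class CircleShape : public ShapeBase {};
class SquareShape : public ShapeBase {};

// Pointer form: a failed cast yields a null pointer.
bool IsCircle(ShapeBase* shape)
{
  return dynamic_cast<CircleShape*>(shape) != nullptr;
}

// Reference form: a failed cast throws std::bad_cast.
bool ReferenceCastThrows(ShapeBase& shape)
{
  try {
    CircleShape& circle = dynamic_cast<CircleShape&>(shape);
    (void) circle;   // cast succeeded
    return false;
  } catch (const std::bad_cast&) {
    return true;
  }
}

// Usage: query two different shapes both ways.
bool DynamicCastDemo()
{
  CircleShape circle;
  SquareShape square;
  assert(IsCircle(&circle));
  assert(!IsCircle(&square));
  return ReferenceCastThrows(square) && !ReferenceCastThrows(circle);
}
```

Both forms depend on RTTI being enabled, and the reference form also depends on exceptions.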

Exceptions in XPCOM

C++ exceptions are not supported directly by XPCOM. Instead all exceptions must be handled within a given component, before crossing interface boundaries. In XPCOM, all interface methods should return an nsresult error value (see the XPCOM API Reference for a listing of these error codes). These error code results become the "exceptions" that XPCOM handles.
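A minimal sketch of this rule follows: an internal helper is free to throw, but the interface-style method catches everything and reports an error code instead. DemoResult and the DEMO_ constants are stand-ins for nsresult and its codes, and ParsePositive and GetPositive are hypothetical functions invented for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-ins for nsresult and a few of its codes (assumed values).
typedef unsigned int DemoResult;
const DemoResult DEMO_OK                 = 0;
const DemoResult DEMO_ERROR_NULL_POINTER = 0x80004003u;
const DemoResult DEMO_ERROR_FAILURE      = 0x80004005u;

// Internal helper: free to throw, because it never crosses an interface.
static int ParsePositive(const std::string& text)
{
  int value = std::stoi(text);   // throws std::invalid_argument on bad input
  if (value <= 0) {
    throw std::out_of_range("value must be positive");
  }
  return value;
}

// Interface-style method: no exception leaves it; failures become codes.
DemoResult GetPositive(const std::string& text, int* aResult)
{
  if (!aResult) {
    return DEMO_ERROR_NULL_POINTER;
  }
  try {
    *aResult = ParsePositive(text);
  } catch (const std::exception&) {
    return DEMO_ERROR_FAILURE;
  }
  return DEMO_OK;
}

// Usage: callers inspect the returned code rather than catching anything.
bool ExceptionBoundaryDemo()
{
  int value = 0;
  assert(GetPositive("42", &value) == DEMO_OK);
  assert(value == 42);
  return GetPositive("abc", &value) == DEMO_ERROR_FAILURE;
}
```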

Instead of leveraging C++ RTTI, XPCOM uses the special QueryInterface method that casts the object to the right interface if that interface is supported.

Every interface is assigned an identifier that gets generated from a tool commonly named "uuidgen". This universally unique identifier (UUID) is a unique, 128-bit number. Used in the context of an interface (as opposed to a component, which is what the contract ID is for), this number is called an IID.

When a client wants to discover if an object supports a given interface, the client passes the IID assigned to that interface into the QueryInterface method of that object. If the object supports the requested interface, it adds a reference to itself and passes back a pointer to that interface. If the object does not support the interface an error is returned.

class nsISupports {
  public:
    virtual long QueryInterface(const nsIID & uuid,
                                void **result) = 0;
    virtual long AddRef(void) = 0;
    virtual long Release(void) = 0;
};

The first parameter of QueryInterface is a reference to a class named nsIID, which is a basic encapsulation of the IID. Of the three methods on the nsIID class, Equals, Parse, and ToString, Equals is by far the most important, because it is used to compare two nsIIDs in this interface querying process.

When you implement the nsISupports class (and you'll see in the chapter Using XPCOM Utilities to Make Things Easier how macros can make this process much easier), you must make sure the class methods return a valid result when the client calls QueryInterface with the nsISupports IID. QueryInterface should support all interfaces that the component supports.

In implementations of QueryInterface, the IID argument is compared against the IID of each interface the class supports. If there is a match, the object's this pointer is cast to void*, the reference count is incremented, and the interface is returned to the caller. If there isn't a match, the method sets the out value to null and returns an error.

In the example above, it's easy enough to use a C-style cast. But casting can become more involved when a class implements several interfaces: you must first cast to the requested interface type and then to void*, because you must return the interface pointer whose vtable corresponds to the requested interface. Casting can also become a problem when there is an ambiguous inheritance hierarchy.
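The casting issue is easier to see in a sketch of a class that implements two interfaces. Everything here is a simplified stand-in: DemoIID is an enum rather than a 128-bit nsIID, IBase plays the role of nsISupports, the int return values stand in for nsresult, and IReader, IWriter, and ReaderWriter are hypothetical names.

```cpp
#include <cassert>

// Stand-in identifiers; real XPCOM compares 128-bit nsIID values.
enum DemoIID { kBaseIID, kReaderIID, kWriterIID, kUnknownIID };

class IBase
{
  public:
    virtual int QueryInterface(DemoIID aIID, void** aResult) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

class IReader : public IBase { public: virtual int Read() = 0; };
class IWriter : public IBase { public: virtual int Write() = 0; };

class ReaderWriter : public IReader, public IWriter
{
  public:
    ReaderWriter() : mRefCnt(0) {}

    virtual int QueryInterface(DemoIID aIID, void** aResult)
    {
      if (!aResult) return -1;
      *aResult = nullptr;
      // Cast to the requested interface first, then let it convert to void*.
      if (aIID == kReaderIID) {
        *aResult = static_cast<IReader*>(this);
      } else if (aIID == kWriterIID) {
        *aResult = static_cast<IWriter*>(this);
      } else if (aIID == kBaseIID) {
        // IBase is an ambiguous base here, so one path is chosen explicitly.
        *aResult = static_cast<IBase*>(static_cast<IReader*>(this));
      } else {
        return -1;   // interface not supported
      }
      AddRef();
      return 0;
    }

    virtual unsigned long AddRef() { return ++mRefCnt; }
    virtual unsigned long Release()
    {
      if (--mRefCnt == 0) { delete this; return 0; }
      return mRefCnt;
    }
    virtual int Read() { return 1; }
    virtual int Write() { return 2; }

  private:
    virtual ~ReaderWriter() {}
    unsigned long mRefCnt;
};

// Usage: ask for IWriter, call through it, then ask for something unsupported.
int QueryDemo()
{
  ReaderWriter* obj = new ReaderWriter();
  obj->AddRef();                                  // owning reference

  void* raw = nullptr;
  int rv = obj->QueryInterface(kWriterIID, &raw);
  assert(rv == 0);
  IWriter* writer = static_cast<IWriter*>(raw);   // valid: raw holds an IWriter*
  int written = writer->Write();
  writer->Release();

  rv = obj->QueryInterface(kUnknownIID, &raw);
  assert(rv != 0 && raw == nullptr);

  obj->Release();                                 // last reference; object is deleted
  return written;
}
```

Storing a plain (void *) this for every IID would hand back the address of the first base subobject, which is not a valid IWriter pointer in this layout.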

XPCOM Identifiers

In addition to the IID interface identifier discussed in the previous section, XPCOM uses two other very important identifiers to distinguish classes and components.

XPCOM Identifier Classes

The nsIID class is actually a typedef for the nsID class, as is nsCID. These typedefs refer to a specific interface and to a specific implementation of a concrete class, respectively.

The nsID class provides methods like Equals for comparing identifiers in the code. See Identifiers in XPCOM for more discussion of the nsID class.

CID

A CID is a 128-bit number that uniquely identifies a class or component in much the same way that an IID uniquely identifies an interface. The two kinds of identifier share the same format. The IID for nsISupports, for example, looks like this:

00000000-0000-0000-c000-000000000046

The length of a CID can make it cumbersome to deal with in the code, so very often you see #defines for CIDs and other identifiers being used, as in this example:

#define SAMPLE_CID \
{ 0x777f7150, 0x4a2b, 0x4301, \
{ 0xad, 0x10, 0x5e, 0xab, 0x25, 0xb3, 0x22, 0xaa}}

You also see NS_DEFINE_CID used a lot. This simple macro declares a constant with the value of the CID:

static NS_DEFINE_CID(kWebShellCID, NS_WEB_SHELL_CID);

A CID is sometimes also referred to as a class identifier. If the class to which a CID refers implements more than one interface, that CID guarantees that the class implements that whole set of interfaces when it's published or frozen.
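To make the #define above more concrete, the following sketch initializes an identifier structure from it and compares two identifiers. DemoID is a simplified stand-in for nsID, assuming a layout of one 32-bit field, two 16-bit fields, and eight bytes, which is the shape the initializer suggests; kSampleCID, kZeroID, and IdentifierDemo are hypothetical names.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-in for nsID (assumed layout: 32 + 16 + 16 + 8x8 bits).
struct DemoID
{
  unsigned int   m0;
  unsigned short m1;
  unsigned short m2;
  unsigned char  m3[8];

  // Two identifiers are equal when all 128 bits match.
  bool Equals(const DemoID& other) const
  {
    return m0 == other.m0 && m1 == other.m1 && m2 == other.m2 &&
           std::memcmp(m3, other.m3, sizeof(m3)) == 0;
  }
};

#define SAMPLE_CID \
{ 0x777f7150, 0x4a2b, 0x4301, \
{ 0xad, 0x10, 0x5e, 0xab, 0x25, 0xb3, 0x22, 0xaa}}

// The macro expands to an aggregate initializer for the structure.
static const DemoID kSampleCID = SAMPLE_CID;
static const DemoID kZeroID = { 0, 0, 0, { 0, 0, 0, 0, 0, 0, 0, 0 } };

// Usage: a copy compares equal, a different identifier does not.
bool IdentifierDemo()
{
  DemoID copy = kSampleCID;
  assert(copy.Equals(kSampleCID));
  return !kSampleCID.Equals(kZeroID);
}
```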

Contract ID

A contract ID is a human readable string used to access a component. A CID or a contract ID may be used to get a component from the component manager. This is the contract ID for the LDAP Operation component:

"@mozilla.org/network/ldap-operation;1"

The format of the contract ID is an @ sign followed by the domain of the component, the module, and the component name, separated by slashes, and then a semicolon and the version number.

Like a CID, a contract ID refers to an implementation rather than an interface, as an IID does. But a contract ID is not bound to any specific implementation, as the CID is, and is thus more general. Instead, a contract ID only specifies a given set of interfaces that it wants implemented, and any number of different CIDs may step in and fill that request. This difference between a contract ID and a CID is what makes it possible to override components.
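The way a contract ID allows overriding can be sketched with a toy registry that maps contract IDs to CIDs. This is not how the XPCOM component manager is implemented; DemoRegistry, the "@example.org/demo/widget;1" contract ID, and the CID strings are all hypothetical and exist only to show the idea of rebinding.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry: contract ID -> CID (kept as strings for brevity).
class DemoRegistry
{
  public:
    // Registering the same contract ID again overrides the earlier CID.
    void Register(const std::string& contractID, const std::string& cid)
    {
      mContracts[contractID] = cid;
    }

    // Returns the CID currently bound to contractID, or "" if there is none.
    std::string Lookup(const std::string& contractID) const
    {
      std::map<std::string, std::string>::const_iterator it =
          mContracts.find(contractID);
      return it == mContracts.end() ? std::string() : it->second;
    }

  private:
    std::map<std::string, std::string> mContracts;
};

// Usage: clients keep asking for the same contract ID while the
// implementation behind it is replaced.
bool OverrideDemo()
{
  DemoRegistry registry;
  const std::string contract = "@example.org/demo/widget;1";

  registry.Register(contract, "cid-of-original-implementation");
  assert(registry.Lookup(contract) == "cid-of-original-implementation");

  registry.Register(contract, "cid-of-replacement-implementation");
  return registry.Lookup(contract) == "cid-of-replacement-implementation";
}
```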

Factories

Once code is broken up into components, client code typically uses the new operator to instantiate objects for use:

SomeClass* component = new SomeClass();

This pattern requires that the client know something about the component, however: how big it is, at the very least. The factory design pattern can be used to encapsulate object construction. The goal of a factory is to create an object without exposing clients to the implementation and initialization of this object. In the SomeClass example, the construction and initialization of SomeClass, which implements the SomeInterface abstract class, is contained within the New_SomeInterface function, which follows the factory design pattern:

Encapsulating the Constructor

int New_SomeInterface(SomeInterface** ret)
{
  // create the object
  SomeClass* out = new SomeClass();
  if (!out) return -1;

  // init the object (Init() is assumed to return false on failure)
  if (!out->Init())
  {
    delete out;
    return -1;
  }

  // cast to the interface
  *ret = static_cast<SomeInterface*>(out);
  return 0;
}

The factory is the class that actually manages the creation of separate instances of a component for use. In XPCOM, factories are implementations of the nsIFactory interface, and they use a factory design pattern like the example above to abstract and encapsulate object construction and initialization.

The example in Encapsulating the Constructor is a simple and stateless version of factories, but real world programming isn't usually so simple, and in general factories need to store state. At a minimum, the factory needs to preserve information about what objects it has created. When a factory manages instances of a class built in a dynamic shared library, for example, it needs to know when it can unload the library. When the factory preserves state, you can ask if there are outstanding references and find out if the factory created any objects.

Another state that a factory can save is whether or not an object is a singleton. For example, if a factory creates an object that is supposed to be a singleton, then subsequent calls to the factory for the object should return the same object. Though there are tools and better ways to handle singletons (which we'll discuss when we talk about the nsIServiceManager), a developer may want to use this information to ensure that only one singleton object can exist despite what the callers do.

The requirements of a factory class can be handled in a strictly functional way, with state being held by global variables, but there are benefits to using classes for factories. When you use a class to implement the functionality of a factory, for example, you derive from the nsISupports interface, which allows you to manage the lifetime of the factory objects themselves. This is important when you want to group sets of factories together and determine if they can be unloaded. Another benefit of using the nsISupports interface is that you can support other interfaces as they are introduced. As we'll show when we discuss nsIClassInfo, some factories support querying information about the underlying implementation, such as what language the object is written in, interfaces that the object supports, etc. This kind of "future-proofing" is a key advantage that comes along with deriving from nsISupports.
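A class-based factory that keeps state might look roughly like the sketch below. It is deliberately much simpler than nsIFactory: Widget and WidgetFactory are hypothetical names, and a plain counter of outstanding objects stands in for the reference counting a real XPCOM factory would use.

```cpp
#include <cassert>

// Hypothetical component; only WidgetFactory may construct or destroy it.
class Widget
{
  public:
    int Id() const { return mId; }

  private:
    friend class WidgetFactory;
    explicit Widget(int id) : mId(id) {}
    ~Widget() {}
    int mId;
};

// Hypothetical factory that remembers what it has created.
class WidgetFactory
{
  public:
    WidgetFactory() : mCreated(0), mOutstanding(0) {}

    Widget* Create()
    {
      ++mOutstanding;
      return new Widget(++mCreated);
    }

    void Destroy(Widget* widget)
    {
      delete widget;
      --mOutstanding;
    }

    // Did the factory ever create an object?
    int Created() const { return mCreated; }

    // True when no created object is still alive, for example before
    // the library that contains the implementation is unloaded.
    bool CanUnload() const { return mOutstanding == 0; }

  private:
    int mCreated;
    int mOutstanding;
};

// Usage: the factory refuses to report "unloadable" while objects are alive.
bool FactoryDemo()
{
  WidgetFactory factory;
  Widget* a = factory.Create();
  Widget* b = factory.Create();
  assert(a->Id() == 1 && b->Id() == 2);
  assert(!factory.CanUnload());
  factory.Destroy(a);
  factory.Destroy(b);
  return factory.CanUnload() && factory.Created() == 2;
}
```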

XPIDL and Type Libraries

An easy and powerful way to define an interface - indeed, a requirement for defining interfaces in a cross-platform, language neutral development environment - is to use an interface definition language (IDL). XPCOM uses its own variant of the CORBA OMG Interface Definition Language (IDL) called XPIDL, which allows you to specify methods, attributes and constants of a given interface, and also to define interface inheritance.

There are some drawbacks to defining your interface using XPIDL. There is no support for multiple inheritance, for one thing. If you define a new interface, it cannot derive from more than one interface. Another limitation of interfaces in XPIDL is that method names must be unique. You cannot have two methods with the same name that take different parameters, and the workaround - having multiple function names - isn't pretty:

void FooWithInt(in int x);
void FooWithString(in string x);
void FooWithURI(in nsIURI x);

However, these shortcomings pale in comparison to the functionality gained by using XPIDL. XPIDL allows you to generate type libraries, or typelibs, which are files with the extension .xpt. The type library is a binary representation of an interface or interfaces. It provides programmatic access to the interface, which is crucial for interfaces used in the non-C++ world. When components are accessed from other languages, as they can be in XPCOM, they use the binary type library to access the interface, learn what methods it supports, and call those methods. This aspect of XPCOM is called XPConnect. XPConnect is the layer of XPCOM that provides access to XPCOM components from languages such as JavaScript. See Connecting to Components from the Interface for more information about XPConnect.

When a component is accessible from a language other than C++, such as JavaScript, its interface is said to be "reflected" into that language. Every reflected interface must have a corresponding type library. Currently you can write components in C, C++, or JavaScript (and sometimes Python or Java, depending on the state of the respective bindings), and there are efforts underway to build XPCOM bindings for Ruby and Perl as well.

Writing Components in Other Languages

Though you do not have access to some of the tools that XPCOM provides for C++ developers (such as macros, templates, smart pointers, and others) when you create components in other languages, you may be so comfortable with the language itself that you can eschew C++ altogether and build, for example, Python-based XPCOM components that can be used from JavaScript or vice versa.

See Resources for more information about Python and other languages for which support has been added in XPCOM.

All of the public interfaces in XPCOM are defined using the XPIDL syntax. Type libraries and C++ header files are generated from these IDL files, and the tool that generates these files is called the xpidl compiler. The section Defining the WebLock Interface in XPIDL describes the XPIDL syntax in detail.

XPCOM Services

When clients use components, they typically instantiate a new object each time they need the functionality the component provides. This is the case when, for example, clients deal with files: each separate file is represented by a different object, and several file objects may be in use at any one time.

But there is also a kind of object known as a service, of which there is always only one copy (though there may be many services running at any one time). Each time a client wants to access the functionality provided by a service, they talk to the same instance of that service. When a user looks up a phone number in a company database, for example, probably that database is being represented by an "object" that is the same for all co-workers. If it weren't, the application would need to keep two copies of a large database in memory, for one thing, and there might also be inconsistencies between records as the copies diverged.

Providing this single point of access to functionality is what the singleton design pattern is for, and what services do in an application (and in a development environment like XPCOM).
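In plain C++, the singleton idea behind services can be sketched as below. This is not how XPCOM hands out services (that is the job of the service manager discussed later); PhoneDirectory and its methods are hypothetical names borrowed from the phone-number example above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical phone directory shared by every client.
class PhoneDirectory
{
  public:
    // Every caller gets the same instance, created on first use.
    static PhoneDirectory& GetService()
    {
      static PhoneDirectory sInstance;
      return sInstance;
    }

    void Add(const std::string& name, const std::string& number)
    {
      mNumbers[name] = number;
    }

    // Returns the stored number, or "" if the name is unknown.
    std::string Lookup(const std::string& name) const
    {
      std::map<std::string, std::string>::const_iterator it = mNumbers.find(name);
      return it == mNumbers.end() ? std::string() : it->second;
    }

  private:
    PhoneDirectory() {}
    PhoneDirectory(const PhoneDirectory&);              // not copyable
    PhoneDirectory& operator=(const PhoneDirectory&);   // not assignable
    std::map<std::string, std::string> mNumbers;
};

// Usage: a record added through one access is visible through another.
bool ServiceDemo()
{
  PhoneDirectory& first = PhoneDirectory::GetService();
  PhoneDirectory& second = PhoneDirectory::GetService();
  assert(&first == &second);
  first.Add("alice", "555-0100");
  return second.Lookup("alice") == "555-0100";
}
```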

In XPCOM, in addition to the component support and management, there are a number of services that help the developer write cross-platform components. These services include a cross-platform file abstraction that provides uniform and powerful access to files, directory services that maintain application- and system-specific locations, memory management to ensure everyone uses the same memory allocator, and an event notification system that allows passing of simple messages. The tutorial will show each of these components and services in use, and the XPCOM API Reference has a complete interface listing of these areas.

XPCOM Types

There are many XPCOM declared types and simple macros that we will use in the following samples. Most of these types are simple mappings. The most common types are described in the following sections.

Method Types

The following are a set of types for ensuring correct calling convention and return type of XPCOM methods.

  • NS_IMETHOD: Method declaration return type. XPCOM method declarations should use this as their return type.
  • NS_IMETHODIMP: Method implementation return type. XPCOM method implementations should use this as their return type.
  • NS_IMETHODIMP_(type): Special case implementation return type. Some methods such as AddRef and Release do not return the default return type. This exception is regrettable, but required for COM compliance.
  • NS_IMPORT: Forces the method to be resolved internally by the shared library.
  • NS_EXPORT: Forces the method to be exported by the shared library.

Reference Counting

These macros manage reference counting.

  • NS_ADDREF: Calls AddRef on an nsISupports object.
  • NS_IF_ADDREF: Same as above but checks for null before calling AddRef.
  • NS_RELEASE: Calls Release on an nsISupports object.
  • NS_IF_RELEASE: Same as above but checks for null before calling Release.

Status Codes

These macros test status codes.

  • NS_FAILED: Returns true if the passed status code was a failure.
  • NS_SUCCEEDED: Returns true if the passed status code was a success.
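The typical calling pattern with these macros is to check each result and pass failures up to the caller. The sketch below imitates that pattern with stand-ins so that it compiles without the XPCOM headers: DemoStatus, DemoFailed, and DemoSucceeded are hypothetical substitutes for nsresult, NS_FAILED, and NS_SUCCEEDED, and they assume the convention that the high bit of the 32-bit code marks a failure.

```cpp
#include <cassert>

// Stand-in for nsresult: a 32-bit value whose high bit marks failure (assumed).
typedef unsigned int DemoStatus;

const DemoStatus DEMO_OK            = 0x00000000u;
const DemoStatus DEMO_ERROR_FAILURE = 0x80004005u;

// Stand-ins for NS_FAILED and NS_SUCCEEDED.
inline bool DemoFailed(DemoStatus rv)    { return (rv & 0x80000000u) != 0; }
inline bool DemoSucceeded(DemoStatus rv) { return !DemoFailed(rv); }

// Hypothetical operation that either succeeds or reports a generic failure.
DemoStatus Step(bool ok)
{
  return ok ? DEMO_OK : DEMO_ERROR_FAILURE;
}

// Run two steps in order; stop and propagate the code if the first one fails.
DemoStatus RunSteps(bool firstOk, bool secondOk)
{
  DemoStatus rv = Step(firstOk);
  if (DemoFailed(rv)) {
    return rv;
  }
  return Step(secondOk);
}

// Usage: only the all-success path reports success.
bool StatusDemo()
{
  assert(DemoSucceeded(RunSteps(true, true)));
  assert(DemoFailed(RunSteps(true, false)));
  return RunSteps(false, true) == DEMO_ERROR_FAILURE;
}
```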

Variable Mappings

  • nsrefcnt: Default reference count type. Maps to a 32-bit integer.
  • nsresult: Default error type. Maps to a 32-bit integer.
  • nsnull: Default null value.

Common XPCOM Error Codes

  • NS_ERROR_NOT_INITIALIZED: Returned when an instance is not initialized.
  • NS_ERROR_ALREADY_INITIALIZED: Returned when an instance is already initialized.
  • NS_ERROR_NOT_IMPLEMENTED: Returned by an unimplemented method.
  • NS_ERROR_NO_INTERFACE: Returned when a given interface is not supported.
  • NS_ERROR_NULL_POINTER: Returned when a pointer that is required to be valid is found to be nsnull.
  • NS_ERROR_FAILURE: Returned when a method fails. Generic error case.
  • NS_ERROR_UNEXPECTED: Returned when an unexpected error occurs.
  • NS_ERROR_OUT_OF_MEMORY: Returned when a memory allocation fails.
  • NS_ERROR_FACTORY_NOT_REGISTERED: Returned when a requested class is not registered.

Copyright (c) 2003 by Doug Turner and Ian Oeschger. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.02 or later. Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder. Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright holder.