Chapter 2: Technologies used in developing extensions


This document was authored by Hiroshi Shimoda of Clear Code Inc. and was originally published in Japanese for the Firefox Developers Conference Summer 2007. Shimoda-san is a co-author of Firefox 3 Hacks (O'Reilly Japan, 2008).

Before we dive into a thorough explanation, we'll quickly introduce the technologies used to develop Firefox extensions. We will also look at the minimum knowledge you'll need to have in order to develop for Firefox.

Technologies used to develop Firefox extensions

Firefox and its extensions are both based on and developed with technologies widely used on the web. Their structure is similar to that of the dynamic HTML used on some web pages, or the HTML applications used on Windows. If you have experience developing with dynamic HTML, you'll probably find it relatively easy to pick up the knowledge you'll need to develop Firefox extensions.

The role of each technology

Firefox is largely built using four technologies: XUL, CSS, JavaScript, and XPCOM. Extensions are also built using these four technologies.

Figure 1: role of each technology in Firefox

In addition to these technologies, extension development will require you to learn about how to confer privileges to overcome security restrictions on code that you write, and how to embed your code into the Firefox UI. These issues are discussed in Chapter 5.

The minimum knowledge required

In the interest of brevity, I will omit explanations of widely understood technologies, and focus instead on introducing the new technologies you will need to understand in order to develop for Firefox. I will assume that you have experience developing with dynamic HTML and are familiar with the topics below. For more information on these technologies, please refer to other sources.

XML: A text-based structural language

Extensible Markup Language (XML) is a meta-language for expressing various kinds of data. It was specified in 1998 by the W3C, the organization that sets standards for web-related technologies. It has a number of useful qualities: it is generic, extensible, and easy to check for well-formedness.

Some examples of XML-based markup languages include XHTML, which is HTML redefined on an XML base; SVG, for expressing vector images; and MathML, for expressing mathematical formulas. XUL, which is used in Firefox, is also based on XML.

Listing 1: XML syntax

<elementname someattribute="somevalue">
  content
</elementname>

As shown in Listing 1, XML uses elements, which consist of an opening tag, a closing tag, and content.

Note: Elements that take no content can be expressed in compact form as <elementname/>.

An element can include other elements as well as text in its content, and all information is structured as a tree. As in all trees, elements can have children (elements contained within them) and parents (elements that contain them). Attributes can also be added to opening tags, each with a value.
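The tree structure described above can be sketched in JavaScript with plain objects. This is only an illustration: the serialize function and the name, attributes, and children fields are hypothetical names chosen for this example, not part of any Firefox API, and the sketch does not escape special characters such as < or & the way a real XML serializer must.

```javascript
// Hypothetical sketch: an element is an object with a name, attributes,
// and children; a plain string stands for a text node.
function serialize(node) {
  if (typeof node === 'string') {
    return node; // text content (not escaped in this sketch)
  }
  var attrs = '';
  for (var name in node.attributes) {
    if (Object.prototype.hasOwnProperty.call(node.attributes, name)) {
      attrs += ' ' + name + '="' + node.attributes[name] + '"';
    }
  }
  var children = node.children || [];
  if (children.length === 0) {
    // Elements without content use the compact form.
    return '<' + node.name + attrs + '/>';
  }
  var content = '';
  for (var i = 0; i < children.length; i++) {
    content += serialize(children[i]); // recurse into each child
  }
  return '<' + node.name + attrs + '>' + content + '</' + node.name + '>';
}

// A parent element holding a text node and an empty child element.
var tree = {
  name: 'parent',
  attributes: { someattribute: 'somevalue' },
  children: [
    'some text',
    { name: 'child', attributes: {}, children: [] }
  ]
};

var xml = serialize(tree);
// xml is '<parent someattribute="somevalue">some text<child/></parent>'
```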

As the "extensible" part of XML implies, elements from various XML-based languages such as XHTML and SVG can be interspersed in one another as a means to extend the language. Every element can carry a "namespace URI" identifier, which is unique to each language. As a result, even when XHTML and SVG have elements with the same name, the two can be distinguished. The namespace URI for XHTML is "http://www.w3.org/1999/xhtml"; for SVG it is "http://www.w3.org/2000/svg".
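As a rough sketch of how a namespace URI tells same-named elements apart, the example below compares an XHTML a element with an SVG a element. The objects are plain stand-ins rather than real DOM nodes, and isSameElementType is a hypothetical helper written for this illustration; the property names namespaceURI and localName mirror the ones DOM nodes expose.

```javascript
var XHTML_NS = 'http://www.w3.org/1999/xhtml';
var SVG_NS = 'http://www.w3.org/2000/svg';

// Hypothetical helper: two elements are of the same type only when both
// the namespace URI and the local name match.
function isSameElementType(a, b) {
  return a.namespaceURI === b.namespaceURI && a.localName === b.localName;
}

// Plain objects standing in for element nodes.
var htmlLink = { namespaceURI: XHTML_NS, localName: 'a' };
var anotherHtmlLink = { namespaceURI: XHTML_NS, localName: 'a' };
var svgLink = { namespaceURI: SVG_NS, localName: 'a' };

var sameLanguage = isSameElementType(htmlLink, anotherHtmlLink); // true
var differentLanguage = isSameElementType(htmlLink, svgLink);    // false
```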

CSS: A style language to alter the display of XML documents

Like XML, Cascading Style Sheets (CSS) is a technical specification established by the W3C; it is a style-description language defining the display of data marked up in XML and HTML. As shown in Listing 2, it has an extremely simple syntax. Separating the structure of the data, expressed through HTML or XML, from its display style, expressed through CSS, makes the data easier to reuse than when structural and stylistic markup are both embedded in HTML.

There are three CSS specifications (Level 1 through Level 3), each with progressively more powerful features. The Gecko rendering engine handles nearly all of CSS Level 2 and some of CSS Level 3.

Listing 2: CSS code sample

body {
  color: black;
  background-color: white;
}
p {
  margin-bottom: 1em;
  text-indent: 1em;
}

JavaScript: The world's most misunderstood language

JavaScript is a scripting language first developed in the 1990s as a way to add dynamic features to web pages. Because it was often used at first to display pop-up windows, scroll text across status bars, or otherwise make web pages less useful to users, the language acquired a reputation for having little practical use and lacking in functionality.

Also, because a series of security holes were discovered in JavaScript and the compatible technology JScript, there was an initial reluctance to use JavaScript at all.

Nevertheless, the rise of web services like Google Maps, which used JavaScript and asynchronous communications, created an awareness of a set of technologies nicknamed AJAX (Asynchronous JavaScript and XML). That, together with the advent of a number of libraries that paper over implementation differences between web browsers, has more recently led to a re-evaluation of JavaScript as a programming language.

JavaScript is a prototype-based object-oriented language, and as shown in Listing 3, it also permits class-like definitions built from a constructor function and its prototype. Unlike Java, it is not strictly typed, which makes it extremely flexible and gives it qualities that in some senses could be considered similar to Lisp.

Firefox 3.5 includes a number of extensions to the specification standardized in ECMAScript 3rd Edition, and can use JavaScript 1.7 and JavaScript 1.8.

Listing 3: An example of a class definition in JavaScript

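// Constructor function: called with "new" to create an instance.
function MyClass() {
}
// Members placed on the prototype are shared by every instance.
MyClass.prototype = {
  property1 : true,
  property2 : 'string',
  method : function() {
    alert('Hello, world!');
  }
};
// Create an instance and call the method it inherits from the prototype.
var obj = new MyClass();
obj.method();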
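To make the "prototype-based" part more concrete, the following sketch shows that methods are looked up on the prototype at call time, so a method added to the prototype later is still visible to instances created earlier. Greeter, greet, and farewell are hypothetical names used only for this illustration, and the methods return strings instead of calling alert so that the sketch also runs outside a browser.

```javascript
// Constructor: "name" becomes an own property of each instance.
function Greeter(name) {
  this.name = name;
}

// Shared method, stored once on the prototype.
Greeter.prototype.greet = function() {
  return 'Hello, ' + this.name + '!';
};

var first = new Greeter('world');

// Added after "first" was created; lookup goes through the prototype,
// so the existing instance can still call it.
Greeter.prototype.farewell = function() {
  return 'Goodbye, ' + this.name + '!';
};

var greeting = first.greet();                  // 'Hello, world!'
var farewell = first.farewell();               // 'Goodbye, world!'
var ownsName = first.hasOwnProperty('name');   // true: set in the constructor
var ownsGreet = first.hasOwnProperty('greet'); // false: found on the prototype
```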

DOM: An API for manipulating XML documents

The Document Object Model (DOM) is a technical standard promulgated by the W3C, and is an API for manipulating the contents of XML documents as objects. In earlier dynamic HTML approaches, the typical method was to use the innerHTML property of the HTML element node to dynamically change the contents of the HTML document by manipulating strings, but using the DOM makes it possible to manipulate XML documents in a way that better matches JavaScript's object-oriented nature.

In addition, XUL lacks any equivalent for the innerHTML property, so if it weren’t for the DOM, dynamic processing would be impossible.

There are a number of levels to the DOM with different levels of functionality. Gecko supports nearly all of DOM Level 2 and some of DOM Level 3.

With the DOM, the contents of an XML document are handled as a "DOM tree," a collection of element nodes and other nodes. Listing 4 shows an example that deletes the second child of the element with the "toolbar" id, then appends a new button element to the end of that element as a substitute and sets the button's label attribute.

We do not go into the details of the various APIs in the DOM. To learn more about the DOM, please take a look at the MDC documentation.

Listing 4: An example manipulation using the DOM

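// Look up the toolbar element by its id.
var bar = document.getElementById('toolbar');
// Remove the toolbar's second child node (index 1).
bar.removeChild(bar.childNodes[1]);
// Append a new button element, then set its label attribute.
bar.appendChild(document.createElement('button'));
bar.lastChild.setAttribute('label', 'Hello!');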
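Listing 4 appends the new button at the end of the toolbar. If the goal were instead to put the button in the same position as the removed node, one possible variation is the standard DOM replaceChild method, sketched below. The function name replaceSecondChildWithButton is hypothetical, and the document is passed in as the doc parameter (an assumption made for this example) so that the function can be exercised outside a browser window.

```javascript
// Hypothetical helper: replaces the second child of the element with the
// given id by a new button element, and returns the button.
// "doc" is assumed to be a DOM document object.
function replaceSecondChildWithButton(doc, toolbarId, label) {
  var bar = doc.getElementById(toolbarId);
  if (!bar || bar.childNodes.length < 2) {
    return null; // no such element, or no second child to replace
  }
  var button = doc.createElement('button');
  button.setAttribute('label', label);
  // replaceChild(newNode, oldNode) puts newNode where oldNode was.
  bar.replaceChild(button, bar.childNodes[1]);
  return button;
}
```

Inside a window's script, this could be called as replaceSecondChildWithButton(document, 'toolbar', 'Hello!').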