Documentation for BiDi Mozilla

This is preliminary documentation of the changes introduced to Mozilla as part of the BiDi support contributed by IBM (a.k.a. IBMBIDI), written by Simon Montagu and posted to the mozilla-layout mailing list. Although it was published in 2001 and may no longer be fully accurate, it helps in understanding the internals of the BiDi code.

Overview of BiDi processing

Bidi text is reordered according to the Unicode Bidi Algorithm (UBA). The implementation is based on IBM's International Components for Unicode (ICU), which was chosen after comparing and testing the available open-source implementations. As far as we could discover, ICU is the only one that is 100% compatible with the UBA, including support for explicit directional controls (LRO, RLO, etc., and their HTML equivalents).

Bidi processing for a given HTML document will only take place if one of the following is true:

  • The page includes a Hebrew or Arabic character or a Hindi digit. This is determined in nsTextFragment::SetTo.
  • The page includes an element with the attribute dir=rtl, either explicitly (nsGenericHTMLElement::MapCommonAttributesInto) or as a consequence of a style rule (MapDeclarationTextInto in nsCSSStyleRule.cpp).
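The two triggers above can be sketched as follows. This is a minimal illustration, not the Mozilla code: `Document`, `NoteTextFragment` and `NoteDirAttribute` are hypothetical names, the character ranges are the Unicode Hebrew and Arabic blocks plus their presentation forms, and "Hindi digit" is assumed to mean the Arabic-Indic digits U+0660 to U+0669, which fall inside the Arabic block.

```cpp
#include <string>

// Hypothetical stand-in for the document object; only the flag matters here.
struct Document {
  bool bidiEnabled = false;
  void EnableBidi() { bidiEnabled = true; }
};

// True if the UTF-16 code unit lies in a Hebrew or Arabic range.
// Assumption: these ranges approximate the check described in the text.
bool IsBidiTriggerChar(char16_t ch) {
  return (ch >= 0x0590 && ch <= 0x05FF) ||  // Hebrew block
         (ch >= 0x0600 && ch <= 0x06FF) ||  // Arabic block, incl. U+0660..U+0669
         (ch >= 0xFB1D && ch <= 0xFDFF) ||  // Hebrew and Arabic presentation forms
         (ch >= 0xFE70 && ch <= 0xFEFC);    // Arabic Presentation Forms-B
}

// Called when text is stored: one trigger character enables Bidi processing.
void NoteTextFragment(Document& doc, const std::u16string& text) {
  for (char16_t ch : text) {
    if (IsBidiTriggerChar(ch)) {
      doc.EnableBidi();
      return;
    }
  }
}

// Called when a dir value is mapped from an attribute or a style rule.
// Simplification: only the exact lowercase value "rtl" is recognised.
void NoteDirAttribute(Document& doc, const std::string& dir) {
  if (dir == "rtl") {
    doc.EnableBidi();
  }
}
```

The flag is only ever switched on, so a document that has seen a single right-to-left character stays Bidi-enabled in this sketch.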

All these cases use nsDocument::EnableBidi to set the flag mBidiEnabled. In a Bidi-enabled document, the following things happen:

  • During a reflow, nsBidiPresUtils::Resolve is called. This method uses the UBA to determine the directional properties of the text and to reorder frames if necessary. Where needed, text frames are split so that all the text within any one frame has the same directionality. FrameManager::SetFrameProperty is used to set the following flags and pointers (for terminology see the specification of the UBA):
    • embeddingLevel: the embedding level of the frame.
    • textClass: the text class of the frame.
    • baseLevel: the base level (direction) of the paragraph.
    • nextBidi: when a frame has been split, this points to the next frame (in logical order). It is an nsContinuingTextFrame.
  • "Reordering" of frames is accomplished by setting the appropriate frame coordinates. The order of the frames in the content model is not affected, so frames that are adjacent in the content model can be far apart visually. A new frame iterator, nsVisualIterator (in nsFrameTraversal.cpp), provides visual frame navigation.
  • Details of rendering are dependent on user preferences and system capabilities. Where the system is capable of tasks such as reversing and shaping text, symmetric swapping, numeric translation, etc., no special text rendering is needed, though there may be a call to a native API to set the base text direction (for example SetTextAlign on Windows). For systems without Bidi capabilities, the methods in nsIUBidiUtils are used.
    Note that we are not affected by buggy Bidi implementations on specific platforms, since the platform never sees a text fragment with mixed directionality, and is not expected to do anything more complicated than displaying left-to-right text from left to right or right-to-left text from right to left.
  • In some circumstances, even on a platform with Bidi capability, the layout code has to reverse text fragments or to allow for the fact that they are displayed in reverse. In general, this happens whenever we are dealing with less than a whole frame. Examples of this are in nsTextFrame::PaintTextSlowly; nsTextFrame::PaintUnicodeText when a selection is displayed; nsTextFrame::GetPosition; nsTextFrame::GetPointFromOffset.
    Text in Visual mode must also be reversed before display on a Bidi platform.
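
The splitting and reordering steps described above can be illustrated with plain arrays instead of frames. The sketch below assumes the embedding level of every character has already been resolved; `LevelRun`, `SplitIntoLevelRuns` and `VisualOrder` are hypothetical names, not part of nsBidiPresUtils. The first function groups characters into runs of a single level (the analogue of splitting text frames), and the second applies the reversal rule of the UBA (rule L2) to obtain the visual order of the logical indices.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// A maximal stretch of text whose characters share one embedding level.
struct LevelRun {
  std::size_t start;
  std::size_t length;
  std::uint8_t level;
};

// Split logical text into runs so that each run has a single level.
std::vector<LevelRun> SplitIntoLevelRuns(const std::vector<std::uint8_t>& levels) {
  std::vector<LevelRun> runs;
  for (std::size_t i = 0; i < levels.size(); ++i) {
    if (!runs.empty() && runs.back().level == levels[i]) {
      ++runs.back().length;
    } else {
      runs.push_back({i, 1, levels[i]});
    }
  }
  return runs;
}

// UBA rule L2: from the highest level down to the lowest odd level, reverse
// every contiguous sequence of positions whose level is at least that level.
// Returns the logical index shown at each visual position.
std::vector<std::size_t> VisualOrder(const std::vector<std::uint8_t>& levels) {
  std::vector<std::size_t> order(levels.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  int highest = -1;
  int lowestOdd = -1;
  for (std::uint8_t level : levels) {
    highest = std::max<int>(highest, level);
    if (level % 2 == 1 && (lowestOdd < 0 || level < lowestOdd)) {
      lowestOdd = level;
    }
  }
  if (lowestOdd < 0) {
    return order;  // No right-to-left level: visual order equals logical order.
  }
  for (int level = highest; level >= lowestOdd; --level) {
    std::size_t i = 0;
    while (i < levels.size()) {
      if (levels[i] < level) {
        ++i;
        continue;
      }
      std::size_t end = i;
      while (end < levels.size() && levels[end] >= level) {
        ++end;
      }
      std::reverse(order.begin() + i, order.begin() + end);
      i = end;
    }
  }
  return order;
}
```

For levels 0,0,0,0,1,1,1,0,0,0,0 (a left-to-right paragraph with three right-to-left characters in the middle), the middle run is reversed while the logical order of the underlying content is untouched, which mirrors the way reordering changes only frame coordinates.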

Text fields and Composer

The specification of the Bidi changes to Composer was posted in the editor and i18n newsgroups, and responses there were taken into account. The implementation is mostly in layout code, especially in nsSelection.cpp and nsCaret.cpp.

Other BiDi functionality

  • Clipboard: based on Bidi Options in Preferences, the Text Mode of the clipboard may be "Logical", "Visual" or "As Source".
    • In "As Source" mode, the text copied into the clipboard is exactly the same (from a Bidi point of view) as the original source. The text pasted from the clipboard (to the composer or to an edit field) is pasted as is.
    • In "Visual" mode, the text copied into the clipboard is exactly the displayed text. The text pasted from the clipboard is converted (if needed) so that Mozilla displays it (from a Bidi point of view) as it would be displayed by a visual clipboard viewer.
    • In "Logical" mode, the text copied into the clipboard is converted (if needed) so that a Logical application will display it (from a Bidi point of view) as it is displayed by Mozilla. Text pasted from the clipboard is treated exactly as if it came from a Logical source.
  • Form controls: based on Bidi Options in Preferences, the text mode of form controls may be "Logical", "Visual" or "Like Containing Document". We have also tested behaviour of all controls with dir=rtl and added support where necessary.
  • Some support was added for alignment in tables and lists, along with fixes for problems with different combinations of dir and align.
  • Improvements were made to lists with Hebrew and Arabic list-style-type.
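
One way to read the three clipboard text modes described above is as a small decision table for the copy direction. The sketch below is an interpretation under stated assumptions, not the actual implementation: the enum and function names are hypothetical, and it assumes that the source text is stored in either logical or visual order.

```cpp
// Order in which the source text is stored (assumption: one of these two).
enum class TextMode { Logical, Visual };

// The clipboard text mode chosen in Bidi Options.
enum class ClipboardMode { Logical, Visual, AsSource };

// What has to happen to the text on its way into the clipboard.
enum class Conversion { None, LogicalToVisual, VisualToLogical };

// Decide which conversion, if any, a copy operation needs.
Conversion ConversionOnCopy(TextMode source, ClipboardMode clipboard) {
  switch (clipboard) {
    case ClipboardMode::AsSource:
      // The clipboard receives the source text unchanged.
      return Conversion::None;
    case ClipboardMode::Visual:
      // The clipboard receives the text as displayed.
      return source == TextMode::Logical ? Conversion::LogicalToVisual
                                         : Conversion::None;
    case ClipboardMode::Logical:
      // The clipboard receives text a Logical application displays correctly.
      return source == TextMode::Visual ? Conversion::VisualToLogical
                                        : Conversion::None;
  }
  return Conversion::None;
}
```

Under this reading, "converted (if needed)" corresponds to the cases where the order of the source text and the clipboard mode differ.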

Summary of New Classes

Each entry below gives the class name, followed by its XPCOM interface (if applicable), its implementation, and comments.

nsIBidi
  • XPCOM interface: intl\unicharutil\public\nsIBidi.h
  • Implementation: intl\unicharutil\src\nsBidiImp.cpp
  • Comments: Implementation of the Unicode Bidi algorithm.

nsIUBidiUtils
  • XPCOM interface: intl\unicharutil\public\nsIUBidiUtils.h
  • Implementation: intl\unicharutil\src\nsBidiUtilsImp.cpp
  • Comments: Utilities for Bidi processing, including:
    • Character classification
    • Symmetric swapping
    • Reordering
    • Shaping
    • Numeric translation
    • Conversion to/from presentation forms

nsBidiPresUtils
  • Implementation: layout/base/nsBidiPresUtils.cpp
  • Comments: Utilities for the layout engine, including:
    • Resolve frames by Bidi level
    • Split frames
    • Reorder frames
    • Format Bidi text
    • Support for deletion and insertion of frames by editor

nsBidiTextFrame
  • Implementation: layout/generic/nsBidiFrames.cpp
  • Comments: Subclass of nsFrame with an additional method, SetOffsets, which adjusts mContentOffset and mContentLength during Bidi processing.

nsDirectionalFrame
  • Implementation: layout/generic/nsBidiFrames.cpp
  • Comments: Subclass of nsFrame. This is a special frame which represents a Bidi control. It is created when resolving text containing a Unicode Bidi control character, a BDO tag, or right-to-left alignment caused by a dir attribute or CSS.

nsIBidiKeyboard
  • XPCOM interface: widget/public/nsIBidiKeyboard.idl
  • Implementation: widget/src/%platform%/nsBidiKeyboard.cpp
  • Comments: Sets and queries the directionality of the current keyboard language.
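
Two of the utilities listed for nsIUBidiUtils, symmetric swapping and numeric translation, are simple enough to sketch. The functions below are illustrative only and their names are hypothetical; they do not reproduce the nsIUBidiUtils interface. Symmetric swapping is shown for a handful of ASCII pairs that have mirrored counterparts in Unicode, and numeric translation is shown as a mapping from European digits to the Arabic-Indic digits U+0660 to U+0669.

```cpp
#include <string>

// Return the mirrored counterpart of a character in right-to-left text.
// Only a few ASCII pairs are covered in this sketch.
char16_t SymmetricSwap(char16_t ch) {
  switch (ch) {
    case u'(': return u')';
    case u')': return u'(';
    case u'[': return u']';
    case u']': return u'[';
    case u'{': return u'}';
    case u'}': return u'{';
    case u'<': return u'>';
    case u'>': return u'<';
    default:   return ch;
  }
}

// Map a European digit to the corresponding Arabic-Indic digit.
// Any other character is returned unchanged.
char16_t ToArabicIndicDigit(char16_t ch) {
  if (ch >= u'0' && ch <= u'9') {
    return static_cast<char16_t>(0x0660 + (ch - u'0'));
  }
  return ch;
}

// Apply the digit mapping to every character of a string.
std::u16string TranslateDigits(const std::u16string& text) {
  std::u16string result;
  result.reserve(text.size());
  for (char16_t ch : text) {
    result.push_back(ToArabicIndicDigit(ch));
  }
  return result;
}
```

A real implementation would depend on user preferences and on whether the platform already performs these tasks, as noted in the overview.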