Exact Stack Rooting

This guide is intended for SpiderMonkey hackers! Users of the SpiderMonkey API will want to read the GC Rooting Guide instead.

Introduction

This guide explains the basics of interacting with the GC from SpiderMonkey. Since the GC may move GCThings, it very important that SpiderMonkey know about each and every GCPointer in the system. The method you should use to keep the GC up-to-date with this information will vary depending on where the GCPointer is being stored. GCPointers can be stored as a property of a GCThing, on the CHeap, or in the CStack. These three storage regions have different lifetime and overhead characteristics and thus require different management strategies for efficient space and CPU utilization. Since C++ does not reliably distinguish between storage classes, we have to put this burden on the user.

Storing a GCPointer in a GCThing

Storing your GCPointer inside of a GCThing that is already in the LiveSet is the easiest way to keep a GCThing to the LiveSet. GCPointers that reside in GCThings fall into one of these cases: storage in a normal property, storage in a reserved slot, or storage with JS_{Get|Set}Private. Do not use JS_SetPrivate. If you see a GCPointer stored as a private, please file a bug. The private field is frequently used to store things that are not GCPointers, so the GC cannot automatically handle this slot. This means it must be manually traced by the object's owner: this is both fragile and more expensive than using an extra reserved slot, or even just putting a new property on the object.

Class FooClass = {
    "FooPrototype",          /* name */
    JSCLASS_HAS_PRIVATE | JSCLASS_HAS_RESERVED_SLOTS(1), /* flags */
    JS_PropertyStub,         /* addProperty */
    JS_PropertyStub,         /* delProperty */
    ... etc ...
};
RootedObject obj(cx, NewObject(cx, &FooClass, NullPtr(), NullPtr(), &obj));
RootedValue v(cx, OBJECT_TO_VALUE(otherGCThing));
JSObject::SetReservedSlot(obj, 0, v);

Storing a GCPointer on the CHeap

TODO: Tracing and HeapPtr

Storing a GCPointer on the CStack

GCPointers stored on the CStack are special. Unlike other GCPointers, they are not traced as part of the GC. Instead, every pointer on the CStack should be considered part of the RootSet. To that end, we have a special "Rooting" mechanism for stack pointers that is very efficiently able to add and remove GCPointers to the RootedSet. The downside of this efficiency is that GCPointers must be added to and removed from this RootedSet tracking structure in LIFO order. For this reason it is highly recommended that this rooting mechanism only be used on the CStack.

js::Rooted<T>

typedef js::Rooted<T> RootedT; -> RootedObject, RootedString, RootedScript, etc.

Because of the LIFO restriction, using this interface manually is extremely cumbersome. Instead, SpiderMonkey has a convenient suite of C++ RAII classes to do this for you, called js::RootedT:

RootedString str1(cx, JS_ValueToString(cx, val));
RootedString str2(rt, JS_ValueToString(cx, val));

Note 1: C++ insists that an initializing assignment (e.g., the default constructor followed by operator=) must have a copy constructor available, even if it is not used. Since it is never correct to implicitly copy a js::RootedT (more on this in a second) we have deleted the copy constructor from these classes. Therefore, we are forced to force you to initialize with constructor syntax. Sorry.

Note 2: In C++, destructors are required to run in declaration order, nicely maintaining our LIFO property. However, this does not hold true for temporaries: they are undeclared, after all. Some compilers maintain LIFO order, some do not. Incorrect use of js::RootedT as a temporary will crash at run time.

There is also a static constructor method fromMarkedLocation() that creates a Handle from an arbitrary location. This is used to make Handles for things like scope chain objects, which aren't explicitly rooted themselves, but are always reachable from the implicit stack roots. Every use of these should be commented to explain why they are guaranteed to be rooted.

js::Handle<T>

typedef JS::Handle<T> js::HandleT; -> HandleObject, HandleString, HandleScript, etc.

Any single GCThing is likely to get passed through a fairly deep call-stack before getting used. Even though creating a new js::RootedT is cheap, it is silly to repeat this work for every new call frame. To solve this, we use the standard "handle" approach. A js::HandleT is a pointer to a js::RootedT. A js::RootedT will automatically convert to a js::HandleT. By using js::HandleT in the interface instead of direct GCPointers, we ensure that the GCThing is already rooted on some previous stack frame, freeing us from having to worry about rooting the GCThing for the duration of the call.

bool Hello(JSContext *cx, HandleObject foo)
{
    return Hello(cx, foo);
}

.....

void OtherFunction(JSContext *cx)
{
    ...
    RootedObject obj(cx, ...);
    Hello(cx, obj);
    ...
}

A secondary benefit: since all JS functions take js::HandleTs, code that does not root CStack GCPointers correctly will be much less likely to compile.

Warning: js::HandleT does not root objects itself! That is the job of js::RootedT.

Warning: Never give a function a js::HandleT return type: this can easily lead to a js::HandleT outliving its js::RootedT's lifetime.

js::MutableHandle<T>

typedef JS::MutableHandle<T> js::MutableHandleT; -> MutableHandleObject, MutableHandleString, etc.

The purpose of js::MutableHandleT is to allow out-parameters for rooted GCThings. This class solves two problems:

  1. C++ only allows one level of user-defined implicit type conversion. When passed to an out-param, &handle would convert correctly to js::HandleT*, but &local is a compile error.
  2. Even though js::HandleT is indirect, it implicitly behaves like the thing it is pointing to. This makes js::HandleT and js::RootedT behave the same, allowing us to seamlessly weave js::HandleTs into the JS API. The problem is that this requires us to overload operator->. If we have a method taking Handle<Value>*, then out->setObject(foo) will not work as expected.

To solve these problems we have the js::MutableHandleT class. js::MutableHandleT is a js::HandleT with a set method.

bool ReturnFoo(JSContext *cx, MutableHandleString out)
{
    out.set(JS_NewStringCopyZ(cx, "Foo"));
    return bool(out);
}

size_t GetLengthFoo(JSContext *cx)
{
    RootedString s(cx);
    if (ReturnFoo(cx, &s))
        return JS_GetStringLength(s);
    return size_t(-1);
}

All methods in the JS-API that return GCPointers have been changed to this out-param style. This prevents an entire class of bugs where return values are not rooted properly before the next JS-API call.

JS::NullPtr

NULL (or nullptr) does not implicitly convert to a js::HandleT. For places where you want to pass a NULL value into a function taking a js::HandleT, you can construct and pass a new NullPtr instance. You are free to use these as temporaries.

RootedObject obj(cx, JS_NewObject(cx, clasp, NullPtr(), NullPtr());

Common Pitfalls

The C++ type system allows us to eliminate the possibility of most common errors; however, there are still a few things that you can get wrong that the compiler cannot help you with. There is basically never a good reason to do any of these. If you think you do need to do one of these, ask on one of SpiderMonkey's support forums: maybe we've already solved your problem using a different mechanism.

  1. Storing a js::RootedT on the heap. It would be very easy to violate the LIFO constraint if you did this. Use a different rooting mechanism if you store a GCPointer on the heap. There are MOZ_STACK_CLASS annotations that will cause a separate static analysis to fail if these are detected.
  2. Storing a js::HandleT on the heap. It is very easy for the handle to outlive its root if you do this. Again, there are annotations with MOZ_STACK_CLASS.
  3. Returning a js::HandleT from a function. If you do this, a handle may outlive its root.

Definitions

  • GC - Acronym for Garbage Collection: specifically SpiderMonkey's method of automatically managing JavaScript program memory.
  • GCThing - Storage allocated by SpiderMonkey's allocator.
  • GCPointer - A raw pointer type or tagged pointer type that may refer to a GCThing (JSObject *, JSString *, Value, jsid, etc.)
  • GCHeap - The graph of GCThings allocated by SpiderMonkey's allocator. The goal of SpiderMonkey's GC is to partition this heap into regions reachable from the RootSet and not.
  • Root - A GCPointer held live because it is a member of the RootSet.
  • RootSet - This is a set of GCPointers that are de-facto alive, regardless of the GCThing's reachability. GCPointers must be explicitly added to and removed from the RootSet.
  • LiveSet - The set of all GCThings that are reachable from Roots.
  • DeadSet - The set of all GCThings that are not reachable from Roots.
  • CHeap - The C/C++ program heap: e.g., memory allocated by malloc/calloc/realloc.
  • CStack - The C/C++ program stack.
  • rooted - A GCThing in the LiveSet is said to be "Rooted" even if it is not a Root as such, but is only reachable from Roots. This traditional usage is ambiguous and this guide will use live or alive instead of rooted to avoid confusion.

Caveats

Exact Rooting Transition Period

Exact stack rooting is not currently enabled by default: we are still using conservative scanning. Until the system described above has 100% browser coverage, it is unsafe to enable it. In the meantime, the js::RootedT class holds an object live by storing its pointer on the stack so that the conservative scanner will find it. This distinction should not be noticable to the end user.

Performance Tweaking

In aggregate, exact and conservate stack rooting should have similar performance. The performance tradeoffs that these techniques make, however, are quite different. If the extra overhead of exact rooting does end up adding an unacceptable cost to a specific code path that is not mitigated by a faster GC, there are some tricks you can use to get better performance at the cost of more complex code.

  • Move js::RootedT declarations above loops. Modern C++ compilers are not smart enough to do LICM on js::RootedT, so forward declaring a single js::RootedT above the loop and re-using it on every iteration can save some cycles.
  • Raw pointers. If you are 100% sure that there is no way for SpiderMonkey to GC while the pointer is on the stack, this is an option. Note: SpiderMonkey can GC because of any error, GC because of timers, GC because we are low on memory, GC because of environment variables, GC because of cosmic rays, etc. This is not a terribly safe option for embedder code, so only consider this as a very last resort.