Method Signature Refinement

The most general subtype rules allow subtypes to widen argument types and narrow result types [Car84]. This process is sometimes called method signature refinement. Because implementing these rules efficiently is difficult, separately-compiled languages like C++ and Modula-3 have a more restrictive subtype rule: they require that argument and return types stay the same in subtype methods. However, the Theta implementation demonstrates that refined method signatures can be efficiently implemented even in separately-compiled languages.

The subtype rules are difficult to implement because objects have multiple views (see Section 2.3), a necessary feature of systems with fast dispatch and separate compilation, such as Theta or C++. Multiple views are also useful for treating primitive values (e.g., integers) as first-class objects in a system with a universal supertype.

To see why multiple object views make signature refinement problematic, consider two abstract types T and T_s, where T_s is a secondary supertype of T. An offset is required to convert between the T and T_s views. Suppose that the supertype T_s has a method identity that returns a T_s, but that in T, identity is declared to return a T. This hierarchy is sound, but creates practical implementation difficulties. Let x be a variable of declared type T_s that actually refers to a T object. Calling x.identity() yields a T, but the caller expects a T_s - a different view of the object.

The solution to the puzzle is straightforward: the compiler assigns multiple method indices to the method, one for every type with a distinct signature for that method. Let m_s be the method index of identity in T_s. Because T declares a new signature for identity, it introduces a new method index m for calls that use the new signature. The compiler sends method calls to the appropriate method index based on the static type of the receiver object, so the presence of two distinct dispatch entries is never visible to the programmer.

Classes that implement T_s do not need to know about T and only have a m_s entry, so the locality principle is preserved. Classes that implement T are really implementing the m version of the method, so the m entry points to the method code. The m_s entry points to an automatically generated trampoline procedure, whose code bumps any arguments and calls the T version of the method, then bumps the return values appropriately. This trampoline makes the implementation of the method look as if it had the T_s signature. On the MIPS R3000, the bumping of arguments and return values imposes a one cycle overhead per adjusted value, since all adjustments are constant additions. In many cases, a trampoline procedure would be required anyway to bump "self", and the two trampolines can be merged.

Note that a single dispatch vector can contain multiple versions of the same method, if no change of view is required among the types supporting the method, and its signature differs in them. This situation is not a problem, since the method index is chosen according to the static type of the receiver object.

This technique applies to almost any multiple-view system with inheritance, such as the bidirectional object layout. It would also allow efficient implementation of refined method signatures in C++ if the language permitted.

Next: Empirical Results Up: Bidirectional Object Layout for Previous: Dynamic Field Offsets