Introduction


Object-oriented databases (OODBs) provide a simple yet powerful programming model that allows applications to store objects reliably so that they can be used again later and shared with other applications. The database acts as an extension of an object-oriented programming language such as Java, allowing programs access to long-lived objects in a manner analogous to how they manipulate ordinary objects whose lifetime is determined by that of the program.

The objects stored in an OODB may live a long time, and as a result there may be a need to upgrade them, that is, to change their code and storage representation. An upgrade can improve an object's implementation, to make it run faster or to correct an error; extend the object's interface, e.g., by providing it with additional methods; or even change the interface in an incompatible way, so that the object no longer behaves as it used to, e.g., by removing one of its methods or redefining what a method does. Incompatible upgrades are probably uncommon, but they can be important in the face of changing application requirements. Providing a satisfactory way of upgrading objects in an OODB, however, has been a long-standing challenge.

We have developed a novel mechanism for upgrading objects in an OODB. The approach is object-oriented: an upgrade definition describes what to do with each class that is changing, by providing a replacement class and a transform function that is used to initialize the new form of the object using the object's current state. An upgrade is executed by transforming all objects belonging to classes that are being changed.
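To make the shape of an upgrade definition concrete, the following is a minimal Java sketch, not the system's actual API. The names `ClassUpgrade`, `PointV1`, `PointV2`, and `UpgradeDemo` are hypothetical, and the in-memory list stands in for the database. Each entry pairs a class being changed with its replacement class and a transform function; executing the upgrade transforms every object of the old class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical old and new representations of the same logical object.
final class PointV1 {
    final double x, y;                  // Cartesian coordinates
    PointV1(double x, double y) { this.x = x; this.y = y; }
}

final class PointV2 {
    final double radius, angle;         // polar coordinates
    PointV2(double radius, double angle) { this.radius = radius; this.angle = angle; }
}

// One entry of an upgrade definition (assumed structure): the class being
// replaced, its replacement, and the transform function that initializes
// the new form from the object's current state.
final class ClassUpgrade<O, N> {
    final Class<O> oldClass;
    final Class<N> newClass;
    final Function<O, N> transform;

    ClassUpgrade(Class<O> oldClass, Class<N> newClass, Function<O, N> transform) {
        this.oldClass = oldClass;
        this.newClass = newClass;
        this.transform = transform;
    }
}

final class UpgradeDemo {
    // Applies one class upgrade to every object in a toy in-memory "database".
    // Objects of other classes are carried over unchanged.
    static <O, N> List<Object> execute(ClassUpgrade<O, N> cu, List<Object> db) {
        List<Object> out = new ArrayList<>();
        for (Object o : db) {
            out.add(cu.oldClass.isInstance(o) ? cu.transform.apply(cu.oldClass.cast(o)) : o);
        }
        return out;
    }
}
```

This sketch runs the whole upgrade in one pass, which is the simplest way to show what "executing an upgrade" means; it says nothing about when the transforms run in a real system.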

Some systems [1,6] stop application access to the OODB while the upgrade is performed. But such a stop-the-world approach can make the system unavailable to users for potentially long periods of time. The unavailability may not be a serious issue if the OODB is small, but if it is large (e.g., trillions of objects residing at thousands of servers), the time during which the system is unavailable to applications can be very long.

Our system avoids delaying applications by running the upgrade lazily. An object is transformed just before an application accesses it, and therefore applications that run after the upgrade starts never see non-upgraded objects. In spite of being lazy, our system provides good semantics by ensuring that when a transform function executes, it encounters object interfaces that existed when its upgrade started and states that satisfy its object's invariants. These guarantees make it easy for programmers to write transform functions and to reason about their correctness.
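The lazy discipline can be illustrated with a small Java sketch under simplifying assumptions: a single-threaded, in-memory store, with one transform per upgrade applied to every object. `LazyStore`, `Slot`, and the version counter are hypothetical names chosen for illustration and are not taken from the system described here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class LazyStore {
    // A stored object together with the number of upgrades already applied to it.
    private static final class Slot {
        Object value;
        int version;
        Slot(Object value, int version) { this.value = value; this.version = version; }
    }

    private final Map<String, Slot> slots = new HashMap<>();
    private final List<UnaryOperator<Object>> upgrades = new ArrayList<>();
    int transformsRun = 0;   // instrumentation for the example only

    // New objects are created at the current upgrade level.
    void put(String id, Object value) {
        slots.put(id, new Slot(value, upgrades.size()));
    }

    // Installing an upgrade touches no stored object, so applications are not delayed.
    void installUpgrade(UnaryOperator<Object> transform) {
        upgrades.add(transform);
    }

    // Every application access goes through get, which brings the object up to
    // date first. Applications therefore never observe a non-upgraded object.
    Object get(String id) {
        Slot s = slots.get(id);
        while (s.version < upgrades.size()) {
            s.value = upgrades.get(s.version).apply(s.value);
            s.version++;
            transformsRun++;
        }
        return s.value;
    }
}
```

Pending transforms are applied in upgrade order, so an object that has not been touched across several upgrades is brought forward one upgrade at a time on its next access.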

A lazy system might violate the semantics we wish to provide, because the work of doing an upgrade is interleaved with application accesses to stored objects. For example, a delayed transform function of upgrade U might access an object that has been modified by an application transaction that ran after upgrade U started, thus violating our semantics. Also, if the transform function of an object x accesses an object y that has already been transformed, either by the same upgrade or by a later one, y's interface may differ from the one the transform function expects (if the upgrade was incompatible).
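The first hazard can be reproduced in a few lines of Java. This is an illustrative sketch only: `Account`, `CustomerV1`, `CustomerV2`, and `InterferenceDemo` are hypothetical, and the "application transaction" is just a field assignment. The transform function reads a second object, so its result depends on whether it runs before or after the application modifies that object.

```java
// A separate object that the transform function reads.
final class Account {
    int balance;
    Account(int balance) { this.balance = balance; }
}

final class CustomerV1 {
    final Account account;
    CustomerV1(Account account) { this.account = account; }
}

// The new form records the balance observed when the object is transformed.
final class CustomerV2 {
    final Account account;
    final int balanceAtUpgrade;
    CustomerV2(Account account, int balanceAtUpgrade) {
        this.account = account;
        this.balanceAtUpgrade = balanceAtUpgrade;
    }
}

final class InterferenceDemo {
    // Transform function of a hypothetical upgrade U.
    static CustomerV2 transform(CustomerV1 old) {
        return new CustomerV2(old.account, old.account.balance);
    }

    // The transform runs when U starts, before any later application activity.
    static int transformAtUpgradeStart() {
        Account acct = new Account(100);
        CustomerV1 c = new CustomerV1(acct);
        CustomerV2 upgraded = transform(c);
        acct.balance = 0;                   // application runs afterwards
        return upgraded.balanceAtUpgrade;
    }

    // A naive lazy system: the application modifies the account after U
    // started, and only then does the delayed transform run.
    static int naiveDelayedTransform() {
        Account acct = new Account(100);
        CustomerV1 c = new CustomerV1(acct);
        acct.balance = 0;                   // application runs first
        CustomerV2 upgraded = transform(c);
        return upgraded.balanceAtUpgrade;
    }
}
```

The two methods differ only in the order of the transform and the application write, yet they record different balances, which is exactly the kind of divergence the intended semantics rules out.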

Previous systems do not provide a satisfactory solution to these problems. Stop-the-world systems guarantee that applications and later upgrades cannot interfere with transform functions of an upgrade U, but they have difficulty ordering transform functions within the same upgrade. Some systems [8,2,4] avoid problems by severely limiting the expressive power of transform functions, not allowing them to make any method calls. Others (e.g., [1]) make the execution of transform functions order-independent by maintaining two copies of the database during the upgrade. The transform functions initialize the new copy of the database using the old copy; when the upgrade is complete, the new copy replaces the old one. Neither of these approaches is desirable. The loss of expressive power means that many upgrades cannot be expressed using the mechanism, and the two-copy approach consumes huge amounts of space.

We have developed a lazy upgrade mechanism that supports expressive transform functions and avoids database copies. Our approach is efficient, yet it provides good semantics. For many upgrades, our system ensures statically that transform functions run before any objects they access are modified, either by applications or by other transform functions. We also outline solutions for the remaining cases. Our approach is based on the observation that in most cases, a transform function of object x only accesses x and subobjects encapsulated within x. Ownership types provide a way of statically enforcing object encapsulation. We use a variant of ownership types to enforce our mechanism.
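The role of encapsulation can be sketched in Java as follows. This is an approximation under stated assumptions: Java's `private` fields stand in for what ownership types would check statically, and `StackV1`, `StackV2`, and `StackRef` are hypothetical names. Because the backing list is reachable only through its owning stack, an application cannot modify the list without first accessing the stack, and that access triggers the stack's transform. The transform therefore sees the list as it was when the upgrade started.

```java
import java.util.ArrayList;
import java.util.List;

// Old representation: the backing list is private and never leaked, so it
// is encapsulated within the stack.
final class StackV1 {
    private final List<Integer> items = new ArrayList<>();
    void push(int v) { items.add(v); }
    List<Integer> snapshot() { return new ArrayList<>(items); }  // a copy, not the list itself
}

// New representation: additionally caches the sum of the elements.
final class StackV2 {
    private final List<Integer> items;
    private int sum;
    StackV2(List<Integer> items) {
        this.items = items;
        for (int v : items) sum += v;
    }
    void push(int v) { items.add(v); sum += v; }
    int sum() { return sum; }
}

// The handle through which applications reach the stack once the upgrade
// has been installed. Every access transforms the stack first if needed.
final class StackRef {
    private StackV1 old;
    private StackV2 current;
    int sizeSeenByTransform = -1;   // instrumentation for the example only

    StackRef(StackV1 old) { this.old = old; }

    private StackV2 upToDate() {
        if (current == null) {
            // Transform function: reads only the stack and its encapsulated list.
            List<Integer> state = old.snapshot();
            sizeSeenByTransform = state.size();
            current = new StackV2(state);
            old = null;
        }
        return current;
    }

    void push(int v) { upToDate().push(v); }
    int sum() { return upToDate().sum(); }
}
```

In this sketch the first `push` after the upgrade runs the transform before the new element is added, so the transform observes the pre-upgrade contents even though it was delayed.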

We have implemented our approach in the Thor OODB [5,3] and used the implementation to run experiments. Our results show that our infrastructure has low cost, e.g., it has negligible impact on applications that don't use any objects requiring upgrades, which we expect to be the common case, since upgrades are likely to be rare (e.g., no more frequent than once a week or once a day). The results also show that the slowdown when upgrades are needed is small.

More details on the experimental results can be found here.


[1] M. P. Atkinson, M. Dmitriev, C. Hamilton, and T. Printezis. Scalable and Recoverable Implementation of Object Evolution for the PJama 1 Platform. In Persistent Object Systems (POS), September 2000.

[2] J. Banerjee, W. Kim, H. Kim, and H. F. Korth. Semantics and implementation of schema evolution in object-oriented databases. In ACM SIGMOD International Conference on Management of Data, May 1987.

[3] C. Boyapati. JPS: A distributed persistent Java system. SM thesis, Massachusetts Institute of Technology, September 1998.

[4] B. S. Lerner and A. N. Habermann. Beyond schema evolution to database reorganization. In Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), October 1990.

[5] B. Liskov, M. Castro, L. Shrira, and A. Adya. Providing persistent objects in distributed systems. In European Conference on Object-Oriented Programming (ECOOP), June 1999.

[6] Object Design Inc. ObjectStore Advanced C++ API User Guide Release 5.1, 1997.

[7] Objectivity Inc. Objectivity Technical Overview, Version 6.0, 2001.

[8] D. J. Penney and J. Stein. Class modification in the GemStone object-oriented DBMS. In Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), October 1987.

[9] Versant Object Technology. Versant User Manual, 1992.

[10] R. Zicari. A framework for schema updates in an object-oriented database system. In International Conference on Data Engineering (ICDE), April 1991.