
1 Introduction

In distributed persistent storage systems, servers provide persistent storage for information accessed by applications running at clients [BOS91][LLOW91][WD94][C+89][LAC+96]. These systems cache recently used information at client machines to provide low access latency and good scalability.

This paper presents a new hybrid adaptive caching technique, HAC, which combines page and object caching to reduce the miss rate in client caches dramatically. It improves performance over earlier approaches by more than an order of magnitude on common workloads. HAC was developed for use in a persistent object store but could be applied more generally, provided information about object boundaries is available. For example, it could be used to manage a cache of file system data, if an application supplied information about the locations in a file that correspond to object boundaries. HAC could also be used within a server cache.

In persistent object systems, objects are clustered in fixed-size pages on disk, and pages are much larger than most objects [Sta84]. Most systems manage the client cache using page caching [SKW92][WD94][LLOW91]: when a miss occurs, the client evicts a page from its cache and fetches the missing object's page from the server. Page caching achieves low miss penalties because it is simple to fetch and replace fixed-size units; it can also achieve low miss rates provided clustering of objects into pages is good. However, it is not possible to have good clustering for all application access patterns [Day95][CK89][TN92]. Furthermore, access patterns may evolve over time, and reclustering will lag behind because effective clustering algorithms are very expensive [TN92] and are performed infrequently. Therefore, pages contain both hot objects, which are likely to be used by an application in the near future, and cold objects, which are not likely to be used soon. Bad clustering, i.e., a low fraction of hot objects per page, causes page caching to waste client cache space on cold objects that happen to reside in the same pages as hot objects.
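As a point of reference, the page-caching policy described above amounts to managing a fixed number of page-sized frames with a replacement rule such as LRU. The sketch below is illustrative only; the class and member names are our own and do not describe Thor's interface. Its key property is the one discussed above: a miss brings in, and an eviction discards, an entire page, so cold objects clustered with hot ones occupy cache space anyway.

    // Minimal sketch of a page-caching client (hypothetical names, not Thor's API).
    #include <cstdint>
    #include <list>
    #include <unordered_map>
    #include <vector>

    using PageId = uint32_t;
    struct Page { std::vector<char> bytes; };          // fixed-size page contents

    class PageCache {
      size_t capacity_;                                 // maximum number of frames
      std::list<PageId> lru_;                           // front = most recently used
      std::unordered_map<PageId,
          std::pair<Page, std::list<PageId>::iterator>> frames_;

     public:
      explicit PageCache(size_t capacity) : capacity_(capacity) {}

      // Returns the cached page, fetching (and possibly evicting) on a miss.
      Page& access(PageId id) {
        auto it = frames_.find(id);
        if (it != frames_.end()) {                      // hit: move to LRU front
          lru_.splice(lru_.begin(), lru_, it->second.second);
          return it->second.first;
        }
        if (frames_.size() == capacity_) {              // miss: evict the whole LRU page,
          frames_.erase(lru_.back());                   // including any hot objects in it
          lru_.pop_back();
        }
        lru_.push_front(id);
        auto& slot = frames_[id] = {fetchFromServer(id), lru_.begin()};
        return slot.first;
      }

     private:
      Page fetchFromServer(PageId) { return Page{}; }   // placeholder for an RPC
    };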

Object-caching systems [KGBW90][WD92][KK90][C+94b][LAC+96][D+90][Ont92] allow clients to cache hot objects without caching their containing disk pages. Thus, object caching can achieve lower miss rates than page caching when clustering is bad. However, object caching has two problems: objects are variable-sized units, which leads to storage fragmentation, and there are many more objects than pages, which leads to high overhead for bookkeeping and for maintaining per-object usage statistics.
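The two costs of object caching can be seen in a sketch of the per-object state such a system must keep. The layout below is hypothetical and not taken from any particular system; it simply makes concrete that every cached object carries its own index entry and usage statistics, and that variable-sized payloads are allocated and freed individually, which is where fragmentation arises.

    // Illustrative per-object cache state (hypothetical layout).
    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    using Oid = uint64_t;

    struct CachedObject {
      std::vector<char> data;    // variable-sized payload: freeing it leaves holes
                                 // that a fixed-size page frame would not
      uint64_t lastUse;          // per-object recency stamp
      uint32_t useCount;         // per-object frequency counter
    };

    // One table entry per object rather than per page: for small objects the
    // index and statistics can rival the size of the data they describe.
    std::unordered_map<Oid, CachedObject> objectCache;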

Dual buffering [KK94] partitions the client cache statically into a page cache and an object cache. This technique has been shown to achieve lower miss rates than both page caching and object caching when the fraction of space dedicated to the object cache is manually tuned to match the characteristics of each application. Such tuning is not feasible for real applications; furthermore, these systems do not solve the problems of storage fragmentation and high overheads in object caching systems.
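The static nature of this split is the crux of the tuning problem. A minimal sketch, with invented names, of how such a configuration looks: the object-cache fraction is fixed when the cache is created, so the right value depends on each application's clustering and must be chosen by hand.

    // Sketch of a statically partitioned (dual-buffering) cache configuration.
    #include <cstddef>

    struct DualBufferConfig {
      size_t totalBytes;
      double objectFraction;     // manually tuned per workload, e.g. 0.3
    };

    struct DualBufferCache {
      size_t pageBytes;          // fixed page-cache partition
      size_t objectBytes;        // fixed object-cache partition
      explicit DualBufferCache(const DualBufferConfig& c)
          : pageBytes(static_cast<size_t>(c.totalBytes * (1.0 - c.objectFraction))),
            objectBytes(c.totalBytes - pageBytes) {}
    };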

HAC is the first technique to provide an adequate solution to all these problems. It is a hybrid between page and object caching that combines the virtues of each - low overheads and low miss rates - while avoiding their problems. Furthermore, HAC adaptively partitions the cache between objects and pages based on current application behavior: pages in which locality is high remain intact, while for pages in which locality is poor, only hot objects are retained. To our knowledge, HAC is the first implementation of an adaptive caching scheme; O'Toole and Shrira [OS95] present a simulation study of one, but they focus on avoiding reads to install modifications at servers and ignore storage management issues.

HAC partitions the client cache into page frames and fetches entire pages from the server. To make room for an incoming page, HAC does the following:

- it selects some page frames for compaction,
- discards the cold objects in those frames, and
- compacts the surviving hot objects to free one of the frames.

The approach is illustrated in Figure 1. The policy for selecting page frames to compact strives both to keep overhead low and to discard objects that are unlikely to be reused. HAC maintains usage statistics on a per-object basis using a novel technique with low space and time overheads; the technique combines recency and frequency information to make good replacement decisions.

Figure 1: Compaction of pages by HAC
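The compaction step can be sketched as follows. This is a highly simplified illustration with hypothetical data structures: the real system uses compact per-object usage statistics and a low-overhead frame-selection policy described later in the paper, and it tracks free space in frames rather than using the placeholder placement shown here.

    // Simplified sketch of HAC-style compaction (hypothetical data structures).
    #include <cstdint>
    #include <utility>
    #include <vector>

    struct Obj {
      std::vector<char> data;
      uint8_t usage;              // small counter mixing recency and frequency
    };

    struct Frame {
      std::vector<Obj> objects;   // either an intact page or retained hot objects
    };

    // Free one frame for an incoming page: pick a frame whose objects are
    // least useful, drop its cold objects, and move the hot survivors into
    // another frame that still has room.
    size_t makeRoom(std::vector<Frame>& frames, uint8_t hotThreshold) {
      // Victim = frame with the smallest total usage (a stand-in for HAC's policy).
      size_t victim = 0;
      long best = -1;
      for (size_t i = 0; i < frames.size(); ++i) {
        long total = 0;
        for (const Obj& o : frames[i].objects) total += o.usage;
        if (best < 0 || total < best) { best = total; victim = i; }
      }

      // Discard cold objects; compact hot ones into another frame
      // (placeholder placement: a real implementation tracks free space).
      if (frames.size() > 1) {
        size_t dest = (victim + 1) % frames.size();
        for (Obj& o : frames[victim].objects)
          if (o.usage >= hotThreshold)
            frames[dest].objects.push_back(std::move(o));
      }
      frames[victim].objects.clear();                   // frame is now free
      return victim;                                    // reuse it for the fetched page
    }

Only the retained hot objects continue to consume cache space, so badly clustered pages shrink to their useful contents while well-clustered pages stay intact.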

HAC has been incorporated in Thor-1, a new implementation of the Thor object database [LAC+96]; it retains the server storage management and concurrency control techniques of our previous implementation, Thor-0, but uses HAC for client cache management. Thor-1's high performance also stems from adherence to two design principles: think small, and be lazy. We were careful about the sizes of all data structures, particularly objects; we designed our object format parsimoniously, and the result is better performance in all parts of the system, from the server disk to the client cache. Furthermore, we do work only when absolutely necessary, which often allows us to avoid it entirely. We explain how these principles were applied in our implementation and how they contribute to our performance.

The paper presents results of performance experiments that show the benefits of these techniques. The experiments evaluate their impact on performance across a wide range of cache sizes and workloads. Our results show that HAC consistently outperforms page-based approaches and achieves speedups of more than an order of magnitude on memory-bound workloads with typically achievable clustering. We also show that Thor-1 outperforms the best object storage systems.

The rest of the paper is organized as follows. Section 2 gives an overview of our implementation, and describes the object format at both clients and servers. Section 3 describes HAC. We present performance results in Section 4. Section 5 summarizes the main results in the paper.

