Next: Traversals With Updates Up: Performance Evaluation Previous: Miss Penalty

4.5 Overall Performance

The previous sections showed that HAC achieves low miss rates, hit times and miss penalties. This section shows the effect on overall performance by comparing elapsed times (not including commit time) for HAC and FPC. We are not able to compare HAC with QuickStore directly because QuickStore runs on different equipment. However, our results show that Thor-1 with HAC will outperform QuickStore on realistic workloads, as we argue below.

We present results for both the base version of HAC, which performs replacement in the background while waiting for replies to fetch requests, and HAC-FG, a version of HAC modified to perform replacement in the foreground. In all our experiments, the time to perform replacement was lower than the idle time during fetches and, therefore, HAC had zero replacement overhead. HAC-FG corresponds to the worst case replacement overhead, i.e., when there is no idle time between fetches because of multithreading or multiprogramming in the client machine.

We also present results for two versions of FPC that bound the range of performance we would expect from a fast page-caching system: the base version and FPC-NO, which is an unrealistic system that incurs no overheads. The elapsed times of FPC-NO were obtained by subtracting the replacement and conversion overheads from the elapsed time of the base version. Both versions use a previously collected execution trace to implement perfect LRU without any overhead.

We claim that any performance gains HAC or HAC-FG achieve relative to FPC-NO, when running traversals with equal or lower clustering quality than T1, would be even higher relative to QuickStore. This is true because FPC-NO has low hit time (as low as HAC's), no replacement and conversion overheads, and lower miss rates than QuickStore for traversals with equal or lower clustering quality than T1. In the analysis that follows, we concentrate on the worst-case comparison for HAC - HAC-FG vs. FPC-NO; the gains of HAC relative to a more realistic page-caching system would be higher.

 
Figure 12: Elapsed time, Hot traversals, Medium database


Figure 12 shows elapsed times we measured running hot T6, T1-, T1, and T1+ medium traversals. The performance of HAC-FG is worse than the performance of FPC-NO for the traversal T1+, because the miss rates of both systems are almost identical and FPC-NO has no overheads. However, the lines for the two versions of HAC are contained within the range defined by the two FPC lines. Therefore, the performance of HAC should be similar to the performance of realistic implementations of FPC even for this worst case.

For more realistic traversals, when the cache is very large, the performance of HAC-FG and FPC-NO is similar because the cache can hold all the pages touched by the traversal and the hit times of both systems are similar. When the cache is very small, FPC-NO can outperform HAC-FG because the miss rates of the two systems are similar and FPC-NO has no overheads, but this region is not interesting because both systems are thrashing. The gains of HAC-FG relative to FPC-NO are substantial in the middle region because it has much lower miss rates. The maximum performance difference between HAC-FG and FPC-NO occurs for the minimum cache size at which all the objects used by the traversal fit in the client cache; HAC-FG performs up to approximately 2400 times faster than FPC-NO in T6, 23 times faster in T1-, and 10 times faster in T1.

  
Figure 13: Elapsed time, Dynamic traversal

Figure 13 shows the elapsed times we measured for the dynamic traversal that executes 80% of the object accesses in T1- operations and 20% in T1 operations. HAC-FG performs up to 61% faster than FPC-NO in this workload.

We also compared the performance of Thor-1 with Thor-0, our previous implementation, which had been shown to outperform all other systems as long as the working set of a traversal fit in the client cache [LAC+96]. We found that HAC enabled Thor-1 to outperform Thor-0 on all workloads, and to do substantially better on traversals where the working set did not fit in the client cache.

Next: Traversals With Updates Up: Performance Evaluation Previous: Miss Penalty


Miguel Castro, Atul Adya, Barbara Liskov, and Andrew Myers

Copyright ©1997 by the Association for Computing Machinery