The Third Cache-Off

The Official Report

October 11, 2000

The Measurement Factory, Inc.
Alex Rousskov, Duane Wessels
polyteam@measurement-factory.com

We held a week-long benchmarking ``cache-off'' for Web proxy caches in the middle of September, 2000. Using the Web Polygraph benchmark, we tested 31 proxy caches from 15 different organizations. In this report, we summarize performance data collected during these tests and analyze the results.

Table of Contents

1. Introduction
    1.1 Timeline
    1.2 Participants
    1.3 Terminology
    1.4 How (not) to Read this Report
    1.5 Where to find more information
2. Executive Summary
    2.1 Unavailable Data
3. The Rules
4. Web Polygraph
    4.1 The Cache-off Workload: PolyMix-3
5. Benchmarking Environment
    5.1 Location
    5.2 Schedule
    5.3 Polygraph Machines
    5.4 Time Synchronization
    5.5 Network Configurations
    5.6 Numbers
6. Test Sequence
    6.1 PolyMix-3
    6.2 Downtime Test
    6.3 MSL Test
7. Performance Details
    7.1 Normalized Throughput
    7.2 Hit Ratio
    7.3 Response Time
    7.4 Downtime test
8. Product Configurations
9. Cache-Off Controversies
    9.1 Software Licensing Costs
    9.2 Preparedness
    9.3 Dummynet Configuration
10. Comments
    10.1 Polyteam Comments
    10.2 Vendor Comments

1. Introduction

The Cache-off addresses a need in the web caching community for high quality, independent verification of product performance. This event represents a snapshot of the caching industry. That is, the results presented here are all taken during a one-week period. The performance of individual products does change over time.

We strive for fairness in our testing. Decisions regarding the rules and testing environment are made with input from cache-off participants. Any company or organization that wants to test the performance of their product(s) is given the opportunity to participate in this event. We describe the actual rules and testing environment later in this report.

1.1 Timeline

Preparations for the third Cache-off began with an organizational meeting in April, 2000. Representatives from 11 companies attended this meeting with the intention to participate in the cache-off. During the meeting we prioritized features for PolyMix-3 and agreed upon the following schedule:

May 15 Feature set and workload specifications finalized
June 19 Preliminary (beta) software made available to the public
July 10 Preregistration deadline
July 24 Final code released
Sept 18 Testing began

1.2 Participants

The following companies and organizations brought products to the third cache-off:

Even though they registered, Cisco decided, at the last minute, not to participate.

1.3 Terminology

Throughout this report we use a few terms that have specific meaning for the cache-off. A vendor is an organization that has a caching product. To simplify the terminology, all commercial, non-profit, virtual, etc. organizations are labeled as ``vendors.'' Some vendors are actually two or more companies working together, under an O.E.M. agreement for example. A vendor is allowed to bring more than one product to the cache-off. Each product or entry that a vendor brings counts as one participant. We have one bench (``harness'') for every participant. PolyMix-3 is the name of the workload that we used for these tests. As with previous tests, this is a standardized workload that we developed in cooperation with vendors. The details of PolyMix-3 are covered later in this report.

1.4 How (not) to Read this Report

We strongly caution against drawing hasty conclusions from these benchmarking results. Our report contains a lot of performance numbers and configuration information; take advantage of it. Since the tested caches differ a lot, it is tempting to draw conclusions about participants based on a single performance graph or a column in a summary table. We believe such conclusions will virtually always be wrong. Here are a few recommendations to prevent misinterpretation of the results.

  1. Always read the Polyteam and Vendor Comments sections.
  2. Compare several performance factors: throughput, response time, hit ratio, etc. Weigh each factor based on your preferences.
  3. Do not overlook pricing information and price/performance analysis.

Our benchmark addresses only the performance aspects of Web cache products. Any given cache will have numerous features that are not addressed here. For example, we think that manageability, reliability, and correctness are very important attributes that should be considered in any buying decisions.

1.5 Where to find more information

The Measurement Factory maintains the Official Results Site, where this report and detailed Polygraph log files from the cache-off are stored. All information at the Official Results Site is freely available.

There are no other official sources of cache-off results.

Documentation, sources, independent test results, discussion mailing lists, and other information related to Polygraph benchmark are available at the Web Polygraph site.

Only major performance measurements are discussed in this report. For more information, consider these sources:

The links above are also useful if you are afraid of being influenced by our interpretation of the results. We still recommend reading the report afterwards (as a ``second opinion'') because not all test rules and performance matters can be clear from the raw data.

2. Executive Summary

The ``Executive Summary'' table below summarizes the performance results. We provide an in-depth analysis of the measurements in section ``Performance Details.''

Product | Total Price (US$) | Peak Tput (req/sec) | Hit RT (sec) | All RT (sec) | Miss RT (sec) | Doc Savings (%) | Time Savings (%) | $1K buys (hit/sec) | $1K buys (req/sec) | Till First Miss (min) | Till First Hit (min) | Cache Age (hours)
Aratech-2000 | 13,400 | 800 | 0.22 | 1.55 | 3.06 | 56.1 | 44.8 | 33 | 60 | 2.8 | 2.8 | 23.7
CinTel-iCache | 10,070 | 850 | 0.05 | 1.39 | 2.84 | 54.7 | 50.3 | 46 | 85 | 3.5 | 3.5 | 18.2
Compaq-b17 | n/a | 300 | 0.16 | 1.45 | 2.77 | 53.5 | 48.4 | n/a | n/a | 3.4 | 3.4 | 10.2
Compaq-C2500 | 71,995 | 2400 | 0.08 | 1.41 | 2.77 | 53.4 | 49.8 | 18 | 33 | 6.6 | 6.6 | 10.0
Dell-100 | 2,991 | 300 | 0.06 | 1.70 | 2.78 | 41.8 | 39.2 | 42 | 100 | 2.8 | 3.4 | 3.9
Dell-200x4 | 64,390 | 3310 | 0.42 | 1.66 | 2.87 | 52.2 | 40.5 | 27 | 51 | 2.3 | 5.9 | 8.4
F5-EDGE-FX | 10,099 | 800 | 0.19 | 1.43 | 2.77 | 54.8 | 49.0 | 43 | 79 | 3.2 | 3.2 | 19.1
IBM-220-1 | 4,227 | 450 | 0.33 | 1.61 | 2.77 | 50.3 | 42.4 | 54 | 105 | 3.1 | 3.7 | 7.4
IBM-220-2 | 5,975 | 500 | 0.04 | 1.75 | 2.77 | 39.6 | 37.6 | 33 | 84 | 2.9 | 3.1 | 3.7
IBM-230 | 25,291 | 1400 | 0.03 | 1.99 | 2.80 | 30.8 | 29.0 | 17 | 55 | n/a | n/a | 2.1
IBM-330 | 7,247 | 350 | 0.04 | 1.84 | 2.78 | 36.3 | 34.4 | 18 | 48 | 3.2 | 4.0 | 3.0
iMimic-1300 | 6,495 | 780 | 0.08 | 1.37 | 2.80 | 55.4 | 51.2 | 67 | 120 | 1.5 | 2.9 | 23.3
iMimic-2400 | 10,095 | 720 | 0.10 | 1.59 | 3.16 | 54.1 | 43.2 | 39 | 71 | 7.1 | 7.1 | 12.8
iMimic-2600 | 20,295 | 1903 | 0.04 | 1.37 | 2.78 | 54.1 | 51.0 | 51 | 94 | 4.3 | 4.4 | 14.5
iMimic-Alpha | 31,295 | 2140 | 0.03 | 1.39 | 2.87 | 54.9 | 50.5 | 38 | 68 | 6.4 | 6.4 | 19.9
Lucent-50 | 10,123 | 200 | 0.03 | 1.36 | 2.79 | 54.4 | 51.4 | 11 | 20 | 1.3 | 1.3 | 24.0
Lucent-100 | 15,123 | 550 | 0.05 | 1.43 | 2.79 | 52.2 | 48.9 | 19 | 36 | 1.4 | 1.5 | 11.7
Lucent-100z | 20,123 | 983 | 0.09 | 1.51 | 2.90 | 52.0 | 46.1 | 25 | 49 | 2.3 | 3.0 | 11.0
Microbits-C | 5,500 | 570 | 0.45 | 1.98 | 2.78 | 36.8 | 29.5 | 38 | 103 | 2.3 | 2.3 | 3.1
Microbits-P | 2,200 | 115 | 0.44 | 1.94 | 2.85 | 40.1 | 30.7 | 21 | 52 | 1.8 | 1.8 | 3.5
Microsoft-1 | 50,481 | 2083 | 0.13 | 1.47 | 2.95 | 55.2 | 47.4 | 23 | 41 | 5.8 | 5.8 | 17.1
Microsoft-2 | 4,982 | 720 | 0.26 | 1.63 | 3.06 | 53.9 | 41.9 | 78 | 145 | 1.4 | 1.7 | 10.1
NAIST-1 | 4,751 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
NAIST-2 | 14,836 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
NetApp-C1105 | 11,150 | 375 | 0.08 | 1.47 | 2.87 | 53.3 | 47.4 | 18 | 34 | 1.9 | 5.6 | 21.9
NetApp-C6100 | 101,700 | 2100 | 0.05 | 1.44 | 2.88 | 53.8 | 48.7 | 11 | 21 | 2.2 | 6.2 | 13.9
Squid-2.4.D4 | 4,108 | 130 | 0.50 | 1.94 | 3.51 | 54.8 | 30.9 | 18 | 32 | 3.7 | 8.1 | 24.4
Stratacache-D | 784 | 120 | 0.60 | 1.64 | 2.79 | 56.0 | 41.6 | n/a | n/a | 2.5 | 2.8 | 21.2
Stratacache-E | 3,380 | 225 | 0.13 | 1.53 | 2.86 | 51.1 | 45.2 | 34 | 67 | n/a | n/a | 8.1
Stratacache-F | 7,160 | 450 | 0.11 | 1.50 | 2.77 | 50.5 | 46.6 | 32 | 63 | 2.4 | 2.4 | 8.3
Swell-1450 | 2,774 | 120 | 0.40 | 1.62 | 2.97 | 55.2 | 42.1 | 24 | 43 | 1.4 | 1.4 | 13.6

Column headings in the above table have links to bar charts that compare the corresponding measurement. Row headings are labels with short names for the tested products. These labels also have links to pages with configuration and performance details for each product.

The ``Total Price'' column is a sum of the list price of all caching hardware, software, and the cost of networking gear (switches, routers) that the vendor used. We include networking equipment costs in the price/performance analysis to adjust for high-end devices that might be used to achieve higher performance and/or aggregate individual caches into clusters. Readers who already have the required network components in place can adjust the price accordingly and make their own performance/price calculations.

NOTE: It is likely that some vendors may lower their prices after seeing their competitors' results. Be sure to read the ``Vendor Comments'' for pricing changes and other important information.

The ``Peak Throughput'' column depicts the highest tested request rate for each product. PolyMix-3 has a number of different phases, each with a different or varying request rate. In the summary table, we report the response rate during the 4-hour top2 phase, when the load is at its peak.

The ``Response Time'' group has three columns. We report the mean response time for cache hits and misses separately to emphasize performance differences on the two most important request paths. The ``All'' column depicts mean response time for all request classes.

The ``Doc Savings'' column shows the document hit ratio: the percentage of requests that the product served as cache hits.

The ``Time Savings'' column shows the mean response time improvement compared to a no-proxy case.

The ``$1,000 can buy'' columns show performance/price ratios. Two performance measurements are used: hit rate, i.e., the number of hits per second (the ``hit/sec'' column), and request rate (the ``req/sec'' column). Both measurements are normalized by total product price in thousands of dollars. In other words, the data answers the question: ``how much hit rate or throughput can one thousand dollars buy?'' Some participants feel that hit performance/price is a more important measurement than overall throughput. For example, a product with poor hit ratio may still score well on overall throughput. On the other hand, the hit rate measurement can be misleading because the hit ratio you achieve on a production system may be significantly different than for these tests.
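
As a worked example, take the Aratech-2000 row of the table: 800 req/sec at a total price of $13,400 gives 800 / 13.4 = 60 req/sec per $1,000; with its 56.1% document hit ratio, the product delivers about 800 x 0.561 = 449 hits/sec, or 449 / 13.4 = 33 hits/sec per $1,000, matching the table entries.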

The ``Minutes Till First'' columns contain the results of the downtime test. Here we report how long it takes the product to serve a cache miss and a cache hit after suffering a power outage.

The ``Cache Age'' column estimates the cache capacity in terms of hours of peak fill (i.e., cachable miss) traffic. For example, a reading of 10 means that, at the peak request rate, the cache becomes full after 10 hours, and must begin replacing objects. We believe that, in a production environment, caches should be large enough to hold 2-3 days worth of traffic. Unfortunately, in this cache-off environment, products can get away with a cache age of 4-5 hours -- just enough to store the working set.

All published performance tests finished with less than 0.1% of failed transactions. Note that the rules disqualify a run with more than 3.0% of errors.

2.1 Unavailable Data

You will notice that some of the table entries are filled with ``n/a'' to indicate unavailable data.

Compaq was unwilling to provide pricing for their Compaq-b17 entry. Thus, we cannot calculate performance/price metrics. Compaq informed us that they changed their mind and do not intend to offer this product for sale within the three months required by our rules.

The IBM-230 entry was unable to complete the downtime test. The system requires manual intervention to boot up. IBM informed us that the product is designed to never lose power. It has triple-redundant power supplies.

The NAIST entries did not complete any of the tests successfully.

Performance/price numbers are missing from the Stratacache-D entry because of the pricing controversy described in detail later in this report. The given price for the Stratacache Dart is with a ten-user license. The product was tested at a throughput (120 req/sec) that is much higher than ten users can realistically generate. For those who are interested, the missing performance/price numbers can easily be calculated from the price, throughput, and hit ratio measurements, which are available for the Stratacache Dart.

The Stratacache-E entry was unable to complete the downtime test. The system's BIOS requires manual intervention to boot.

3. The Rules

The majority of the rules were defined and agreed upon in conjunction with participants during the April organizational meeting in Boulder, Colorado. Most of the rules are the same as in previous testing events. The core set of rules is available in the rules document. Here, we highlight some of the more interesting and controversial provisions.

Product Availability

Tested products must be available for sale to the public at the prices given in this report within three months after the end of testing. This rule is difficult, if not impossible, to enforce. Nonetheless, if you discover that one of the products described here is not being offered, please let us know.

TCP Maximum Segment Lifetime

Tested products must use a TCP MSL value of at least 30 seconds. A lower setting can improve performance by recycling recently-used TCP ports at a higher rate.

Minimum Hit Ratio

At least 25% of responses must be cache hits during the top2 phase. Some products may be able to trade lower hit ratio for higher throughput. PolyMix-3 results with document hit ratio less than 25% are disqualified.

Entry Limits

There is no limit on the number of products that a single vendor can bring to be tested.

Publication of Results after the Cache-off

Companies that participate in the cache-off can publish new results two months after this report is released. This rule is in place to prevent a company from testing a ``token'' product at the cache-off and then publishing results for different products right after finding out how their competition performs.

Non-participating companies must wait three months. The rationale for this rule is similar. Some people feel that a company has an advantage if they can skip the cache-off, learn about their competition, and then publish a result from the same test shortly after.

No new PolyMix-3 results can be published in the two months before the next cache-off. Since the PolyMix workloads become increasingly difficult with time, we try to prevent someone from publishing an old, easier result around the same time that other companies are publishing results from new, harder workloads.

Referencing Cache-off Results

Any work that is derived from, or uses any of the cache-off results, or this report, must include the following reference to our official site:

A. Rousskov and D. Wessels, The Third Cache-off. Raw data and independent analysis at <http://www.measurement-factory.com/results/>.

4. Web Polygraph

Web Polygraph is a high-performance proxy benchmark. Polygraph is capable of generating a whole spectrum of Web proxy workloads that either approximate real-world traffic patterns, or are designed to stress a particular proxy component. Developed with the cache-off needs in mind, Polygraph is able to generate complex, high request rate workloads with negligible overhead. Web Polygraph has been successfully used to debug, tune, and benchmark many caching products.

The Polygraph distribution includes two programs: polyclt and polysrv. Poly-client (-server) emits a stream of HTTP requests (responses) with given properties. The requested resources are called objects. URLs generated by Poly-client are built around object identifiers or oids. In short, oids determine many properties of the corresponding response, including response content length and cachability. These properties are usually preserved for a given object. For example, the response for an object with a given oid will have the same content length and cachability status regardless of the number of earlier requests for that object.
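
The following sketch illustrates the oid invariant; the hashing scheme and the object_properties name are our own hypothetical illustration, not Polygraph's implementation. The 300-byte floor, 5 MB ceiling, and 80% cachability figure come from the workload properties described later in this section:

        import hashlib

        def object_properties(oid):
            # Hypothetical derivation: every property is a pure function
            # of the oid alone, so repeated requests always see the same
            # reply size and cachability status.
            digest = hashlib.md5(str(oid).encode()).digest()
            reply_size = 300 + int.from_bytes(digest[:4], "big") % (5 * 2**20)
            cachable = digest[4] < 0.80 * 256   # roughly 80% of objects cachable
            return {"content_length": reply_size, "cachable": cachable}

        # Same oid, same properties, regardless of how often it is requested:
        assert object_properties(42) == object_properties(42)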

As it runs, Polygraph collects and stores many statistics, including: response rate, response time and size histograms, achieved hit ratio, and number of transaction errors. Some measurements are aggregated at five second intervals, while others are aggregated over the duration of the whole phase.

For the cache-off tests, we used version 2.5.4 of Web Polygraph. Web Polygraph is available to anyone at no charge in source code format.

4.1 The Cache-off Workload: PolyMix-3

The PolyMix environment has been modeling the following Web traffic characteristics since PolyMix-2:

These features were added for PolyMix-3:

Still absent from the cache-off workload are:

While the last four features are already supported in a Polygraph environment, they are prohibitively CPU intensive or require further improvement.

As mentioned previously, PolyMix-3 combines the fill and measurement phases into a single workload. The benefit to this approach is that the device under test is more likely to have steady state conditions during the measurement phases. Also, a larger URL working set can now be formed without increasing the duration of a test. Under PolyMix-2, the fill phase was an isolated test. That meant that the measurement phase could not request objects used during the fill phase.

A downside to integrating the fill phase is that it is now difficult to skip the fill phase and go right to measuring. For some products, half of the testing time is spent in the fill phase. The total duration of the test remains similar to the PolyFill-2 plus PolyMix-2 sequence, decreasing for some products.

The following table describes all the important phases in a PolyMix-3 test. Not counting the fill phase, the test takes about 12 hours. Filling the cache usually takes an additional 3-12 hours, depending on the product.

Phase Name   Duration   Activity
framp 30 min The load is increased from zero to the peak fill rate.
fill variable The cache is filled twice, and the working set size is frozen.
fexit 30 min The load is decreased to 10% of the peak fill rate. At the same time, recurrence is increased from 5% DHR to its maximum level.
inc1 30 min The load is increased to reach its peak level.
top1 4 hours The period of peak ``daily'' load.
dec1 30 min The load steadily goes down, reaching a period of relatively low load.
idle 30 min The ``idle'' period with load level around 10% of the peak request rate.
inc2 30 min The load is increased to reach its peak level again.
top2 4 hours The second period of peak ``daily'' load.
dec2 30 min The load steadily goes down to zero.

The old PolyFill-2 workload used Polygraph's best-effort request submission model, and vendors could choose how many robots to use for the fill. Some participants apparently found that a small number of robots left the disk system in a higher performing state than did a larger number. Now, PolyMix-3 uses the same number of robots during all of its phases, and the participants can specify fill rate directly just as they specify the peak request rate.

One of the rules of PolyMix-3 is that the request rate during the fill phase must not be greater than the peak rate (as used in top1 and top2). Otherwise, participants are free to choose virtually any fill rate they like. Usually, the selected fill rate is at least 50% of the peak request rate. We do not present the fill rate parameters in this report, but they can be derived from the logs.

Most measurements discussed in this report are taken from the top2 phase when the proxy is more likely to be in a steady state.

Reply Sizes

Object reply size distributions are different for different content types (see the table below). Reply sizes range from 300 bytes to 5 MB with an overall mean of about 11 KB and a median of 5 KB. The reply size depends only on the oid. Thus, the same object always has the same reply size, regardless of the number of requests for that object.

Cachable and Uncachable Replies

Polygraph servers mark some of their responses as uncachable. The particular probability varies with content types (see the table below). Overall, the workload results in about 80% of all responses being cachable. The real world cachability varies from location to location. We have chosen 80% as a typical value that is close to many common environments.

A cachable response includes the following HTTP header field:

	Cache-Control: public

An uncachable response includes the following HTTP header fields:

	Cache-Control: private,no-cache
	Pragma: no-cache

Object cachability depends only on the oid. The same oid is always cachable, or always uncachable.

Life-cycle model

Web Polygraph is capable of simulating realistic (complex) object expiration and modification conditions using Expires: and Last-Modified: HTTP headers. Each object is assigned a ``birthday'' time. An object goes through modification cycles of a given length. Modification and expiration times are randomly selected within each cycle. The corresponding parameters for the model are drawn from the user-specified distributions.
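
A minimal sketch of this model follows; the parameter names and the per-cycle seeding scheme are our own assumptions for illustration, not Polygraph's code:

        import random

        def lifecycle_times(oid, birthday, cycle_len, now):
            # The object ages in fixed-length modification cycles starting
            # at its "birthday"; modification and expiration instants fall
            # at random points within the current cycle. Seeding per
            # (oid, cycle) keeps the choice stable across repeated requests.
            cycle_idx = int((now - birthday) // cycle_len)
            rng = random.Random(hash((oid, cycle_idx)))
            cycle_start = birthday + cycle_idx * cycle_len
            last_modified = cycle_start + rng.random() * cycle_len
            expires = cycle_start + rng.random() * cycle_len
            return last_modified, expires   # feed Last-Modified:/Expires: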

The Life-cycle model configuration in PolyMix-3 does not utilize all the available features. We restrict the settings to reduce the possibility that a cache serves a stale response. While stale objects are common in real traffic, caching vendors strongly believe that allowing them into the benchmark sends the wrong message to buyers.

Consequently, all Polygraph responses in PolyMix-3 carry modification and expiration information, and that information is correct. The real-world settings would be significantly different, but it is difficult to accurately estimate the influence of these settings on cache performance.

Content Types

PolyMix-3 defines a mixture of content types. Each content type has the following properties:

The approximate parameters for the first four properties are given in the table below. For exact definitions, see the workload files.

Type Percentage Reply Size Cachability Expiration
Image 65.0% exp(4.5KB) 80% logn(30day, 7day)
HTML 15.0% exp(8.5KB) 90% logn(7day, 1day)
Download 0.5% logn(300KB,300KB) 95% logn(0.5year, 30day)
Other 19.5% logn(25KB,10KB) 72% unif(1day, 1year)
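
Below is a sketch of sampling from this mix, under our reading of the notation: exp(m) is an exponential distribution with mean m, and logn(m, s) a lognormal with mean m and standard deviation s. As a consistency check, the weighted mean reply size, 0.65·4.5 + 0.15·8.5 + 0.005·300 + 0.195·25 = 10.6 KB, is close to the ~11 KB overall mean quoted earlier:

        import math, random

        rng = random.Random(0)
        KB = 1024.0

        def logn(mean, sdev):
            # lognormal parameterized by its own mean/std-dev (an assumption)
            sigma2 = math.log(1 + (sdev / mean) ** 2)
            mu = math.log(mean) - sigma2 / 2
            return rng.lognormvariate(mu, math.sqrt(sigma2))

        CONTENT_TYPES = [  # (weight, name, reply-size sampler, P(cachable))
            (0.650, "Image",    lambda: rng.expovariate(1 / (4.5 * KB)), 0.80),
            (0.150, "HTML",     lambda: rng.expovariate(1 / (8.5 * KB)), 0.90),
            (0.005, "Download", lambda: logn(300 * KB, 300 * KB),        0.95),
            (0.195, "Other",    lambda: logn(25 * KB, 10 * KB),          0.72),
        ]

        def sample_reply():
            w, name, size, p = rng.choices(
                CONTENT_TYPES, weights=[t[0] for t in CONTENT_TYPES])[0]
            return name, size(), rng.random() < p   # type, bytes, cachable?
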
Latency and Packet Loss

PolyMix-3 uses the same latency and packet loss parameters that we used for PolyMix-2. The Polygraph client and server machines are configured to use FreeBSD's DummyNet feature.

We configure Polygraph servers with 40 millisecond delays (per packet, incoming and outgoing), and with a 0.05% probability of dropping a packet. Server think times are normally distributed with a 2.5 second mean and a 1 second standard deviation. Note that the server think time does not depend on the oid. Instead, it is randomly chosen for every request.

We do not use packet delays or packet loss on Polygraph clients.

If-Modified-Since Requests
Cache Hits and Misses

The PolyMix-3 workload has a 58% offered hit ratio. In the workload definition, this is actually specified through the recurrence ratio (i.e., the probability of revisiting a Web object). The recurrence ratio must account for uncachable responses and special requests. In PolyMix-3, a recurrence ratio of 72% yields an offered hit ratio of 58%. Note that, to simplify analysis, only ``basic'' requests are counted when hit ratio is computed; special requests (If-Modified-Since and Reload) are ignored because in many cases there is no reliable way to detect whether the response was served as a cache hit.
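
To see roughly how the two ratios relate, note that a revisited object can produce a hit only if its response was cachable. Ignoring the special-request corrections:

        0.72 (recurrence) x 0.80 (cachable) = 0.576, or about a 58% offered hit ratio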

Polygraph enforces the desired hit ratio by requesting objects that have been requested before, and should have been cached. There is no guarantee, however, that the object is in the cache. Thus, our parameter (58%) is an upper limit. The hit ratio achieved by a proxy may be lower if a proxy does not cache some cachable objects, or purges previously cached objects before the latter are revisited. Various HTTP race conditions also make it difficult, if not impractical, to achieve ideal hit ratios.

Object Popularity

PolyMix-3 introduces a ``hot subset'' simulation into the popularity model. At any given time, a 1% subset of the URL working set is dedicated to receive 10% of all requests. As the working set slides with time, the hot subset may jump to a new location so that all hot objects stay within the working set. This model is designed to simulate realistic Internet conditions, including ``flash crowds.'' We have not yet fully analyzed the effect of this hot subset model.
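
A sketch of the idea, under the assumption that hot objects form a contiguous 1% slice of the current working set (the function name and layout are ours, not Polygraph's):

        import random

        rng = random.Random(0)

        def pick_oid(ws_lo, ws_hi):
            # 10% of requests target a "hot" slice covering 1% of the
            # working set [ws_lo, ws_hi); as the working set slides, the
            # hot slice moves with it so hot objects never fall out.
            ws_size = ws_hi - ws_lo
            hot_size = max(1, ws_size // 100)
            if rng.random() < 0.10:
                return ws_lo + rng.randrange(hot_size)
            return ws_lo + rng.randrange(ws_size)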

Simulated Robots and Servers

A single Polygraph client machine supports many simulated robots. A robot can emulate various types of Web clients, from a human surfer to a busy peer cache. All robots in PolyMix-3 are configured identically, except that each has its own IP address. We limit the number of robots (and hence IP aliases) to 1000 per client machine.

A PolyMix-3 robot requests objects using a Poisson-like stream, except for embedded objects (images on HTML pages), which are requested in a way that simulates cache-less browser behavior. A limit on the number of simultaneously open connections is also supported, and may affect the request stream.

PolyMix-3 servers are configured identically, except that each has its own IP address.

Persistent Connections

Polygraph supports persistent connections on both client and server sides. PolyMix-3 robots close an ``active'' persistent connection right after receiving the N-th reply, where N is drawn from a Zipf(64) distribution. The robots will close an ``idle'' persistent connection if the per-robot connection limit has been reached and connections to other servers must be opened. The latter mimics browser behavior.

PolyMix-3 servers use a Zipf(16) distribution to close active connections. The servers also time out idle persistent connections after 15 sec of inactivity, just like many real servers would do.
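
A sketch of both close rules, assuming Zipf(n) here means picking N from {1..n} with probability proportional to 1/N (our reading of the workload notation; Polygraph's exact definition may differ):

        import random

        rng = random.Random(0)

        def zipf(n):
            # P(N = i) proportional to 1/i for i in 1..n (assumed semantics)
            ranks = range(1, n + 1)
            return rng.choices(ranks, weights=[1.0 / r for r in ranks])[0]

        robot_close_after = zipf(64)    # client closes after this many replies
        server_close_after = zipf(16)   # server-side active close
        # Servers additionally time out idle connections after 15 seconds.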

Other details

A detailed treatment of many PolyMix-3 features is available on the Polygraph Web site, along with the copies of workload configuration files.

5. Benchmarking Environment

5.1 Location

Compaq Computer Corporation provided facilities near Houston, Texas for the Cache-off. We are very grateful to Compaq for their willingness to host this event (again!) and for their hospitality. They provided us with more than enough room to work and play. However, some participants seemed disappointed about not being frisked every day and the lack of restroom escorts. Superb logistic coverage provided by Henry Guillen at Compaq helped us to run the cache-off smoothly.

5.2 Schedule

Testing took place from Monday, September 18 to Friday, September 22. As described in the rules, participants are guaranteed at least 55 hours of testing time. Vendors had access to the cache-off facility from 9 AM until 8 PM each day. Tests were often queued to run overnight.

5.3 Polygraph Machines

We rented 250 PCs for use as Polygraph clients and servers. These machines are Compaq Deskpro EN systems, each with a 500 MHz Pentium III CPU, 256 MB of RAM, an Intel Etherexpress PRO/100+ fast ethernet card, and an IDE disk.

We use FreeBSD-3.4 as the base operating system for the Polygraph clients and servers. We make a number of changes to kernel parameters in order to support PolyMix-3. We provide participants with a custom-built FreeBSD distribution to simplify the installation process for them and reduce the chance of configuration mistakes. This software is available to the public from our web page.

The number of Polygraph machines varies with the product under test. Peak request rates vary a lot among caching products. Thus, each participant informed us how many Polygraph client-server pairs they need to drive their cache at its maximum capacity.

During the cache-off, we never use more than 400 requests per second per machine for official tests.
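
For example, the highest peak rate in this report, 3,310 req/sec for the Dell-200x4 entry, implies at least nine client-server pairs (3310 / 400 = 8.3, rounded up).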

Each bench also has a monitoring PC connected to the harness network. That PC is used to start Polygraph runs, display run-time statistics, collect logs after the completion of a run, and generate Polygraph reports.

5.4 Time Synchronization

We run the xntpd time server on all Polygraph machines and the monitoring PCs. The monitoring PCs are synchronized periodically with a designated reference clock. We run xntpd on all machines rather than just synchronizing clients and servers before each test as was done during the first cache-off. While running xntpd could introduce small CPU overhead, we are concerned that without periodic synchronization, local clocks may drift apart during these long (15+ hours) tests.

5.5 Network Configurations

Each test bench consists of Polygraph machines, the monitoring PC, the participant's proxy cache(s), and a network to tie them together. The networking equipment falls under the participant's domain. That is, each participant is responsible for providing the networking equipment needed to connect the Polygraph machines to the caches. Furthermore, the networking equipment that the participant brings contributes to their costs in the price/performance results.

Participants must choose between a flat or routed network configuration. In the flat network, all systems are on a single IP subnet and use 255.255.0.0 as the netmask. While the flat network is easy to configure, it does not represent reality very well. Real origin servers are not on the same subnet as the caching proxy. The flat network configuration results in a lot of ARP traffic.

The routed network configuration uses two subnets. The clients, proxies, and monitoring PC use one subnet, while servers use the other. Proxies are allowed, but not required, to have an interface on the server subnet as well. Some device must provide IP routing between the two subnets. That device may in fact be the caching proxy. The routed network is somewhat more complicated to set up, but does reduce ARP traffic.

During the cache-off, only two products used a routed network configuration (iMimic-1300 and Dell-200x4). All of the others used a flat network.

The following figure shows a typical bench configuration, with a flat network:

Each Polygraph machine requires a fast ethernet port, so the participant must have enough ports to connect all of the Polygraph machines within the participant's cluster. The monitoring PC must have IP connectivity to all clients and servers at all times.

We run bidirectional netperf tests between each client-server pair to measure the raw TCP throughput. We also selectively execute Polygraph ``no-proxy'' runs to ensure that clusters can generate enough throughput to sufficiently drive the cache under test.

Before running any PolyMix-3 tests, we always test network throughput with netperf. All netperf tests showed satisfactory levels of raw network performance.

All ``no-proxy'' tests were successful, delivering desired throughput and negligible response time overheads.

5.6 Numbers

PCs rented: 255 + 5% spares
Vendors: 15
Humans: 35
Products tested: 31
Donuts consumed: 15 dozen
Floor space: 5600 ft², plus storage
20-amp power circuits: 40
6-outlet power strips: 95
Ethernet cables: approx 300

6. Test Sequence

This section describes the official testing sequence. The complete sequence was executed at least once against all cache-off entries.

6.1 PolyMix-3

PolyMix-3 is the main performance test which generates the vast majority of the reported numbers. This test is discussed in the ``Cache-off Workload'' Section.

Note that the cache is filled as a part of the PolyMix-3 workload. Depending on the product, this can take anywhere from four to 20 hours.

6.2 Downtime Test

The ``Downtime Test'' is performed only after a successful PolyMix-3 run. We use the downtime-3.pg workload and one client-server pair. During the first 10 minutes of the test, Polygraph creates a 3 req/sec load through the proxy. The power to all participant devices, including networking equipment, is then manually turned off. After about 5 seconds of ``downtime,'' the power is turned back on, and the measurement phase begins. We measure the times until the first miss and the first hit. Polygraph continues to emit 3 requests per second during the entire test. The precision of this test is around 5 seconds.

It is important to note that the cache(s) and networking gear are plugged into power strips. We turn off the power strips and not the equipment boxes to simulate realistic conditions of an unexpected software, hardware, or power failure. Vendors are not allowed to assist the reboot process. UPS devices of any kind are not allowed during this test.

We realize that the downtime test and execution rules are simple, if not primitive. However, even this test provides very useful data to cache administrators. Depending on the installation environment and reported cache performance, one can decide whether to invest in UPS systems and/or redundant configurations. We will work on improving the workload for this test.

6.3 MSL Test

As we described earlier, all entries must have a Maximum Segment Lifetime (MSL) of 30 seconds, producing a TIME-WAIT state of 60 seconds. Any product which fails this test is disqualified.

To determine the MSL on each product, we probe its TCP stack and monitor connection requests. If the system accepts a new connection with the same sequence number in under 60 seconds, it fails the test. The msl_test program is included in the Polygraph source distribution.

All cache-off entries reported TIME-WAIT state of 60 seconds.

7. Performance Details

This section gives a detailed analysis of major performance measurements. The PolyMix-3 workload has several phases. For the baseline presentation, we have selected the top2 phase. Top2 is the second 4-hour phase with peak request rate. The first peak phase, top1, often yields unstable results; the second top phase is usually more stable.

The bar charts below are based on data averaged across the top2 phase. Averages are meaningful in situations where performance does not change with time, or when changes are smooth and predictable. We encourage the reader to check individual entry report cards for the exceptional behavior where averages may be less meaningful.

As with any benchmark, Polygraph introduces its own overheads and measurement errors. We believe that the margin of error for most results discussed here is within 10%. In most cases, however, the reader should pay attention to patterns and relative differences in product performance rather than absolute figures.

Depending on the version of the report you have selected, the entries are charted in the alphabetical order or in the order of the corresponding measurement.

7.1 Normalized Throughput

Presenting throughput results is a tedious task. Due to tremendous differences in request rates, a simple graph with raw request rates from the ``Executive Summary'' table is not very informative. Moreover, comparing throughput of a $65K four-head cluster with a small $5K PC is usually not interesting. Product prices do vary a lot.

To begin your analysis, you might first pick out products that are in your price range:

Product Prices

You may also want to pick out products that meet your demands for HTTP traffic:

Raw Throughput

We next normalize the throughput results by some universal measure of product complexity and ability. Several measures have been proposed, including product price and rack space. While rack space normalization is an interesting idea, it presents problems for entries that were not tested in a rack-mountable configuration, so we select price as a normalizer.

We also need to choose which throughput metric to normalize: overall throughput, or cache hit throughput? Overall throughput is useful for capacity planning, if you know how many HTTP requests per second your users generate. However, it does not account for the fact that we are measuring caches here. A non-caching proxy could score well on normalized overall throughput, yet lose in all other categories. The normalized hit throughput, on the other hand, uses the rate at which the tested product can deliver cache hits.

Fortunately, the actual measurements show no significant difference between the two choices. The first three entries are the same on both charts (Microsoft-2, iMimic-1300, and IBM-220-1). The tail of the curve is also similar on both graphs, with the exception of the two IBM entries, which are positioned better on the request rate graph.

To emphasize the importance of caching traffic, we select the normalized hit rate graph for the baseline presentation:

Normalized Hit Rate

The normalized graph not only provides a fair comparison but answers an important question: ``How many hits per second can one thousand dollars buy?''

There appears to be no strong correlation between performance/price ratio and absolute throughput (or price): Products showing good return on a dollar can be found on both ends of the throughput scale.


Note that normalizing some of the performance measurements by product price is not needed and does not make much sense. For example, hit ratios and response times should approach some ``perfect'' level regardless of the request rate supported by the product. It is the closeness to that ``ideal'' that characterizes the quality of the product in this case, not the absolute value of the measurement. On the other hand, various throughput results, of course, do not have an ``ideal'' level and absolute throughput measurements are meaningful.

7.2 Hit Ratio

Hit ratio is a standard measurement of a cache's performance. PolyMix-3 offers a hit ratio of about 58% -- a cache cannot achieve a higher hit ratio in these tests. However, due to various overload conditions, insufficient disk space, deficiencies of object replacement policy, and other reasons, the actual or measured cache hit ratio may be smaller than the offered 58%.

Document Hit Ratio

The ``Document Hit Ratio'' chart shows how well a cache maintains its hit ratio under the highest load. Improving on the last cache-off's results, most vendors achieved good hit ratios, above 50%. Smaller cache sizes (relative to the request rates), however, did not allow IBM and Microbits to show hit ratios above 40%.

The Byte Hit Ratio (BHR) chart is less interesting because the PolyMix-3 workload does not accurately model the relationship between object size and popularity.

Cache ``Age''

The two primary reasons for losing hits are insufficient disk space and proxy overload conditions. The ``Cache Age'' chart below shows an estimated maximum age of an object purged from the cache (the objects are purged to free room for incoming fill traffic).

To estimate that maximum age, we divide cache capacity (as specified by the vendor) by the fill rate. The latter is the byte rate of the cachable-miss stream as measured by the Polygraph client. Raw fill stream measurements can be found in the individual entry reports. We believe that our formula yields a ``good enough,'' albeit not precise, approximation of real world measurements.
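
A back-of-the-envelope version of the estimate, using hypothetical numbers (the 11 KB default is the workload's overall mean reply size; actual fill rates come from the logs):

        def cache_age_hours(capacity_gb, fill_req_per_sec, mean_reply_kb=11.0):
            # cache capacity divided by the byte rate of the fill stream
            fill_bytes_per_hour = fill_req_per_sec * mean_reply_kb * 1024 * 3600
            return capacity_gb * 1024**3 / fill_bytes_per_hour

        # a hypothetical 100 GB cache absorbing 150 cachable misses/sec:
        print(round(cache_age_hours(100, 150), 1))   # -> 17.7 (hours)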

Cache Age

Many cache administrators believe that a production cache should store about 2-3 days of traffic. Due to the differences between the ``accelerated'' benchmarking environment and real-world conditions, the 2-3 days rule of thumb probably corresponds to some 10 hours of cache age.

The cache capacity requirement depends on your environment. When configuring a caching system based on our performance reports, make sure you get enough disk storage to keep sufficiently ``old'' traffic. You may have to increase the price and re-compute performance/price ratios if a product you are considering does not have enough storage. You should also check that the product is actually available with the additional disk space. These adjustments may significantly affect the choice of a price-aware buyer.

7.3 Response Time

To simulate real-world conditions, PolyMix-3 introduces an artificial delay on the server side. The server delays are normally distributed with a 2.5 sec mean and a 1 sec standard deviation. These delays play a crucial role in creating a reasonable number of concurrent ``sessions'' in the cache.

To simulate WAN server side connections, we introduce packet delays (80 msec round trip) and packet loss (0.05%). These delays increase miss response times and, more importantly, reward caches for using persistent connections (TCP connection setup phase includes sending several packets that also incur the delay).

The delays, along with the hit ratio, affect transaction response time. Ideal mean response time for this test is impossible to calculate precisely because the model is too complex. We estimate the ideal mean response time at about 1.3 seconds. Mean response time in a no-proxy environment is about 2.8 seconds.
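
As a rough sanity check, ignoring special requests and per-transaction overheads: with the 58% offered hit ratio, near-instant hits, and ideal misses at about 2.6 seconds (see below), the mean would be about 0.42 x 2.6 = 1.1 seconds; the remaining overheads bring the estimate up to roughly the 1.3 seconds quoted above.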

Absolute response time figures are important in understanding the benchmark environment, but are of little value when comparing the cache-off results with a given real-world setup. Indeed, every particular cache deployment will have different hit ratios and server-side delays. Thus, while providing the mean response time measurements as a reference, we select the ``Response Time Improvement'' graph for the baseline presentation.

Mean Response Time Improvement

The above graph shows the relative reduction of mean response time achieved by the cache compared to a no-proxy (direct) test, or: ``How much faster will an average reply be if a cache is deployed?'' That is, it shows the (direct - proxied)/direct ratio for mean response times. We hope that the ratios reported here will be close to the real-world performance of the tested products.
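
For example, an entry with a 1.94 sec mean response time scores roughly (2.8 - 1.94) / 2.8 = 31%, which is essentially the 30.9% time savings shown for Squid-2.4.D4 in the summary table (the exact no-proxy baseline differs slightly from 2.8 sec).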

Hit ratios affect, but do not define, response times. In an ideal scenario, it takes a negligible amount of time to deliver a cache hit to the client. Fast cache hits decrease average response times. In the same unrealistic scenario, it takes only about 2.6 seconds to deliver a cache miss. In practice, both hits and misses may incur significant overheads.

The hit and miss response time charts show that hits are primarily responsible for the differences in overall response time measurements.

A median response time chart is also available. Note that median response time should be interpreted with special care. The median is highly sensitive to the average document hit ratio (DHR), essentially reporting the hit response time whenever DHR exceeds 50%.

7.4 Downtime test

The downtime test is designed to estimate the time it takes a product to recover from an unexpected condition such as power outage or software failure. Polygraph measures the time until the first miss (TTFM, see the chart below) and the time until the first hit (TTFH).

Time Till First Miss

From a user's point of view, the time until the first miss is somewhat more important. As soon as the caching system is able to deliver misses, the user is able to access the Web again. Delivering hits is also important to reduce outgoing bandwidth usage and from a quality-of-service point of view.

IBM-230 and Stratacache-E were not able to complete the test. These products required manual intervention to reboot (pushing the power button) because the BIOS would not allow an automatic boot after the power was turned back on. Such intervention is prohibited by the cache-off rules.

Most entries show negligible delay between the time until the first miss and the time until the first hit. Recall that the precision of this test is around five to ten seconds.

8. Product Configurations

Here are the configuration details for all tested products.

Label | Full product name | Price (US$) | Available | Cache units | CPUs (n · MHz) | RAM (MB) | Cache disks (n · GB) | NICs (n · Mbps) | Rack space (RU) | Cache (GB) | Software
Aratech-2000 | Aratech Jaguar2000 | 13,150 | att | 1 | 1 · 800 | 1024 | 8 · 18 | 1 · 100 | n/m | 132 | FreeBSD 4.1
CinTel-iCache | CinTel PacketCruz iCache | 9,995 | att | 1 | 1 · 650 | 512 | 4 · 30 | 1 · 100 | 2 | 116 | FreeBSD 4.1, iMimic DataReactor Core v2.0b
Compaq-b17 | Compaq TaskSmart C-Series b17 | n/a | n/a | 1 | 1 · 800 | 512 | 2 · 18 | 2 · 100 | 1 | 24 | ICS 1.2.94
Compaq-C2500 | Compaq TaskSmart C2500-2 | 61,995 | 12/00 | 1 | 1 · 1000 | 4096 | 23 · 09 | 2 · 1000 | 9 | 189 | ICS 1.2.76
Dell-100 | Dell PowerApp.cache 100 | 2,902 | att | 1 | 1 · 650 | 256 | 2 · 09 | 1 · 100 | 1 | 13 | ICS 1.2.94L
Dell-200x4 | Dell PowerApp.cache 200 Load Balanced Cache Array | 50,876 | att | 4 | 4 · 866 | 4096 | 16 · 18 | 12 · 100 | 10 | 236 | ICS 1.2.94L
F5-EDGE-FX | F5 Networks EDGE-FX Cache 1.0 | 9,900 | att | 1 | 1 · 550 | 512 | 4 · 30 | 1 · 100 | 2 | 114 | FreeBSD 4.1, iMimic DataReactor Core v2.0b
IBM-220-1 | IBM eServer xSeries 220 #1 | 4,108 | 10/00 | 1 | 1 · 800 | 389 | 2 · 18 | 1 · 100 | n/m | 30 | ICS 1.2.94
IBM-220-2 | IBM eServer xSeries 220 #2 | 5,856 | 10/00 | 1 | 1 · 866 | 512 | 3 · 09 | 1 · 100 | n/m | 22 | ICS 1.2.94
IBM-230 | IBM eServer xSeries 230 | 16,998 | 10/00 | 1 | 1 · 1000 | 1024 | 6 · 09 | 1 · 1000 | 5 | 42 | ICS 1.2.94
IBM-330 | IBM eServer xSeries 330 | 7,128 | 10/00 | 1 | 1 · 733 | 256 | 2 · 09 | 2 · 100 | 1 | 13 | ICS 1.2.94
iMimic-1300 | iMimic DataReactor 1300 | 6,395 | 11/00 | 1 | 1 · 667 | 512 | 3 · 46 | 2 · 100 | 1 | 132 | FreeBSD 4.1, DataReactor Core v2.0b
iMimic-2400 | iMimic PenguinCache 2400 | 9,995 | 11/00 | 1 | 1 · 866 | 512 | 4 · 18 | 1 · 100 | 2 | 70 | RedHat Linux 6.2, DataReactor Core v2.0b
iMimic-2600 | iMimic DataReactor 2600 | 18,995 | 11/00 | 1 | 1 · 866 | 1024 | 6 · 36 | 1 · 1000 | 2 | 210 | FreeBSD 4.1, DataReactor Core v2.0b
iMimic-Alpha | iMimic/Alpha Content Accelerator | 29,995 | 11/00 | 1 | 1 · 833 | 2048 | 9 · 36 | 1 · 1000 | n/m | 315 | FreeBSD 4.1, DataReactor Core v2.0b
Lucent-50 | Lucent imminet WebCache 50 | 9,950 | att | 1 | 1 · 600 | 512 | 2 · 20 | 1 · 100 | 2 | 36 | FreeBSD 4.1, proprietary
Lucent-100 | Lucent imminet WebCache 100 | 14,950 | att | 1 | 1 · 700 | 768 | 3 · 18 | 1 · 100 | 2 | 52 | FreeBSD 4.1, proprietary
Lucent-100z | Lucent imminet WebCache 100z | 19,950 | att | 1 | 1 · 800 | 1024 | 5 · 18 | 1 · 100 | 2 | 88 | FreeBSD 4.1, proprietary
Microbits-C | Microbits Business (C-2-H) | 5,430 | 12/00 | 1 | 1 · 600 | 320 | 1 · 09 + 1 · 18 | 1 · 100 | n/m | 23 | ICS 1.2.94.04
Microbits-P | Microbits Pizza Box (P-1-E) | 2,150 | att | 1 | 1 · 266 | 128 | 1 · 06 | 1 · 100 | n/m | 5 | ICS 1.2.94.04
Microsoft-1 | Microsoft Internet Security and Acceleration Server #1 | 47,991 | 12/00 | 1 | 4 · 700 | 4096 | 15 · 18 + 1 · 09 | 1 · 1000 + 1 · 1000 | n/m | 260 | Windows 2000 Server
Microsoft-2 | Microsoft Internet Security and Acceleration Server #2 | 4,807 | 12/00 | 1 | 1 · 667 | 384 | 2 · 20 + 2 · 18 | 1 · 100 | n/m | 56 | Windows 2000 Server
NAIST-1 | NAIST Kotetu v1.0 #1 | 4,474 | 12/00 | 1 | 1 · 733 | 1024 | 2 · 36 | 1 · 100 | n/m | 9 | RedHat 6.2, Kotetu v1.0
NAIST-2 | NAIST Kotetu v1.0 #2 | 13,833 | 12/00 | 1 | 2 · 866 | 1024 | 6 · 36 | 1 · 1000 | 6 | 8 | RedHat 6.2, Kotetu v1.0
NetApp-C1105 | NetApp C1105 | 10,950 | 10/00 | 1 | 1 · 433 | 512 | 2 · 36 | 2 · 100 | 1 | 64 | NetCache 5.0
NetApp-C6100 | NetApp C6100 | 100,500 | 10/00 | 1 | 1 · 733 | 3072 | 14 · 18 | 1 · 1000 | 11 | 226 | NetCache 5.0
Squid-2.4.D4 | Squid Version 2.4.DEVEL4 | 4,008 | att | 1 | 2 · 550 | 512 | 6 · 08 | 1 · 100 | n/m | 24 | FreeBSD 4.1, Squid-2.4.DEVEL4
Stratacache-D | Stratacache Dart D-20 | 699 | 11/00 | 1 | 1 · 200 | 256 | 1 · 20 | 1 · 100 | n/m | 18 | ICS 1.2.94 Micro Edtn
Stratacache-E | Stratacache Express E-55 | 3,295 | att | 1 | 1 · 500 | 256 | 1 · 18 | 1 · 100 | n/m | 16 | ICS 1.2.94
Stratacache-F | Stratacache Flyer F-110 | 6,995 | att | 1 | 1 · 650 | 512 | 2 · 18 | 1 · 100 | 1 | 33 | ICS 1.2.94
Swell-1450 | Swell CPX 1450 | 2,679 | 11/00 | 1 | 1 · 800 | 512 | 3 · 15 | 1 · 100 | 2 | 12 | Linux 2.4.0 test8

9. Cache-Off Controversies

Controversies are a regular feature of cache-off benchmarking, and this one is no exception. While we always do our best to learn from the past, inevitably new and unexpected situations arise. This time, there are three problems that are worth mentioning here.

9.1 Software Licensing Costs

The controversy that required much of our attention after the cache-off has to do with Stratacache's Dart D-20 entry. Stratacache has priced the product for the small/home user market. The product is sold with a license that restricts it to ten users or fewer. Its tested throughput of 120 req/sec is much larger than ten users would typically generate.

Everyone knows that there is a correlation between licensing terms and price. For many licensed products, the licensing cost increases with the number of ``users.'' This is true for many types of licenses, such as the right to show a Hollywood movie, the right to broadcast music, and the right to use software. Some products, such as routers, do not carry per-user costs.

The central issue for us is this: Is it fair to compare the performance/price ratios of a product with a limited license to a product with an unlimited license, when the license restrictions are not enforced by the benchmark? A product with a limited license is less expensive than the unlimited license product. Thus, it has some advantage in performance/price calculations. If nothing else, this lessens the usefulness of price as a normalizer.

To continue exploring the issues, consider the following facts:

Although not necessarily a fact, people generally agree that ten users do not normally generate 120 sustained HTTP requests per second. Real numbers are difficult to come by, but one req/sec/user seems much closer to reality. Although the Stratacache Dart allows the cache to sustain 120 HTTP req/sec for long periods of time, it is unlikely that a 10 user environment would produce that consistent level of requests.

At the same time, it may be possible for a small number of users to generate high burst rates of traffic for short periods of time. Pages with many embedded images can generate tens of req/sec/user for 2-3 seconds.

Stratacache and others believe that the cache-off should simply report the raw performance of a product, regardless of its licensing terms. They say that ``PolyMix and the cacheoffs are there to judge raw performance, not how or who we sell to in the market.'' They asked us, ``do the rules of the cacheoff require unlimited user licenses?''

Indeed, there is no such rule. PolyMix and the cache-off rules to date have never addressed licensing terms. It is something that we had failed to consider.

Stratacache also makes the point that you, the readers, are smart enough to figure out which products are appropriate for your environment.

Many of the cache-off participants feel that the way to ``win'' is to have the best performance/price ratio. The Stratacache Dart D-20 would have the highest ratios if we ignore the issue regarding its limited license. This fact probably had an undesired consequence for Stratacache. It drew attention to their entry and caused other participants to question the validity of the test.

We feel there is a good case for the argument that a straight performance/price comparison is unfair. We are concerned that readers are likely to be misled by such a comparison.

In thinking about this issue, we found ourselves to be in a very difficult position. If we allow the Stratacache Dart D-20 to use its original result, we are misleading readers into thinking that a fair comparison can be made. Furthermore, it opens the door for other companies to take advantage of licensing loopholes with PolyMix-3.

On the other hand, we are entering into dangerous, uncharted territory. If we say the comparison is unfair, it implies that we are policing the licensing terms of cache-off entries -- something we have never had to do before. What business is it of ours whether or not Stratacache's customers abide by their licensing terms? It also implies that we are willing to address the issue with the Polygraph workload in upcoming tests. That may require a workload with clear definitions of users that closely approximate real world users.

During this time we considered many different options, including:

In the end, we decided to let Stratacache use the price based on their ten user license, and also use their 120 req/sec throughput result, but to remove all of the performance/price ratios for the Dart D-20. The other measurements remain, so anyone with a calculator can figure out what the ratios would be.

Stratacache feels that they are being penalized for being innovative. Because they thought of something unique and took advantage of it, the ``losers'' are complaining and questioning the fairness of Stratacache's result.

This controversy forced us to make a decision that we would rather not worry about. No matter what our decision is, there are likely to be some negative consequences. We will need to address this issue in future caching proxy workloads and testing rules.

If you have an opinion on this controversy, we welcome your comments. You can reach us at the email address given at the top of this report.

9.2 Preparedness

When vendors arrived for testing on Monday, our team did not have all of the benches ready. Many vendors had to wait while we scrambled around making final preparations. The situation was inexcusable and we can only promise to do better next time.

Even if all benches had been fully operational by Monday morning, the situation would still be difficult because of the high demand to start tests. Starting a test takes about 15 minutes. With 30 benches and two of us starting tests, it takes many hours to get all tests going. Later in the week, tests are naturally staggered and there is usually plenty of time to help everyone.

Even though some vendors had to wait quite a while, we were able to start PolyMix-3 tests on almost all of the products by Monday night. And while emotionally unpleasant, the situation was within our rules that guarantee every vendor 55 hours of testing time. This rule is specifically designed to address high-demand situations without increasing participation costs.

9.3 Dummynet Configuration

Due to a bug in our scripts, we had to invalidate all tests started Monday and some that were started Tuesday morning. The folks from iMimic and F5 noticed that response times were a little bit too low and that the Dummynet pipes did not get configured correctly.

After conferring with all participants, we decided to throw out all previous PolyMix-3 tests and start over. Although those 12-18 hours of testing time were ``wasted,'' every participant received enough testing time during the remainder of the week.

10. Comments

10.1 Polyteam Comments

Lucent

The Lucent imminet WebCache software has a problem that we discovered after publication of the report. Working in cooperation with Lucent, we have the following explanation.

The HTTP protocol describes the operation of the If-Modified-Since request-header field. The standard requires that a 200 response be sent if an object has been modified since the If-Modified-Since value in the request header.

The Lucent imminet WebCache products use the ``Expires'' information rather than the required ``Last-Modified'' information to determine whether the client copy of an object is stale. This results in ``304 Not Modified'' responses being issued instead of ``200 OK'' responses.

We estimate that about 10% of all requests are affected by the bug. The actual effect on performance is unknown and is difficult to estimate accurately.

Polygraph version 2.5.4 is not able to detect false 304 hits during run-time. Postmortem analysis of the statistics is currently the best way to discover such a problem. We performed this analysis at the request of another cache-off participant who noticed anomalies in Lucent's ``report cards.''

According to Lucent, the problem does not exist in any code already delivered to customers, and it will be fixed before the affected code is delivered to any customer.

10.2 Vendor Comments

It is a Polyteam tradition to give cache-off participants a chance to comment on the results after they have seen the review draft. The comments below are verbatim vendor submissions. Polyteam has not verified any of the claims, promises, or speculations that these comments may contain.

ARA Network Technologies Co., Ltd.
http://www.aranetwork.com

ARA Network would like to thank Polyteam for the opportunity to demonstrate that Jaguar2000 ranks as one of the best in overall performance among the various caching product entries.

Jaguar2000 raised the class of performance up to a high-end level while remaining as mid-range price. This philosophy will continue to be reflected in each of ARA Networks caching products.

ARA Network has been concentrating on the development of the best performing caching software in the industry. The outstanding performance of Jaguar2000 at the third Cache-off proved that ARA Network obtains industry-leading technology. Based on this technology, ARA Network has additionally developed the prototypes of Streaming media cache, Jaguar/MediaFlow, and the patent-pending client cache, Jaguar/BrowserCache. The delivery of these products is scheduled, along with Jaguar2000, expected to be available for purchase shortly for mission critical caching needs. The distribution model of Jaguar2000 will have better price/performance ratio covering various product range. ARA Network plans to distribute its distinguished caching products in partnership with server vendors and worldwide resellers. ARA Network looks forward to showcasing additional members of the Jaguar2000 family as well as other high quality caching products at future Cache-offs. ARA Network plans to expand its caching products ranging from low to high-end, gigabit Ethernet, streaming media caching products and CDD systems that have many additional attributes. Especially, database entry caching product is under development.

CinTel
http://www.cintel.co.kr

The CinTel iCache successfully completed the PolyMix-3 tests with flawless, top-tier performance. The competitive price of this caching server, combined with outstanding overall performance, makes it an excellent caching solution for a broad spectrum of individuals and businesses, offering an affordable blend of power and manageability.

CinTel thanks Polyteam for their dedicated efforts in developing the Polygraph benchmarking software and coordinating the Cache-off event to a successful conclusion. We look forward to showcasing the iCache, with features and performance enhanced through our dedication to R&D and strategic alliances, at future TMF Cache-offs.

Compaq Computer Corporation
http://www.compaq.com/tasksmart/

The TaskSmart C-Series server (C2500-2) is a carrier-class solution offering the performance, scalability, availability, and manageability demanded in large data center environments. The TaskSmart C-Series server demonstrates leading web acceleration performance with a throughput of 2,398 req/sec. This was achieved with a phenomenal cache hit rate of 53.39% and a low response time of 1.4 seconds. It is optimized to handle as many as 2,706 req/sec with only a negligible change in user response time. The TaskSmart C-Series is therefore optimal for the much higher-volume traffic conditions that occur from time to time on large networks. For environments requiring still higher performance, Compaq TaskSmart C-Series servers can be clustered behind a Layer 4 switch. Compaq has previously demonstrated near-linear scalability of performance on up to 8 TaskSmart C-Series servers.

TaskSmart C-Series servers come with a full complement of hot-swap redundant components, enabling the server to continue operating through the loss of critical components such as a power supply, disk drive, or drive cooling fan. Further fault tolerance can be achieved by clustering two or more servers.

Manageability has become one of the leading success criteria in today's fast-changing, highly competitive business environment. All C-Series servers deliver industry-leading manageability with Compaq Insight Manager XE. The solution comes with all the necessary software and hardware in one integrated, optimized, and tuned package for fast and easy deployment within an existing network. Additionally, initial TaskSmart C-Series server configuration can be done remotely with Compaq's Offline Configuration Utility, removing the need for on-site technical expertise and additional equipment to set up the server.

The TaskSmart C-Series offers an unbeatable web acceleration solution for today's Internet economy.

Dell
http://www.powerapp.com

The Dell PowerApp.cache solutions are clear examples of our commitment to high-performing, value-priced Internet infrastructure appliances. The PowerApp.cache 200 ``Load Balanced Cache Array'' (Dell 200x4) demonstrated the highest peak throughput, obtaining 33% more requests per second than the next highest entry while maintaining excellent secondary performance metrics. This is a premier example of linear cache scalability, suitable for very large network environments. The PowerApp.cache 100 (Dell 100) is our initial cache building block, allowing a growing organization to purchase a superior price/performance cache system and grow it into a scalable caching solution. Both the 1U PowerApp.cache 100 and the 2U PowerApp.cache 200 are currently available for purchase.

Dell is committed to continuing to improve the overall value of the Dell PowerApp appliance product offering. Our current strategy for adding value to the PowerApp product line includes:

Dell PowerApp appliances are available with Dell's "Business Care" support program including:

In conclusion, we firmly believe that Dell is the best choice for all of your caching and Internet infrastructure needs. If you have any questions, please visit our web site at http://www.powerapp.com or contact a sales representative at 1-800-BUY-DELL.

F5 Networks
http://www.f5.com

F5 Networks is pleased to have participated in the TMF Cache-off with its recently launched EDGE-FX Cache. For real-world performance, it is useful to consider the following points.

We would propose a chart that plots value against performance (i.e., requests/second/$1,000 vs. requests/second). The normalized throughput chart (requests/sec/$1,000) is useful for seeing the relative value of products, but a value-vs-performance chart would allow you to pick a performance range and look for the best-value solution within that range. At 79 requests/sec/$1,000, the F5 EDGE-FX Cache is a top performer among products in its range.
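As a rough sketch of the metric in question, normalized throughput divides measured throughput by list price in thousands of dollars; the entries below are illustrative, not actual cache-off figures:

    # Normalized throughput: requests/sec per $1,000 of list price.
    # The entries are made up for illustration only.

    entries = {
        # name: (throughput in req/sec, list price in dollars)
        "cache-A": (420, 7000),
        "cache-B": (640, 12000),
        "cache-C": (1900, 19000),
    }

    def value(throughput, price):
        """Requests per second per $1,000 of price."""
        return throughput / (price / 1000.0)

    # F5's proposed chart would pick the best value within a chosen
    # performance range rather than across all products:
    low, high = 400, 700
    best = max(
        (name for name, (tps, _) in entries.items() if low <= tps <= high),
        key=lambda name: value(*entries[name]),
    )
    print(best, round(value(*entries[best]), 1))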

The EDGE-FX Cache excels in overall response time: it is among the top five vendors in this category at 1.43 seconds. From a Web user's perspective, what matters most is the overall response time of a Web site. To excel at overall response time, a cache must be fast both at retrieving cached objects and at making proxy connections to a Web server.

For the EDGE-FX Cache deployed in an environment that had no caching solution before, the Cache-off results predict a 49% improvement in response time. Other vendors range from 29% to 51% improvement in Web site response time. A 50% improvement in response time is the real-world result the EDGE-FX Cache aims to achieve.
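One plausible way to arrive at such a figure, assuming the measured response time for cache misses approximates the no-cache baseline (our assumption, not F5's stated method), is:

    # Predicted improvement if the no-cache baseline equals the miss time.
    # miss_time is illustrative; overall_time is F5's published figure.

    miss_time = 2.80     # seconds, assumed mean response time for misses
    overall_time = 1.43  # seconds, overall mean response time

    improvement = 1.0 - overall_time / miss_time
    print(f"predicted response-time improvement: {improvement:.0%}")  # ~49%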

We are pleased to note that Dell's cluster entry used the Dell PowerApp BIG-IP (based on F5 Networks' local-area load-balancing BIG-IP software) to achieve scalability and high performance. We will be benchmarking and releasing our own EDGE-FX Cache cluster with the BIG-IP Controller to demonstrate a high-performing, scalable F5 load-balanced caching solution. Additionally, as an OEM partner of iMimic, we are pleased with the strong results generated by the iMimic entries.

F5 would like to thank Polyteam and we look forward to future benchmarking contests.

IBM
http://www.pc.ibm.com/us/netfinity/ics

New IBM eServer xSeries Cache Solution delivers winning Internet cache performance.

Demonstrating outstanding performance in Internet caching services tested by TMF, IBM's new xSeries Cache Solution was among the top performers in its price range.

"The Measurement Factory test team did an outstanding job of running a fair and professional evalutation", said Greg Young, Product Manger for IBM's xSeries Cache Solution. "Customers are sure to benefit from having an objective method to compare vendor results."

More on the test

The IBM eServer xSeries 220 was among the top performers in its price range. During the grueling four-day test, the new eServer xSeries performed flawlessly, with no spikes or downtime. This new-generation server, with superior capacity, sets the standard for e-business cache solutions targeted at small to medium-sized businesses, enterprise networks, and NetGen companies.

For more information on the entire line of eServer xSeries Cache Appliance Solutions, visit http://www.pc.ibm.com/us/netfinity/ics

iMimic Networking, Inc.
http://www.imimic.com

iMimic Networking is very pleased to showcase the ongoing technical leadership of the iMimic DataReactor Core Web caching software technology. Through our own DataReactor and PenguinCache appliances, and in conjunction with our OEM and hardware partners, the DataReactor Core software powered six of the entries in the Cacheoff. The results clearly show that these entries provide the best combination of price/performance, response time, and hit ratios while setting new records in throughput and rack density.

iMimic DataReactor Core provides a new level of portability for high-performance systems. DataReactor Core was the only Web caching software system to power Cacheoff entries representing 2 different architectures (x86 and Alpha) and also the only high-performance Web caching software system shown supporting 2 different operating systems (FreeBSD and Linux). The following paragraphs describe each of the systems shown by iMimic Networking, Inc.

The iMimic DataReactor 2600 sets a new performance record for rackmount systems with a 2U form factor, achieving over 1900 requests per second -- nearly twice the throughput of the nearest 2U competitor. With the best space-efficiency at the Cacheoff, the best price-performance for Gigabit-class caches, and the best mean response time for Gigabit-class caches, the iMimic DataReactor 2600 is ideally suited for high-performance service providers and demanding end users alike.

The iMimic DataReactor 1300 sets a new performance record for 1U systems, not only topping the throughput of the nearest 1U competitor by over 70%, but also achieving greater throughput than all but one 2U cache powered by other software systems. The DataReactor 1300 was also exhibited in stand-alone transparent caching mode with no degradation in either throughput or mean response time and no external hardware requirements for transparency.

The iMimic PenguinCache 2400 breaks new ground as the first high-performance Linux Web caching solution, achieving not only 6 times the throughput of the nearest Linux competitor, but also better mean response time and price-performance as well.

The iMimic/Alpha Content Accelerator is an ideal engine for high-end content-delivery. Featuring the Alpha Processor, Inc. (API) UP2000+ with a high-performance Alpha 21264 processor and memory system, the iMimic/Alpha Content Accelerator achieves better throughput than any other system priced up to twice as high while also maintaining the best cache hit response time among Gigabit-class caches. It is ideally suited for high-performance in both forward and reverse-proxy modes of operation.

DataReactor Core powered a total of 6 Cacheoff entries representing two levels of performance. In the Enterprise class of Web caches (500-1000 requests per second), iMimic DataReactor Core powered the caches with the three best mean response times, 4 of the top 5 price-performance metrics, and 4 of the top 5 hit ratios.

In the Gigabit class of Web caches (1000+ requests per second), the caches powered by iMimic DataReactor Core achieved the two best mean response times, the two best price-performance metrics, and 2 of the top 3 hit ratios.

iMimic remains committed to affordable high-performance Web caching solutions. DataReactor Core powered the 4 highest performance caches with list prices under $10,000 (CinTel PacketCruz iCache, F5 EDGE-FX, iMimic DataReactor 1300, and iMimic PenguinCache 2400), the highest performance cache under $20,000 (iMimic DataReactor 2600), and the 2 highest performance caches under $40,000 (iMimic/Alpha Content Accelerator and iMimic DataReactor 2600).

The DataReactor Core Web caching software achieves a unique combination of performance metrics and an unparalleled level of disk and resource efficiency. This fully-featured next-generation Web caching technology is available now for OEM licensing -- supercharge your Web cache with DataReactor Core!

iMimic Networking thanks Polyteam for their efforts in improving the Polygraph benchmarking software and coordinating the Cacheoff event. Please visit our web page for more information on iMimic Networking and our DataReactor web caching solutions.

Lucent Technologies
http://www.lucent.com/serviceprovider/imminet/

Lucent Technologies is pleased to be able to demonstrate the speed and performance of its imminet WebCaching products at this important event. Not only are the imminet products among the fastest in the world, they deliver a full range of performance characteristics at a competitive price. Our WebCache 50 product, designed for the medium sized ISP, can easily handle 6 T1 lines of traffic. Our WebCache 100 is designed to handle a full T3 line worth of traffic, and the WebCache 200 (referred to as the WC 100Z in the report) can handle two full T3 lines of traffic.

All of the products are available now and feature a compact, stackable, rack-mountable 2U form factor. In addition, accurate caching rules for HTTP and FTP, address bypass, a 10-year mean time between failures, administrative control through both a secured Web interface and a command-line interface, activity logging, and usage statistics complement their performance characteristics. Lucent's WebDirector products can be used to cluster the WebCaches into OC-3, OC-12, and server-accelerator solutions. Lucent's sales and service experience backs all of these products.

Contact imminet@lucent.com for more information.

Microbits
http://intelliapp.microbits.com.au

Microbits is once again pleased to participate in the IRCache bakeoff.

The Microbits Intelli-App range of caching appliances provides a true appliance approach: installation in minutes, no specific OS expertise required, low maintenance, and fault-tolerant operation.

With a proven track record (and having survived two bakeoffs!), the Microbits Pizza Box (P-1-E) continues to provide a low-cost, high-performance, value-for-money solution for organisations with Internet access of less than 10Mbit, the majority of users in the world.

The Intelli-App C-2-H is a new model in the Intelli-App range and complements the other six models, catering to a wider range of deployment scenarios.

By achieving one of the top price/performance figures in this bakeoff, the C-2-H offers excellent value for money among low- to mid-range caching solutions.

Microsoft Internet Security and Acceleration Server 2000
http://www.microsoft.com/ISAServer

Microsoft Internet Security and Acceleration Server (ISA Server) is Microsoft's new enterprise firewall and web caching server that provides the scalable web performance, security and policy-based management required by today's Internet-enabled businesses.

In the Third TMF Web Cache-Off, ISA Server demonstrated excellent price/performance, fast web content delivery, and significant bandwidth savings.

Excellent Price/Performance: ISA Server was the overall leader in the Price/Performance category, establishing the ability to deliver fast Internet access and lower network infrastructure cost at a superb value.

Fast Web Content Delivery: ISA Server's Peak Throughput results secured its position in the top five overall performance leaders. ISA Server will scale even further with its multi-processor design, optimized cache store, and efficient CARP (Caching Array Routing Protocol) clustering to meet the needs of the most demanding Internet environments.

Significant Bandwidth Savings: Higher hit ratios translate into bandwidth cost savings and improved web response by delivering more content from the cache rather than from the origin server. A document hit ratio of 55.2%, out of an attainable maximum of 57%, demonstrates the ability to deliver exceptional savings and performance.
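As a quick arithmetic check (ours, assuming 57% is the workload's attainable maximum), that measurement captures nearly all offered hits:

    # Share of attainable document hits achieved by ISA Server.
    offered = 0.57    # assumed maximum document hit ratio the workload offers
    measured = 0.552  # measured document hit ratio
    print(f"share of attainable hits achieved: {measured / offered:.1%}")  # 96.8%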

In addition to the fast web cache, ISA Server raises the bar by integrating firewall security, advanced caching features, policy-based management and an extensible architecture. Features include:

With ISA Server, enterprise customers have a solution that provides exceptional Web access performance, strong Internet security and centralized management.

NAIST
http://infonet.aist-nara.ac.jp/products/kotetu/

NAIST would like to thank the Polyteam for the opportunity to test our system.

Kotetu is a prefetching proxy server designed for small and medium-sized workgroups. Since it is free software and runs at user level, users can choose their own platform.

For the third Cache-off, we participated with two entries running the disk-based Kotetu. The prices of our entries in this report are street prices, because Kotetu is not packaged for sale. The first entry (NAIST-1) was small and cheap; the second (NAIST-2) was larger and more expensive. NAIST-1 was not able to reach the end of the PolyMix-3 test. NAIST-2 reached the end, but its results did not satisfy our goals.

Kotetu is a brand-new program, and we are developing a new version; a better version will be available in the near future via our WWW page. A memory-based Kotetu is also available, focused on throughput and short-term hit ratio with prefetching.

Network Appliance
http://www.netapp.com/products/netcache/

Network Appliance continues to support the development of Polygraph as the leading vendor-independent benchmark for caching performance. We are pleased with the development of this benchmark; PolyMix-3 is much more demanding than any previous Polygraph workload and more accurately reflects how a cache will perform in a real-world deployment. Both our entry-level and high-end solutions had excellent HTTP performance. More importantly, our content delivery appliances provide this scalable performance with an advanced feature set that has made them the solution of choice for major telecom providers, Fortune 500 companies, and CDNs around the world. These advanced features include an end-to-end content delivery solution, industry-leading streaming media performance, and a full suite of additional value-added services.

squid-cache.org
http://www.squid-cache.org

When looking at these Squid results, please keep the following points in mind:

The people who work on the Squid code, and who bring it to the cache-off, also work for The Measurement Factory. There is a potential conflict of interest here.

Due to other demands, Squid tests were not started until Wednesday afternoon. By the time we left Houston on Saturday morning, Squid did not have any successful tests and had not accumulated 55 hours of testing. We completed testing upon our return to Boulder, Colorado.

After the last test was started, we discovered a rather serious bug in the code that skews the way files are distributed across cache disks. The first disk receives too many files, and the last disk too few. This causes a significant performance degradation because the low-numbered disks are over-utilized relative to the high-numbered ones. With this bug fixed, Squid is able to achieve 150 TPS under PolyMix-3.
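The effect of such a skew is easy to see with a toy model (not Squid's actual disk-selection code); a biased chooser overloads the low-numbered disks, while a round-robin chooser spreads files evenly:

    # Toy model of skewed vs. even file placement across cache disks.
    # Not Squid's code; the bias here merely mimics the reported symptom.

    import random
    from collections import Counter

    DISKS = 4

    def biased_choose(_):
        # Taking the minimum of two uniform draws favors low disk numbers.
        return min(random.randrange(DISKS), random.randrange(DISKS))

    def even_choose(i):
        # Round-robin placement spreads files evenly.
        return i % DISKS

    for choose in (biased_choose, even_choose):
        counts = Counter(choose(i) for i in range(100_000))
        print(choose.__name__, sorted(counts.items()))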

Also note that the equipment and software used in this, the third, cache-off is largely similar to what we used for the previous test in January 2000. The only differences are faster processors and changes to the software.

Stratacache Dart
http://www.stratacache.com

"Those who hit the beach first catch the most lead"

The Stratacache Dart D-20 is the first high-performance, micro-site caching appliance designed for the broadband-attached home, small business, or SOHO environment. This product is revolutionary because it takes high-performance caching, content acceleration, and content delivery not only to the last mile, but to the last foot. The Stratacache Dart is the ``missing link'' infrastructure component necessary to complete the rich, multimedia-based broadband home/small-business network of the future. With roughly 5 million households having broadband connectivity today and projections upward of 54 million home and small-business broadband users by 2004, this is a highly targeted product focusing on an explosive market segment.

Unfortunately, our Dart appliance was revolutionary enough to disturb the majority of the other vendors involved in the Cache-off. At $699 running 120 TPS, the Dart delivered 153 requests per second per $1,000 and 86 hits per second per $1,000 while sustaining a 56% hit rate. These performance figures win the top spot for hit rate/price per $1,000, but since they come from a non-conformist product, they are omitted from the performance chart.

The Stratacache Dart ships with a standard Fast Ethernet connection. Closely following the initial Dart release are models with optional integrated router interfaces for DSL, Cable Modem, Wireless or Satellite connectivity plus 802.11 or Bluetooth local site connectivity options.

Stratacache provides a comprehensive line of high-performance, feature-rich content acceleration products based on the leading ICS Software Technology. Our appliances serve the full range of acceleration solutions, from $699 at the micro site up to $130,000 for large carrier-class appliance units.

For additional information, contact us directly at (800) 244-8915 or (937) 224-0485, or visit our web page at www.stratacache.com.

Stratacache Express and Stratacache Flyer
http://www.stratacache.com

Caching appliances are a key component in an overall content acceleration strategy, but they are not the only factor in enabling a high-performance Internet/intranet/extranet experience. In choosing the appliance best suited to your enterprise network environment, consider not only the raw performance of the product, the base hit rate, or the amount of content storage; rather, judge the appliance on its features and capabilities relative to your enterprise network needs.

The Stratacache product line is based on a high-performance hardware and software architecture providing excellent throughput, fast delivery, and a solid data hit rate, but we truly excel in the marketplace with our extensive product features and quality pre- and post-sales engineering and technical support.

Base product features including user authentication and control, system logging, easy appliance setup and management, robust performance monitoring, and active content pre-fetch provide the foundation for a great product. Optional features such as integrated Layer 4 switching, content filtering, Quality of Service based acceleration, content management and auto-distribution, plus integrated satellite network product enhancements provide targeted solutions to many common problems not resolved with traditional caching products.

Stratacache also leverages our relationships with key connectivity providers, content distribution networks and public Internet transport acceleration providers to optimize the delivery of relevant Internet content to the enterprise network environment.

Stratacache provides a full line of high-performance content acceleration appliances based on the leading ICS Software Technology. Our appliances range from $699 at the micro site, through $3,000 to $30,000 for the enterprise network, and up to $130,000 for large carrier-class units. We also provide a full range of products for web server, secured content (SSL), and streaming media acceleration.

For additional information, contact us directly at (800) 244-8915 or (937) 224-0485, or visit our web page at www.stratacache.com.

Swell Technology
http://www.swelltech.com

Swell Technology greatly enjoyed this, our second, web cacheoff event. The Measurement Factory once again provided a great service to the IT community by providing a fair and accurate proving ground for web caches.

We appreciate the opportunity to display the performance improvements that have been made in our systems and software since the last cacheoff. It has been our goal for some time to help evolve Squid into a scalable, efficient, and reliable alternative to proprietary caching systems. These results show our progress so far, and now that the cacheoff is over, we have gone back to work on that goal.

Swell would also like to express our gratitude to the Squid core developers, as well as the ReiserFS development team. They've done wonderful things with Squid in the past few months, and we look forward to continuing the work with them.

Swell Technology offers a flexible, affordable, and open alternative to proprietary web caching appliances. We've once again shown a product with good performance and an affordable price. Overall throughput and price/performance have improved significantly since the last cacheoff. It is also worth noting that our downtime recovery problem from the previous event has been fixed, and we now exhibit the fastest recovery in our price range.

In conclusion, with a nearly ideal hit ratio, quick response times, and nearly instant downtime recovery, the CPX-1450 provides an excellent level of service for moderately sized ISP, business, and school networks.



$Id: report.sml,v 1.9 2000/10/11 16:40:29 wessels Exp $