Re: Performance benchmarking BOF

From: Alex Rousskov (rousskov@ircache.net)
Date: Tue May 30 2000 - 11:12:17 MDT


Dear all,

        A Web performance benchmarking BOF was held during the 5th Caching
Workshop in Lisbon. Here is a brief summary of the discussions. If you
want to contribute to the discussion, please use the following forums:

        Content Distribution -- cdd@measurement-factory.com
        Streaming Media -- stream@measurement-factory.com
        HTTP/1.1 compliance -- wrec@cs.utk.edu
        Other -- polygraph@ircache.net

Cross-postings are discouraged; list maintainers will forward relevant
messages...

Five topics were on the BOF agenda as far as future benchmarks are
concerned:

        1. Traffic redirection bake-offs (``switching'' folks)
        2. Content distribution tests (``CDN'' folks)
        3. Content-dependent tests (``keyword filtering'' folks)
        4. Streaming media tests (``caching+'' folks)
        5. SSL acceleration tests (``reverse proxying'' folks)

Items (2) and (4) were discussed in detail while the others were just
highlighted. We summarize the discussion of (2) and (4) below.

There was also a discussion about HTTP/1.1 compliance tests. Since compliance
tests are orthogonal to performance benchmarking, we suggest continuing that
discussion on the WREC mailing list:
        http://www.wrec.org/

Content distribution tests
--------------------------

Content distribution was a hot topic during the workshop. The BOF was no
exception. There is a lot of interest in benchmarking and evaluating
content distribution and delivery (CDD) schemes (aka Content
Distribution Networks, CDNs). There is a general dissatisfaction with
currently available "benchmarks" for CDNs.

Solom Heddaya from InfoLibria, John Dilley from Akamai, and James Aviani
from Cisco shared their thoughts about the CDN benchmark. Their
presentations were followed by a free-flow discussion.

It is clear that a CDD benchmark must test the service as a whole, not
just an isolated content distribution node. This presents several
challenges that we must overcome to create a high quality test.

The first choice we have to make is about the test environment. There are at
least three alternatives.

  i. Simulated servers, simulated Internet

        It is possible to create a test harness with tens of networks
        hosting tens of thousands of simulated servers and clients, with
        various connectivity properties. However, it is not clear
        whether such a harness can be a good-enough model of the Web (for
        CDN testing purposes). The minimum number of distribution nodes
        (submerged into this artificial Web) worth testing is also
        unknown: real CDNs have thousands of nodes while it is probably
        not feasible to test more than tens of nodes in a lab.

        Lab tests are nice because they provide a controlled environment
        where results can be reproduced and verified in a fair fashion.

  ii. Simulated servers, real Internet

        A few simulated "content providers" (servers) can be deployed on
        the Internet and made available to CDD vendors. Each
        participating vendor would handle at least one simulated server
        as if it were a real content provider. All servers will be
        identical except for domain names and IP addresses (and perhaps
        minor custom content modifications required by the distribution
        schemes).

        Simulated servers give more flexibility and control to the
        tester. However, the "real Internet" component will introduce
        some randomness into the test results. It is not clear whether
        that randomness is still better than a 99% controlled but
        inaccurate model of the Internet (option [i] above).

        Providing a sufficiently large network of distributed clients
        (to request data from the simulated servers) will also be a
        challenge.

  iii. Real servers, real Internet

        This is the most "realistic" environment, similar to what is
        being used by Keynote. An important drawback of this approach is
        that stress testing of real servers is not possible. Busy
        content providers will most likely reject the idea of launching
        a distributed DoS attack against their sites. On the other
        hand, the same providers are very interested in CDN vendors
        demonstrating the robustness of content distribution schemes.

Regardless of the environment, we also need to answer the following
questions:

        a) What are the metrics for a CDN benchmark? Or, equivalently,
        what are the primary objectives of the service under test?
        Are we interested just in "average" performance metrics (mean
        throughput, response time, errors, etc.) or in quality of service
        guarantees (maximums, percentiles, and other "limits" of the
        performance metrics)?

        b) What is a CDN? Is RR DNS + rsync a CDN? Are clients
        always browsers? Can origin servers be reverse proxied?
        Akamai, SandPiper, Mirror Image, and others use different
        distribution schemes. Is there a good one-size-fits-all model?
        Do we need a taxonomy for CDNs?
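
To make the distinction in (a) concrete, here is a minimal sketch in
Python of how an "average" metric and "limit" metrics can tell different
stories about the same measurements. The function name, the nearest-rank
percentile definition, and the sample response times are all assumptions
made for illustration; they do not come from any agreed CDN benchmark.

```python
import statistics


def summarize(response_times_ms):
    """Return mean and tail statistics for a list of response times."""
    ordered = sorted(response_times_ms)

    def percentile(p):
        # Nearest-rank percentile with integer math: the smallest sample
        # such that at least p percent of samples are at or below it.
        rank = max(1, (p * len(ordered) + 99) // 100)
        return ordered[rank - 1]

    return {
        "mean": statistics.mean(ordered),
        "median": percentile(50),
        "p95": percentile(95),
        "max": ordered[-1],
    }


# Hypothetical sample: mostly fast responses plus one very slow one.
samples = [80] * 18 + [120, 4000]
summary = summarize(samples)
print(summary)
```

In this made-up sample a single slow response pulls the mean far above
the median, which is why a benchmark would probably need to report both
kinds of numbers.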

It is also paramount to provide all participants with a sufficient level
of control over the testing methodology. The development and testing
process should be open and should be coordinated by an independent
authority.

Several vendors suggested that a meeting to bootstrap the development
process would be useful. Let's discuss the meeting date and agenda on
our CDD mailing list:
        cdd@measurement-factory.com

Streaming media tests
---------------------

Vendors and customers alike need a good streaming media benchmark. The
development of such a tool is long overdue. However, the proprietary nature
of the _popular_ streaming protocols (Real Media and Windows Media)
makes development of a public, open-source benchmark extremely
cumbersome. While client and server development kits can be licensed
from, say, Real, those kits are believed to be of poor performance and
probably cannot be integrated into a free, open-source benchmark.
Software portability may also be a big issue. Streaming vendors may not
come forward and open their protocols in fear of lost revenues.

Carlos Maltzahn from Network Appliance opened the discussion with an
excellent summary of the issues and collected opinion polls on the
popularity/importance of streaming formats.

We see three possible options to start the development:

  i. License the technology and create a closed-source benchmark.
     It is possible that this benchmark will require a lot of
     CPU power to create "interesting" loads. It is likely that
     test control and performance measurements will be relatively
     poor due to SDK limitations.

  ii. Create a benchmark based on open, albeit rarely used, QuickTime.
     While feasible to implement, such a benchmark may be a burden for
     some vendors as they would have to spend resources on QuickTime
     optimizations that their customers do not ask for.
     
  iii. Create an RTP/RTSP-based benchmark that does not necessarily
     support any existing streaming format but is close enough to
     popular formats.
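
As a rough illustration of option (iii), a load generator could emit
generic RTP packets without tying itself to any vendor format. The
Python sketch below packs the 12-byte fixed RTP header defined in
RFC 3550. The function name, the dynamic payload type 96, the SSRC
value, and the dummy payload are assumptions made for illustration only;
nothing here reflects an agreed benchmark design.

```python
import struct

RTP_VERSION = 2


def build_rtp_packet(seq, timestamp, ssrc, payload,
                     payload_type=96, marker=False):
    """Build an RTP packet with the fixed header only (no CSRCs, no extension)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",                 # network byte order, 12 bytes total
        byte0,
        byte1,
        seq & 0xFFFF,             # 16-bit sequence number
        timestamp & 0xFFFFFFFF,   # 32-bit media timestamp
        ssrc & 0xFFFFFFFF,        # 32-bit synchronization source id
    )
    return header + payload


# Hypothetical packet carrying 100 bytes of filler "media" data.
packet = build_rtp_packet(seq=1, timestamp=90000, ssrc=0x1234ABCD,
                          payload=b"\x00" * 100)
print(len(packet))
```

A real benchmark would also need RTSP session control and realistic
packet pacing, which this sketch leaves out.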

Should we have a meeting to move forward with this benchmark? Further
discussions about streaming should be directed to
        stream@measurement-factory.com

Polyteam involvement
--------------------

We intend to coordinate the development and deployment of the CDN and
streaming benchmarks as well as HTTP compliance tests. Our desire is to
keep these developments in the public domain and seek vendor/user input
whenever possible.


