1. Install FreeBSD
2. Download and Compile Necessary Software
2.1 Install Polygraph
2.2 Install netperf
2.3 Install gnuplot
3. Understanding PolyMix-4 IP Addressing
3.1 PolyMix-4 Routing
4. Test Your Network
4.1 Manually add some alias addresses
4.2 Make sure clients and servers can ping each other
4.3 Netperf tests
5. Set up the PolyMix-4 workload
5.1 Edit workload files
5.2 DNS in PolyMix-4
5.3 Run the routing configuration scripts
5.4 Configure Dummynet
6. Run a no-proxy test
7. Prepare your proxy for testing
7.1 Ping the proxy
7.2 Run msl_test
8. Test your Proxy
8.1 Copy all ".pg" files to every client and server
8.2 Start polysrv processes
8.3 Start polyclt processes
9. Run the downtime test
10. Analyze the results
10.1 Copy all log files to a single location
10.2 Label the logfiles
10.3 Generate a report
10.4 View the results
We recommend running Polygraph on FreeBSD. We use FreeBSD for all our official tests, including the Cache-Off. You can use another Unix operating system if you really need to.
Recommended minimum hardware:
To install FreeBSD, please see our Setting Up FreeBSD page.
Get Polygraph version 2.7.4 from http://www.web-polygraph.org/downloads/. Unpack and install it:
% cd /tmp
% wget http://www.web-polygraph.org/downloads/srcs/polygraph-2.7.4-src.tgz
% tar xzvf polygraph-2.7.4-src.tgz
% cd polygraph-2.7.4
% ./configure --prefix=/usr/local/polygraph-2.7.4
% make all
% sudo make install
Add /usr/local/polygraph-2.7.4/bin to your PATH, or perhaps make symbolic links in the /usr/local/bin directory.
Later versions of Polygraph may be used. If you are trying to reproduce other results, be sure to use the same version that others have used. Not all Polygraph versions are backward-compatible.
Get and install netperf from www.netperf.org or from our FTP site.
NOTE: netperf-2.1pl3 does not compile out-of-the-box on FreeBSD. Before running 'make', edit the makefile and add -D__FREEBSD__ to the CFLAGS definition on line 86:
CFLAGS = -O -D$(LOG_FILE) -DUSE_LOOPER -D__FREEBSD__
To use the automatic report generation programs, you'll need to install gnuplot with PNG support.
You can get gnuplot from ftp.gnuplot.vt.edu. You'll also need libpng-1.0.11.tar.gz and zlib-1.1.3.tar.gz, which you can find in the same FTP directory.
Installing gnuplot looks like this:
% ftp ftp.gnuplot.vt.edu
ftp> cd pub/gnuplot
ftp> get gnuplot-3.7.1.tar.gz
ftp> get libpng-1.0.11.tar.gz
ftp> get zlib-1.1.3.tar.gz
ftp> bye
% tar xzf zlib-1.1.3.tar.gz
% cd zlib-1.1.3
% ./configure && make
% sudo make install
% cd ..
% tar xzf libpng-1.0.11.tar.gz
% cd libpng-1.0.11
% ln -s scripts/makefile.std Makefile
% make
% sudo make install
% cd ..
% tar xzf gnuplot-3.7.1.tar.gz
% cd gnuplot-3.7.1
% ./configure --with-png
% make
% sudo make install
For PolyMix-3, we had two addressing schemes: flat and routed. Most tests used the flat configuration, in which clients and servers are on the same subnet. Although the flat network configuration is simpler, it has some undesirable characteristics. Foremost, each Polygraph PC (and the proxy) ends up with an excessively large ARP table -- one ARP entry for each IP alias address. This is an unrealistic situation. Real web clients and servers do not have ARP tables with thousands of entries.
For PolyMix-4 we bind the client and server alias addresses to loopback interfaces and use the real network interfaces as routers. This keeps ARP tables smaller because each machine needs just a single MAC address for each other machine. However, it also complicates the whole setup because we need to configure a full mesh of routes on each PC.
The standard PolyMix-4 addressing scheme is shown in the figure below:
In the above figure, X represents a "bench-id." It is the only part of the IP addresses that you should change. Cache-off participants are assigned bench-ids on the first day of testing.
Each Polygraph client and server uses a group of addresses that fit into a "/22" subnet bound to the loopback (lo0) interface. The fxp0 interfaces (on the 172.16.X.0/24 subnet) act as routers. Thus, each client, server, and proxy needs a routing table so that they can talk to the Polygraph agents, which are bound to the 10.X.0.0 addresses. We'll talk more about routing in a while.
This addressing scheme allows for up to 31 client/server pairs. If each pair generates 500 TPS, the total maximum throughput for a PolyMix-4 test is 15,500 TPS.
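As a rough planning aid, the number of client/server pairs needed for a target peak rate can be estimated with a few lines of shell. This is only a sketch: the function name pairs_needed is made up for illustration, and the 500 requests/sec per pair is the figure assumed in the text above, not a measured limit for your hardware.

```shell
#!/bin/sh
# Sketch: estimate how many client/server pairs a target peak rate needs.
# PER_PAIR_TPS is an assumption taken from the text; measure your own PCs.
PER_PAIR_TPS=500
MAX_PAIRS=31          # limit imposed by the PolyMix-4 addressing scheme

pairs_needed() {
    rate=$1
    # Integer ceiling of rate / PER_PAIR_TPS.
    pairs=$(( (rate + PER_PAIR_TPS - 1) / PER_PAIR_TPS ))
    if [ "$pairs" -gt "$MAX_PAIRS" ]; then
        echo "rate $rate/sec needs $pairs pairs; only $MAX_PAIRS are available" >&2
        return 1
    fi
    echo "$pairs"
}

pairs_needed 1000     # prints 2
```

A peak rate of 1000/sec therefore maps to two pairs under this assumption.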
NOTE: the figure shows multiple proxies, but multiple proxies are allowed only with interception caching configurations. In that case, the Ethernet switch must have L4/7 features and be configured to intercept HTTP traffic and divert it to one or more caching proxies.
We expect that some caching proxies may not support complicated routing tables (as are required in this scheme). In this case, the Ethernet switch may be configured as a router, and the caching proxy may use the switch's IP address as a default route. The switch must then be given the routes so that it knows how to reach the different 10.X.0.0 subnets.
We also expect that some cache-off entries may not support complicated routing in the proxy, AND do not want to use a routing Ethernet switch. In this case the rules allow a router to be used without affecting the reported price. This configuration is shown in the following figure:
The following shell script creates the routes necessary for a PolyMix-4 test. You'll need to replace X with your bench-id before running the script. Of course, the script must run as root to modify the routing tables.
You must run the script on each Polygraph PC (including monitoring PC if you have one) and the caching proxy. If your caching proxy does not support complicated routes, and you're using the router option, then the router must be configured with similar routes.
#!/bin/sh
X=13
p=1
while test $p -lt 32; do
    j=`expr \( $p - 1 \) \* 4`
    k=`expr $j + 128`
    c=`expr $p + 60`
    s=`expr $p + 190`
    route add -net 10.$X.$j.0/22 172.16.$X.$c
    route add -net 10.$X.$k.0/22 172.16.$X.$s
    p=`expr $p + 1`
done
# in case the proxy is behind a router
route add -net 10.$X.124.0/22 172.16.$X.30
If you'll be running a lot of tests, then you probably want to make sure that script runs automatically each time a system is booted.
If your bench has fewer than 31 client-server pairs, the above script will create some routes that will not be used during the test. That is not a problem.
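Before changing a routing table, it can help to preview exactly which routes will be added. The following dry-run sketch prints the per-pair route commands instead of executing them. The function name print_routes and the optional pair-count argument are illustrative additions, not part of Polygraph, and the extra route for a proxy behind a router is left out.

```shell
#!/bin/sh
# Dry-run sketch: print the per-pair PolyMix-4 routes without applying them.
# print_routes is a hypothetical helper, not a Polygraph tool.
print_routes() {
    X=$1             # bench-id
    pairs=${2:-31}   # number of client/server pairs (31 is the maximum)
    p=1
    while [ "$p" -le "$pairs" ]; do
        j=$(( (p - 1) * 4 ))   # third octet of the client /22
        k=$(( j + 128 ))       # third octet of the server /22
        echo "route add -net 10.$X.$j.0/22 172.16.$X.$(( p + 60 ))"
        echo "route add -net 10.$X.$k.0/22 172.16.$X.$(( p + 190 ))"
        p=$(( p + 1 ))
    done
}

# Preview the routes for bench-id 13 with two pairs.
print_routes 13 2
```

Once the printed commands look right, they can be run as root, for example by piping the output to sh.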
As of Polygraph version 2.6, the polyclt and polysrv processes automatically create IP alias addresses. However, in order to test your network setup, you'll need to manually add some aliases. You can just use the ".1" address at the beginning of each /22 subnet group. In these examples, the prompt shows the hostname where you should run each command:
clt01# ifconfig lo0 alias 10.X.0.1 netmask 255.255.252.0
clt02# ifconfig lo0 alias 10.X.4.1 netmask 255.255.252.0
clt03# ifconfig lo0 alias 10.X.8.1 netmask 255.255.252.0
...
srv01# ifconfig lo0 alias 10.X.128.1 netmask 255.255.252.0
srv02# ifconfig lo0 alias 10.X.132.1 netmask 255.255.252.0
srv03# ifconfig lo0 alias 10.X.136.1 netmask 255.255.252.0
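The ".1" test address for any machine follows the same arithmetic as the routing script: the third octet advances by 4 per machine, starting at 0 for clients and 128 for servers. A small sketch of that calculation, where agent_alias is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: compute the ".1" test alias for the n-th client or server.
# agent_alias is a hypothetical helper, not a Polygraph tool.
agent_alias() {
    role=$1   # "clt" or "srv"
    X=$2      # bench-id
    n=$3      # machine number, 1..31
    case $role in
        clt) base=0 ;;
        srv) base=128 ;;
        *)   echo "unknown role: $role" >&2; return 1 ;;
    esac
    echo "10.$X.$(( base + (n - 1) * 4 )).1"
}

agent_alias clt 13 3   # prints 10.13.8.1
agent_alias srv 13 2   # prints 10.13.132.1
```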
You can use ping to test routing and connectivity. Be sure to use the -S option to set the source IP address to one of the loopback alias addresses. For example, to ping the first server from the first client:
% ping -S 10.X.0.1 10.X.128.1
You should take the time to ping more than just one server:
% ping -S 10.X.0.1 10.X.132.1
% ping -S 10.X.0.1 10.X.136.1
...
And ping from other clients as well:
% ping -S 10.X.4.1 10.X.128.1
% ping -S 10.X.4.1 10.X.132.1
% ping -S 10.X.4.1 10.X.136.1
...
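With more than a few pairs, typing every combination by hand gets tedious. The sketch below only prints the full client-to-server list of ping commands, labeled with the client where each should run. The helper name print_ping_mesh and the "-c 3" packet count are additions for illustration; the cltNN labels follow the hostnames used in the examples above.

```shell
#!/bin/sh
# Sketch: list one ping command per client/server combination.
# Nothing is executed; run each printed command on the named client.
print_ping_mesh() {
    X=$1        # bench-id
    pairs=$2    # number of client/server pairs
    c=1
    while [ "$c" -le "$pairs" ]; do
        src="10.$X.$(( (c - 1) * 4 )).1"
        s=1
        while [ "$s" -le "$pairs" ]; do
            dst="10.$X.$(( 128 + (s - 1) * 4 )).1"
            printf 'clt%02d: ping -c 3 -S %s %s\n' "$c" "$src" "$dst"
            s=$(( s + 1 ))
        done
        c=$(( c + 1 ))
    done
}

print_ping_mesh 13 2
```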
Start the 'netserver' daemon on every polygraph machine:
Then run the netperf client on each polygraph machine. For example:
srv01# netperf -l 30 -H 10.X.0.1 -t TCP_STREAM
clt01# netperf -l 30 -H 10.X.128.1 -t TCP_STREAM
You should make sure that a client-server pair runs netperf in both directions at the same time. This guarantees that your network is operating well in full-duplex mode. If everything is good, netperf reports a throughput of about 80 MBit/s.
For a unidirectional netperf test, you should get about 92-95 MBit/s.
For longer tests, increase the -l <length> value.
When editing or reading PolyMix-4 workload files, note that all PolyMix-4 input parameters are set as totals perceived by the cache(s) under test. If the device under test comprises several caching units, treat it as a single "big" cache for Polygraph configuration purposes.
You should use the exact same configuration files for all polyclt and polysrv processes. No manual adjustment for the number of polyclt processes is needed; all adjustments are made automatically in the polymix-4-guts.pg file, which is included from the polymix-4.pg file.
When in doubt, or puzzled by contradictory or insufficient documentation, do not try to guess the right setting; double-check with us instead.
Copy polymix-4.pg from the /usr/local/polygraph-2.7.4/workloads/ directory into a new working directory.
Do NOT edit any of the files in the workloads/include directory.
All of our examples here use X to represent the bench-id variable. You'll need to choose a value for X in your own testing. At the Cache-off, bench-id values will be between 100 and 199.
Edit your copy of polymix-4.pg and define the following variables:
TheBench.client_side.addr_space = [ 'lo0::10.X.0-123.1-250/22' ];
TheBench.server_side.addr_space = [ 'lo0::10.X.128-251.1-250:80/22' ];
TheBench.client_side.hosts = [ '172.16.X.61-62' ];
TheBench.server_side.hosts = [ '172.16.X.191-192' ];
TheBench.peak_req_rate = 1000/sec;
Also note that this peak rate value is used to determine which IP addresses to use for robot and server agents.
rate FillRate = 75% * TheBench.peak_req_rate;
size ProxyCacheSize = 50GB + 4GB;
or just
size ProxyCacheSize = 54GB;
Resolver.servers = [ '172.16.X.100' ];
Given the above settings, a complete PolyMix-4 configuration looks like this:
#include "benches.pg"

Bench TheBench = benchPolyMix4;

TheBench.client_side.addr_space = [ 'lo0::10.X.0-123.1-250/22' ];
TheBench.server_side.addr_space = [ 'lo0::10.X.128-251.1-250:80/22' ];
TheBench.client_side.hosts = [ '172.16.X.61-62' ];
TheBench.server_side.hosts = [ '172.16.X.191-192' ];
TheBench.peak_req_rate = 1000/sec;

rate FillRate = 75% * TheBench.peak_req_rate;
size ProxyCacheSize = 50GB + 4GB;

Resolver.servers = [ '172.16.X.100' ];

#include "polymix-4-guts.pg"
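Because X appears only as a whole octet in these settings, the bench-id can be filled in mechanically. The following is a minimal sketch, assuming you keep a template copy of the file with the literal X in place. The helper name fill_bench_id and the bench-id 113 are made up for illustration.

```shell
#!/bin/sh
# Sketch: replace the literal "X" octet with a numeric bench-id.
# Reads a workload template on stdin and writes the result to stdout.
fill_bench_id() {
    case $1 in
        ''|*[!0-9]*) echo "bench-id must be a number" >&2; return 1 ;;
    esac
    # Only ".X." is rewritten, so identifiers such as TheBench are untouched.
    sed "s/\.X\./.$1./g"
}

fill_bench_id 113 <<'EOF'
TheBench.client_side.hosts = [ '172.16.X.61-62' ];
Resolver.servers = [ '172.16.X.100' ];
EOF
```

The same function can be applied to a whole file by redirecting the template into it and the output into your working copy of polymix-4.pg.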
PolyMix-4 uses hostnames, rather than IP addresses, in URLs that robots request. To make this work, you need to create DNS zone files and place them on a DNS server on the test network. PolyMix-4 has a fixed name-to-address mapping scheme, and we'll assume that you'll run the BIND DNS server on the monitoring PC.
Before proceeding, you may want to read about Using DNS in the Polygraph User Manual.
Use the dns_cfg command from the Polygraph distribution:
% dns_cfg --config polymix-4.pg --cfg_dirs /usr/local/polygraph-2.7.4/workloads/include/
This should output three files (two to disk, one to stdout). The stdout output is a basic named.conf that you can use. Simply copy it to /etc/namedb/named.conf. You'll also have two zone files in the current directory: bench.tst and bench.tst.rev. Copy these to /etc/namedb as well.
To start named, simply run this command as root:
To check that named is running correctly, look at the /var/log/messages file (or wherever your syslog messages go). You should see something like this:
Aug 2 15:42:38 mon40 named: starting (/etc/namedb/named.conf). named 8.2.3-REL Fri Mar 23 15:56:02 MST 2001 wessels@mr-garrison:/usr/obj/usr/src/usr.sbin/named
Aug 2 15:42:38 mon40 named: limit files set to fdlimit (1024)
Aug 2 15:42:39 mon40 named: Ready to answer queries.
You can also check that named is working by making queries with dig:
% dig @172.16.X.100 w1001.h1128o1Xs1010.bench.tst

; <<>> DiG 8.3 <<>> @172.16.XXX.100 w1001.h1128o1XXXs1010.bench.tst
; (1 server found)
;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUERY SECTION:
;; w1001.h1128o1XXXs1010.bench.tst, type = A, class = IN

;; ANSWER SECTION:
w1001.h1128o1XXXs1010.bench.tst. 1H IN A 10.XXX.128.1

;; AUTHORITY SECTION:
bench.tst. 1H IN NS ns.bench.tst.

;; ADDITIONAL SECTION:
ns.bench.tst. 1H IN A 172.16.XXX.100

;; Total query time: 2 msec
;; FROM: monXXX to SERVER: 172.16.XXX.100
;; WHEN: Fri Aug 3 12:52:57 2001
;; MSG SIZE sent: 49 rcvd: 98
If you haven't already executed the routing script (given previously), do so now. You may need to run the same or similar script on your caching proxy, or configure similar routes on your switch/router.
On all polygraph clients, run:
# ipfw -f flush
On all polygraph servers, run:
# ipfw -f flush
# ipfw pipe 1 config delay 40ms plr 0.0005
# ipfw pipe 2 config delay 40ms plr 0.0005
# ipfw add pipe 1 ip from any to 10.X.0.0/16 in
# ipfw add pipe 2 ip from 10.X.0.0/16 to any out
Check your work! Ping a client from a server and you should see round trip times of about 80 msec.
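The expected figure comes from the two pipes on each server: 40 ms on the way in plus 40 ms on the way out gives about 80 ms per round trip. The sketch below pulls the average RTT out of ping's summary line and compares it with that target. The helper names avg_rtt and rtt_near and the 5 ms tolerance are assumptions for illustration. The sample summary line imitates the FreeBSD ping format.

```shell
#!/bin/sh
# Sketch: check that a measured round-trip time is close to the expected
# 80 ms (40 ms inbound delay + 40 ms outbound delay on the server).

# Read ping output on stdin and print the average RTT in milliseconds.
avg_rtt() {
    awk -F/ '/min\/avg\/max/ { print $5 }'
}

# rtt_near <measured_ms> <expected_ms> <tolerance_ms>
# Succeeds when the measured value is within the tolerance.
rtt_near() {
    awk -v m="$1" -v e="$2" -v t="$3" \
        'BEGIN { d = m - e; if (d < 0) d = -d; exit !(d <= t) }'
}

# Demonstration with a sample summary line instead of a live ping.
sample='round-trip min/avg/max/stddev = 79.9/80.2/80.6/0.2 ms'
rtt=$(echo "$sample" | avg_rtt)
if rtt_near "$rtt" 80 5; then
    echo "average RTT $rtt ms: dummynet delay looks right"
fi
```

On a live bench, feed avg_rtt from a real run, for example "ping -c 10 -S 10.X.128.1 10.X.0.1" on a server.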
Before starting these tests you should reboot all clients and servers to give them a "clean" configuration.
The polygraph clients and servers should be able to sustain your peak request rate without a proxy cache involved. The proxy cache must not be connected to the network during this test.
On each polygraph client you would run:
% polyclt --config polymix-4.pg --verb_lvl 10 --ports 3000:30000
Similarly on the servers:
% polysrv --config polymix-4.pg --verb_lvl 10
You may want or need additional polyclt/polysrv options. For example, the location of the "workloads/include" directory (or its copy) needs to be specified using the "cfg_dirs" option; logging may be enabled using the "log" option; etc.
We recommend running the no-proxy test for 30-60 minutes at peak load. To create your custom no-proxy workload, follow these steps:
% cd polygraph-2.7.4/workloads/
% cat polymix-4.pg include/polymix-4-guts.pg > /tmp/my.pg
The difference in response time among phases should be marginal in a no-proxy test. Response times should be about 2.8 seconds. If the reply rate and response time look good for at least 30 minutes of peak load, you can stop the no-proxy test. If response time looks bad, re-examine your network setup or workload config.
Make sure your proxy has an address on the subnet and that all clients and servers can ping it. Don't forget about the -S option:
% ping -S 10.X.0.1 172.16.X.32
From a polygraph client or server machine, run the msl_test program against your proxy. This program uses some low-level IP packets to determine the MSL setting for your TCP stack.
Sample usage is:
clt01# ./msl_test -i fxp0 -s 10.X.0.1 -d 172.16.X.32 -p 8080
The final argument (port number) must be the port where your proxy accepts requests. It cannot be an arbitrary port.
During this test, you will not be able to send any other traffic from the source machine to the proxy.
When finished, the program reports the TIME_WAIT value that it found. This value is twice the MSL value. Cache-off rules require the TIME_WAIT value to be 60 seconds. If the msl_test program reports a number smaller than 60 seconds, you may be in violation of the rules. Violators will be disqualified.
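The relationship between the two values is simple enough to check by hand, but a small helper makes the rule explicit. This sketch takes the TIME_WAIT value in whole seconds as an argument that you copy from the program's report. It does not parse msl_test output, whose exact format is not assumed here, and check_time_wait is a hypothetical name.

```shell
#!/bin/sh
# Sketch: derive the MSL from a reported TIME_WAIT value and apply the
# 60-second rule. The argument is whole seconds, entered by hand.
check_time_wait() {
    tw=$1
    msl=$(( tw / 2 ))   # TIME_WAIT is twice the MSL (integer division)
    if [ "$tw" -lt 60 ]; then
        echo "TIME_WAIT ${tw}s (MSL ${msl}s): below the required 60s"
        return 1
    fi
    echo "TIME_WAIT ${tw}s (MSL ${msl}s): ok"
}

check_time_wait 60    # prints: TIME_WAIT 60s (MSL 30s): ok
```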
For more information, read msl_test.html.
% polysrv --config polymix-4.pg --verb_lvl 10 --log srv.log
% polyclt --config polymix-4.pg --verb_lvl 10 --log clt.log
NOTE: you may want to use additional or different command line parameters. For example you may want to save the polygraph stdout/stderr to a file for later reference.
If you need to start polygraph on many machines, you may want to use the "bb.pl" script from the polygraph source distribution.
We usually monitor experiments using the 'polymon' program. To use 'polymon', you must pass the --notify option to polyclt and polysrv. You must also run the udp2tcpd daemon on the host that receives the notification messages.
Set up a single polygraph client machine and a single polygraph server machine to use the downtime-2.pg workload file (found in the workloads directory of the source distribution).
During the run, turn off the power to all equipment in the "participant zone", which includes your proxy and your networking gear. Start a stopwatch or timer.
Wait five seconds.
Return power to proxy and networking gear.
Watch the polyclt console output. Note when the first cache miss is successfully received by the client.
Also note when the first cache hit is successfully received.
Polygraph includes a set of scripts, called ReportGen, that you can use to display the results.
Use label_results to label logfiles with a single name.
% cd /usr/local/polygraph-2.7.4/ReportGen
% ./label_results mytest1 /where/ever/clt.*.log
Use make_report to make graphs and an HTML page describing the results:
% ./make_report mytest1
Use netscape or another browser to view the report. You may need to copy the files to an HTTP server.