These are histograms from a PolyMix-4 run with Squid. Size values are taken from the Content-Length headers, as logged in Squid's store.log. Each URL is represented once in this data; repeated requests have been filtered out. The histograms have variable bin sizes: the bin size is proportional to the logarithm of the size (x-value).
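As a rough illustration of variable bin sizes, the sketch below builds bins that are equally wide on a logarithmic size axis and counts object sizes into them. This is one common reading of log-scaled bins and is an assumption here, not necessarily the exact binning used for these histograms. The names log_bin_edges, histogram and bins_per_decade are hypothetical.

```python
import bisect
import math


def log_bin_edges(min_size, max_size, bins_per_decade=10):
    """Return bin edges spaced evenly in log10(size).

    Assumption: bins are equally wide on a log axis, so each bin
    is wider (in bytes) than the one before it.
    """
    lo = math.log10(min_size)
    hi = math.log10(max_size)
    n = max(1, math.ceil((hi - lo) * bins_per_decade))
    return [10 ** (lo + (hi - lo) * i / n) for i in range(n + 1)]


def histogram(sizes, edges):
    """Count sizes into the half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for s in sizes:
        idx = bisect.bisect_right(edges, s) - 1
        if idx == len(counts) and s == edges[-1]:
            idx -= 1  # put the exact upper bound into the last bin
        if 0 <= idx < len(counts):
            counts[idx] += 1  # sizes outside the range are ignored
    return counts


if __name__ == "__main__":
    edges = log_bin_edges(1, 1_000_000, bins_per_decade=2)
    print(histogram([120, 4500, 4700, 25_000, 310_000], edges))
```

With more bins per decade the histogram gets finer at every scale, which keeps both the small image objects and the large downloads visible on one plot.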

Type      # unique objs  % unique objs  PGL specification         Actual mean
--------  -------------  -------------  ------------------------  -----------
download          30778           0.39  logn(300KB, 300KB); 0.5%  312KB
other           1580604           19.8  logn(25KB, 10KB); 19.5%   25.4KB
html            1245638           15.6  exp(8.5KB); 15%           8.96KB
image           5134496           64.2  exp(4.5KB); 65%           4.61KB
all             7991516            100                            10.6KB
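As a quick consistency check on the table, the overall mean should equal the per-type means weighted by the number of unique objects. The sketch below recomputes it from the figures above; the variable names are illustrative only, and the per-type means are the rounded values from the table, so the result only agrees to about one decimal place (roughly 10.6KB).

```python
# (unique objects, actual mean size in KB), copied from the table above
rows = {
    "download": (30778, 312.0),
    "other": (1580604, 25.4),
    "html": (1245638, 8.96),
    "image": (5134496, 4.61),
}

total_objects = sum(count for count, _ in rows.values())

# Weighted mean: sum(count * mean) / sum(count)
overall_mean_kb = sum(count * mean for count, mean in rows.values()) / total_objects

# Share of unique objects per type, for comparison with the "% unique objs" column
shares = {name: 100.0 * count / total_objects for name, (count, _) in rows.items()}

if __name__ == "__main__":
    print(total_objects, round(overall_mean_kb, 1))
    for name, share in shares.items():
        print(f"{name}: {share:.2f}%")
```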

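In the PGL specification column, exp() is the exponential distribution and logn() is the log-normal distribution. The percentage after the semicolon is the specified share of that content type.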
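To make the two distributions concrete, here is a minimal sketch that draws sizes from each using Python's standard random module. It assumes that exp(m) is parameterized by its mean and logn(m, s) by its mean and standard deviation; that reading of the PGL arguments is an assumption, and the helper names are hypothetical. A log-normal with mean m and standard deviation s has underlying normal parameters sigma^2 = ln(1 + s^2/m^2) and mu = ln(m) - sigma^2/2.

```python
import math
import random


def lognormal_params(mean, std_dev):
    """Convert a log-normal's mean and std dev to (mu, sigma) of ln(X)."""
    sigma_sq = math.log(1.0 + (std_dev / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


def sample_exp(rng, mean):
    """Draw from an exponential distribution with the given mean."""
    return rng.expovariate(1.0 / mean)


def sample_logn(rng, mean, std_dev):
    """Draw from a log-normal distribution with the given mean and std dev."""
    mu, sigma = lognormal_params(mean, std_dev)
    return rng.lognormvariate(mu, sigma)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so runs are repeatable
    n = 100_000
    html = [sample_exp(rng, 8.5) for _ in range(n)]          # exp(8.5KB)
    other = [sample_logn(rng, 25.0, 10.0) for _ in range(n)]  # logn(25KB, 10KB)
    print(sum(html) / n, sum(other) / n)
```

The sample means should land close to the specified means, in the same way the "Actual mean" column tracks the specification in the table.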