Squeezer - a tool for profiling the Squid web cache server

by
Maciej Koziński

<maciej_kozinski@yahoo.com>






Description

Squeezer is a tool that gathers statistics from the Squid web cache server to help you fine-tune it. It is intended mainly for web cache operators, but it can be obtained and used for any other purpose. The software is written in Perl and, like Squid and Perl themselves, it is free. But you use it at your own risk! Check this page for information about its development and features.
You can download:


Features


What's new?
0.3

0.22
0.2
0.11

Usage

Output and its interpretation

The output consists of several profiles, each containing several entries (categories). Each category is characterized by several values, which are described below:

General information tables:

General information
Start date: Wed Jun 16 00:02:26 1999
End date : Wed Jun 16 11:57:26 1999 
Total time: 42900 s 
HTTP requests per hour : 860
Transfer per hour : 5572 KB


All values reported by Squeezer refer to the period between the start and end dates shown above!
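
As a sanity check on the sample output, the two hourly rates appear to follow from the period length and the totals in the "Total traffic" row of the next table (10257 requests, 66401 kB). The sketch below is an illustration in Python, not Squeezer's own Perl code; the names `period_seconds` and `hourly_rate`, and the truncation to a whole number, are assumptions chosen to match the sample.

```python
from datetime import datetime

# Timestamp layout used in the sample output (English day and month names).
DATE_FORMAT = "%a %b %d %H:%M:%S %Y"

def period_seconds(start: str, end: str) -> int:
    """Length of the reporting period in seconds."""
    delta = datetime.strptime(end, DATE_FORMAT) - datetime.strptime(start, DATE_FORMAT)
    return int(delta.total_seconds())

def hourly_rate(total: float, seconds: int) -> int:
    """Total scaled to one hour; truncated (an assumption that matches the sample)."""
    return int(total * 3600 / seconds)

seconds = period_seconds("Wed Jun 16 00:02:26 1999", "Wed Jun 16 11:57:26 1999")
print(seconds)                      # 42900
print(hourly_rate(10257, seconds))  # 860 requests per hour
print(hourly_rate(66401, seconds))  # 5572 kB per hour
```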

The general traffic and routing characteristics

                  No of req   Xfer (kB)   Xfer speed (kB/s)   Xfer %   Times to direct
Total traffic         10257       66401               0.691   100.00              1.16
Direct fetches         6246       50355               0.594    75.83
This server            2471       11092               1.236    16.71              2.08
Other servers          1543        4859               2.725     7.32              4.59
Cache hierarchy        4014       15952               1.483    24.02              2.50
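
The "Times to direct" column is not defined on this page, but the sample figures are consistent with it being each row's transfer speed divided by the speed of direct fetches. A minimal Python sketch under that assumption (the helper name `times_to_direct` is made up for illustration and is not part of Squeezer):

```python
# Transfer speeds in kB/s, copied from the sample table.
SPEEDS = {
    "Total traffic": 0.691,
    "Direct fetches": 0.594,
    "This server": 1.236,
    "Other servers": 2.725,
    "Cache hierarchy": 1.483,
}

def times_to_direct(speed: float, direct_speed: float) -> float:
    """Speed relative to direct fetches, rounded to two decimals (assumed definition)."""
    return round(speed / direct_speed, 2)

direct = SPEEDS["Direct fetches"]
for name, speed in SPEEDS.items():
    print(f"{name:<16} {times_to_direct(speed, direct):.2f}")
```

Under this reading, a value above 1 means that the source delivered data faster than direct fetches did during the measured period.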


Sibling efficiency:

Server                    Current relation and options    HTTP   ICP   Xfer (KB)   Xfer %   No of req   Xfer speed (KB/s)   Times to direct
w3cache.man.lodz.pl       parent no-digest                8080  3130        2379        3         888                3.17              5.34
sunsite.icm.edu.pl        parent no-digest                8081  3131        2469        3         570                2.02              3.39
w3cache.pk.edu.pl         sibling no-query                8080     0         103        0          81                0.33              0.55
w3cache.man.szczecin.pl   sibling no-digest               8080  3130           0        0           4                0.00              0.00

This table shows the characteristics of fetches from the individual cooperating web cache servers. It will help you set up sibling relationships properly. The servers are sorted by the amount of transfer, so the most effective siblings/parents are higher in the table. Remember about some rules

By MIME type:
MIME type                       No of requests   Xfer (kB)   Requests %   Xfer %   Xfer speed (kB/s)
application/zip                             16       14821         0.16    22.32                2.08
image/gif                                 3771       14275        36.77    21.50                1.01
text/html                                 2108       11243        20.55    16.93                0.61
image/jpeg                                1489       11142        14.52    16.78                0.84
www/unknown                               2409        7830        23.49    11.79                0.22
application/octet-stream                    67        2264         0.65     3.41                5.93
text/plain                                 272        1290         2.65     1.94                0.59
application/x-zip-compressed                 1        1216         0.01     1.83                0.92
audio/mpeg                                   4         824         0.04     1.24                2.15
application/cache-digest                    11         475         0.11     0.72               35.74
application/x-compress                       1         214         0.01     0.32                0.24
audio/x-wav                                  3         170         0.03     0.26                1.83
x-world/x-vrml                               1         115         0.01     0.17                5.51
image/tiff                                   1         113         0.01     0.17                0.10
application/msword                           7         101         0.07     0.15              157.76
audio/basic                                  3          95         0.03     0.14                1.82
application/x-javascript                    51          85         0.50     0.13                0.40
application/pdf                              4          72         0.04     0.11              198.41
text/css                                    31          19         0.30     0.03                0.92
image/jpg                                    1          18         0.01     0.03                8.81
application/x-cdf                            1           4         0.01     0.01                0.48
application/vnd.rn-realplayer                2           3         0.02     0.00                0.62
application/x-httpd-cgi                      6           1         0.06     0.00                0.13

This table shows detailed statistics on how efficiently objects of different types are fetched. It is useful when writing different refresh_pattern rules in your squid.conf. Separate rules make sense because object types differ: some are rather large and rarely changed, like pictures and movies, while others are smaller and changed often, like HTML documents. As you can imagine, relaxing the refresh rules for the former will raise your byte hit ratio and speed of service without a large risk of staleness, while the latter still need tight refresh rules. From this table you can see the impact of the different object types on your web cache server and its efficiency. In the future this table is going to be replaced by a table of refresh pattern hit characteristics.
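
One way to act on this table is to look at the average object size per MIME type, since types with large objects weigh most on the byte hit ratio. The Python sketch below uses a few rows from the sample table; the 100 kB threshold and the names `average_size_kb` and `large_object_types` are arbitrary assumptions for illustration, and size alone says nothing about how often objects of a given type actually change.

```python
# (MIME type, number of requests, transfer in kB) from the sample table.
ROWS = [
    ("application/zip", 16, 14821),
    ("image/gif", 3771, 14275),
    ("text/html", 2108, 11243),
    ("image/jpeg", 1489, 11142),
    ("audio/mpeg", 4, 824),
]

def average_size_kb(requests: int, xfer_kb: int) -> float:
    """Mean transfer per request in kB."""
    return xfer_kb / requests

def large_object_types(rows, threshold_kb=100.0):
    """MIME types whose average object exceeds the (assumed) threshold."""
    return [mime for mime, req, xfer in rows
            if average_size_kb(req, xfer) > threshold_kb]

for mime, req, xfer in ROWS:
    print(f"{mime:<18} {average_size_kb(req, xfer):8.1f} kB/request")
print(large_object_types(ROWS))
```

The types this returns are only candidates for a relaxed refresh_pattern; whether relaxing them is safe still depends on how often such objects change on the origin servers.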

By cache result codes:

Hit type            Req by hit    Req %   Xfer (kB)   Xfer %   Xfer speed (kB/s)
NONE                         3     0.03           3     0.01               21.65
TCP_REFRESH_MISS           276     2.69       14786    22.27                2.12
TCP_REFRESH_HIT            114     1.11         526     0.79                0.52
TCP_MEM_HIT                893     8.71        2161     3.26               46.91
TCP_MISS                  7608    74.17       40303    60.70                0.47
TCP_IMS_HIT                719     7.01         359     0.54               11.13
TCP_HIT                    605     5.90        8235    12.40                3.45
TCP_NEGATIVE_HIT            42     0.41          23     0.04                9.07

This table shows the characteristics for the different cache result codes. The only useful information I have found is the difference between TCP_MEM_HIT (objects fetched from Squid's RAM buffer) and TCP_HIT (objects fetched from the disk buffer). A low transfer speed for TCP_HIT compared with TCP_MEM_HIT indicates a need to dedicate more RAM to Squid and/or to rearrange the cache_dir layout, for example by spreading several cache directories over several disks/controllers, or to improve disk performance in some other way.
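
The comparison between memory hits and disk hits can be made concrete with the two rows from the sample table. This Python sketch is only an illustration; the names `memory_share` and `speed_ratio` are assumptions and not part of Squeezer.

```python
# (requests, transfer speed in kB/s) from the sample table.
MEM_HIT = (893, 46.91)   # TCP_MEM_HIT: served from RAM
DISK_HIT = (605, 3.45)   # TCP_HIT: served from disk

def memory_share(mem_requests: int, disk_requests: int) -> float:
    """Percentage of these two hit types that was served from RAM."""
    return 100.0 * mem_requests / (mem_requests + disk_requests)

def speed_ratio(mem_speed: float, disk_speed: float) -> float:
    """How many times faster memory hits were than disk hits."""
    return mem_speed / disk_speed

print(f"{memory_share(MEM_HIT[0], DISK_HIT[0]):.1f}% of memory/disk hits came from RAM")
print(f"memory hits were {speed_ratio(MEM_HIT[1], DISK_HIT[1]):.1f}x faster than disk hits")
```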

By peer status:

Fetch type                Req by fetch    Req %   Xfer (kB)   Xfer %   Xfer speed (kB/s)
NONE                              2471    24.09       11092    16.71                1.24
TIMEOUT_DIRECT                    2146    20.92       13179    19.85                0.35
TIMEOUT_FIRST_UP_PARENT             27     0.26          60     0.09                0.29
DIRECT                            4100    39.97       37175    55.99                0.78
FIRST_PARENT_MISS                   11     0.11           0     0.00                0.00
SIBLING_HIT                        446     4.35        1725     2.60                2.61
FIRST_UP_PARENT                     19     0.19          33     0.05                0.11
PARENT_HIT                         959     9.35        3030     4.56                3.76
CACHE_DIGEST_HIT                    81     0.79         103     0.16                0.33

This is the basic material showing the advantages and disadvantages of using parent and sibling relationships. Use with caution! Remember that your parents and siblings may be reachable over lines of different speed and quality, so using these aggregates for analysis may lead to false conclusions.
The strong point of this table is the opportunity to compare cache digests with ICP, provided there are siblings communicating both ways in a similar environment.
I have no idea how to use it further :( Any suggestions?

Feedback

If you have any questions, comments, or flames, or have seen any bugs, send them to maciej_kozinski@yahoo.com.


