UKUUG

(the UK's Unix & Open Systems User Group)

Web Caching
Duane Wessels
Published by O'Reilly and Associates
ISBN: 1-56592-536-X
318 pages
£39.95
Published: 16th July 2001
reviewed by Andrew Cormack
   in the December 2001 issue

As one of the developers of the Harvest and Squid cache programs, Duane Wessels would be ideally qualified to write a book on the details of running a web cache. However, this is not that book. Instead he has taken several steps back from the detail to review the overall design of caching systems and how they fit into a typical network. The content is addressed as much to network managers, web server administrators, content providers, and users as to those who will administer the cache itself.

Caches are not the most familiar of topics, so there is some basic information that all readers will need to know. The first chapter therefore explains the components of the World Wide Web and how caches fit in. There is an introduction to the protocols used by browsers, caches, and servers, with a discussion of the advantages and disadvantages of caches in different situations. Issues of privacy, censorship, and copyright often arise in any discussion of caching. Although the specific legislation discussed in the book applies only to the USA, the general principles are universal; caching is one area where common sense seems to be bringing laws closer together in any case.

Caches need not work alone and are often combined for performance or reliability. Cache hierarchies tend to be used to improve hit rates or speed of access, but these raise both technical and political issues. There is a discussion of the various protocols that caches can use to co-operate and, in appendices, full details of these, including the best description I have found of how Bloom filters work. Within an organisation, caches can be clustered to increase capacity or reliability, but here there is only a discussion of possibilities, with no technical detail to help in making the cluster work.
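
For readers who have not met Bloom filters, the idea can be sketched in a few lines of Python. This is only an illustration and is not taken from the book: the class name, the default sizes, and the choice of SHA-256 as the hash are all assumptions made for the example. A cache could publish such a bit array as a compact summary of the URLs it holds, so that a neighbour can check locally whether asking it is likely to produce a hit.

```python
import hashlib


class BloomFilter:
    """Set-membership summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with a
        # different prefix each time (an illustrative choice of hash).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        # Adding a key sets every one of its bit positions.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # A key may be present only if all of its bits are set.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


digest = BloomFilter()
digest.add("http://www.example.org/index.html")
print("http://www.example.org/index.html" in digest)  # True: added keys always match
```

A lookup that answers "no" is always right, while a "yes" can occasionally be wrong because different URLs may share bit positions; making the bit array larger reduces how often that happens.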

These and other design considerations are brought together in a chapter discussing how to go about buying or building a cache; the list of products in the first chapter may also be of use to those who have reached this stage.

A cache needs to have traffic directed to it and there are a number of ways of achieving this. Conceptually, the simplest is to configure browsers to use the cache rather than a direct connection to the web server, and there are good instructions for doing this either manually or using an auto-configuration file. Alternatively, network devices can simply intercept web requests and direct them to caches. This raises a number of problems, both political and at the protocol level, but for those who wish to follow this route there are descriptions of what needs to be done and instructions for constructing a suitable device using Linux, FreeBSD or other operating systems.
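
The decision an auto-configuration file makes can be sketched as follows. Real auto-configuration files are written in JavaScript, as a function that takes a URL and host name and returns a string such as "PROXY host:port" or "DIRECT"; the version below is a Python rendering of the same logic, not an example from the book. The cache address `cache.example.org:3128` and the local domain `.example.org` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical values, chosen only for this illustration.
CACHE = "PROXY cache.example.org:3128"
LOCAL_DOMAIN = ".example.org"


def find_proxy_for_url(url, host):
    """Return a proxy decision string in the style used by auto-configuration files."""
    scheme = urlsplit(url).scheme.lower()
    # Unqualified names and local servers are fetched directly.
    if "." not in host or host.endswith(LOCAL_DOMAIN):
        return "DIRECT"
    # Send cacheable schemes to the cache, falling back to a direct
    # connection if the cache cannot be reached.
    if scheme in ("http", "ftp"):
        return CACHE + "; DIRECT"
    # Everything else (for example https) bypasses the cache.
    return "DIRECT"


print(find_proxy_for_url("http://www.ukuug.org/", "www.ukuug.org"))
```

Keeping "DIRECT" as a fallback means that browsers continue to work if the cache is down, at the cost of losing the cache's benefit while it is unavailable.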

At the other end of the web connection, servers can do some things to help caches and a lot to hinder them. The caching headers in HTTP/1.1 are all described, though it would be less intimidating if the function of the headers were explained before they are all listed, rather than after. The reasons for returning correct headers are stated clearly, along with instructions for ensuring that the Apache web server does so.
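
To show why correct headers matter, here is a simplified sketch of how a shared cache might decide how long it may reuse a response. It is an illustration only, not the book's method or any product's algorithm: the function names are invented for the example, and it handles just the `Cache-Control`, `Expires` and `Date` headers, treating `no-cache` as "do not reuse without checking".

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(headers):
    """Seconds a shared cache may reuse the response without revalidating.

    Returns 0 if it must not be reused as-is, and None if the server
    gave no explicit information at all.
    """
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "private" in cc or "no-cache" in cc:
        return 0
    # s-maxage applies to shared caches and takes priority over max-age.
    for name in ("s-maxage", "max-age"):
        if cc.get(name) is not None:
            try:
                return max(0, int(cc[name]))
            except ValueError:
                return 0
    if "Expires" in headers and "Date" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            date = parsedate_to_datetime(headers["Date"])
            return max(0, int((expires - date).total_seconds()))
        except (TypeError, ValueError):
            return 0  # an unparseable date is treated as already expired
    return None


print(freshness_lifetime({"Cache-Control": "public, max-age=3600"}))  # 3600
```

When the function returns `None` the cache has to guess, which is exactly the situation that explicit headers from the server avoid.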

Cache performance is often a concern, so the instructions for monitoring, using ucd-snmp and rrdtool, and for benchmarking are very valuable. Benchmarking is complicated and it is all too easy to measure the performance of the test rig rather than the cache itself. An appendix includes useful data on the behaviour of a cache in the real world, but these are taken from a parent cache and so show rather different behaviour from a typical first-level cache.

Many cache products have good operating manuals; this book's wider perspective should help to avoid expensive or inconvenient mistakes at other stages in the process. It provides a good general introduction to caching and will also provide helpful guidance through the process of designing and obtaining a caching solution. If all web sites followed the suggestions on cache-friendliness then the web could be a much faster and more efficient place.


Page last modified 03 Apr 2007
Copyright © 1995-2011 UKUUG Ltd.