Apache's htcacheclean doesn't scale: How to tame a huge Apache disk_cache?

Posted by flight on Server Fault
Published on 2011-10-12T11:04:06Z

We have an Apache setup with a huge disk_cache (>500,000 entries, >50 GB of disk space used). The cache grows by 16 GB every day.

My problem is that the cache grows nearly as fast as files and directories can be removed from the cache filesystem!

The cache partition is an ext3 filesystem (100GB, "-t news") on iSCSI storage. The Apache server (which acts as a caching proxy) is a VM. The disk_cache is configured with CacheDirLevels=2 and CacheDirLength=1, and includes variants. A typical file path is "/htcache/B/x/i_iGfmmHhxJRheg8NHcQ.header.vary/A/W/oGX3MAV3q0bWl30YmA_A.header".
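To make the directory layout concrete, here is a minimal Python sketch of how a hashed key might be split into nested directories for CacheDirLevels=2 and CacheDirLength=1. The function names are hypothetical, and the 64-character alphabet is an assumption that is not stated in the post; the sketch only reproduces the shape of the example path above.

```python
import string

# Assumed 64-character alphabet for cache file names (not stated in the post).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "_@"


def cache_path(root, key, dir_levels=2, dir_length=1, suffix=".header"):
    """Split a hashed key into nested directories plus a file name."""
    parts = []
    pos = 0
    for _ in range(dir_levels):
        # Each level consumes dir_length characters from the front of the key.
        parts.append(key[pos:pos + dir_length])
        pos += dir_length
    parts.append(key[pos:] + suffix)
    return "/".join([root.rstrip("/")] + parts)


def leaf_directory_count(dir_levels=2, dir_length=1, alphabet_size=len(ALPHABET)):
    """Number of lowest-level directories the layout can produce."""
    return alphabet_size ** (dir_levels * dir_length)


if __name__ == "__main__":
    # Key reassembled from the example path in the post: "B" + "x" + remainder.
    print(cache_path("/htcache", "Bxi_iGfmmHhxJRheg8NHcQ"))
    print(leaf_directory_count())
    # Average number of entries per leaf directory for 500,000 entries.
    print(500_000 / leaf_directory_count())
```

Under these assumptions the top-level tree has 64 * 64 = 4096 leaf directories, so 500,000 entries average a little over a hundred per leaf. Each ".header.vary" directory carries its own two-level tree underneath, so the real number of directories a cleaner has to walk is larger.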

When I call htcacheclean to tame the cache (non-daemon mode, "htcacheclean -t -p/htcache -l15G"), IOwait goes through the roof for several hours without any visible progress. Only after hours does htcacheclean start to delete files from the cache partition, which takes a couple more hours. (A similar problem was brought up on the Apache mailing list in 2009, without a solution: http://www.mail-archive.com/[email protected]/msg42683.html)

The high IOwait leads to problems with the stability of the web server (the bridge to the Tomcat backend server sometimes stalls).

I came up with my own prune script, which removes files and directories from random subdirectories of the cache, only to find that the script's deletion rate is just slightly higher than the cache growth rate. The script takes ~10 seconds to read a subdirectory (e.g. /htcache/B/x) and frees some 5 MB of disk space. In those 10 seconds, the cache has grown by another 2 MB. As with htcacheclean, IOwait goes up to 25% when running the prune script continuously.
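The prune script itself is not shown, so the following Python sketch is only an illustration of the approach described: pick a random second-level directory, measure what it holds, delete it, and report bytes freed and elapsed time. All function names are hypothetical. It also assumes that removing a whole leaf directory is acceptable to Apache, which is not confirmed here. The last helper turns the rates quoted above into an estimate of how long continuous pruning would take.

```python
import os
import random
import shutil
import time


def list_leaf_dirs(root):
    """Return second-level directories such as <root>/B/x."""
    leaves = []
    for first in os.scandir(root):
        if not first.is_dir(follow_symlinks=False):
            continue
        for second in os.scandir(first.path):
            if second.is_dir(follow_symlinks=False):
                leaves.append(second.path)
    return leaves


def tree_size(path):
    """Sum the sizes of all regular files below path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except FileNotFoundError:
                pass  # entries may disappear while the cache is in use
    return total


def prune_random_leaf(root, rng=random):
    """Delete one randomly chosen leaf directory; return (bytes_freed, seconds)."""
    leaves = list_leaf_dirs(root)
    if not leaves:
        return 0, 0.0
    victim = rng.choice(leaves)
    start = time.monotonic()
    freed = tree_size(victim)
    shutil.rmtree(victim, ignore_errors=True)
    return freed, time.monotonic() - start


def hours_to_shrink(excess_mb, freed_mb, grown_mb, interval_s):
    """Hours of continuous pruning to remove excess_mb, or None if losing ground."""
    net_mb_per_s = (freed_mb - grown_mb) / interval_s
    if net_mb_per_s <= 0:
        return None
    return excess_mb / net_mb_per_s / 3600


if __name__ == "__main__":
    # Figures from the post: 5 MB freed and 2 MB of growth per 10 s.
    # Going from roughly 50 GB to the 15 GB limit means removing about 35 GB.
    print(hours_to_shrink(35 * 1000, freed_mb=5, grown_mb=2, interval_s=10))
```

With those figures the net gain is about 0.3 MB/s, so getting from 50 GB down to 15 GB would take more than 30 hours of continuous pruning at 25% IOwait, which is why the script alone does not solve the problem.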

Any ideas?

  • Is this a problem specific to the (rather slow) iSCSI storage?

  • Should I choose a different file system for a huge disk_cache? ext2? ext4?

  • Are there any kernel parameter optimizations for this kind of scenario? (I already tried the deadline scheduler and a smaller read_ahead_kb, without effect.)
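For reference when comparing setups, the two settings mentioned in the last bullet can be read back from sysfs. This is a small Python sketch, assuming the usual Linux layout under /sys/block/<device>/queue; the function name is hypothetical, and the device name has to be supplied by the caller because the post does not name the device backing /htcache.

```python
import os
import re
import sys


def read_queue_settings(device, sys_block="/sys/block"):
    """Return the active I/O scheduler and read-ahead size for a block device."""
    queue = os.path.join(sys_block, device, "queue")
    with open(os.path.join(queue, "scheduler")) as fh:
        raw = fh.read().strip()
    with open(os.path.join(queue, "read_ahead_kb")) as fh:
        read_ahead_kb = int(fh.read().strip())
    # The kernel marks the active scheduler with square brackets,
    # e.g. "noop anticipatory [deadline] cfq".
    match = re.search(r"\[([^\]]+)\]", raw)
    active = match.group(1) if match else raw
    return {"scheduler": active, "read_ahead_kb": read_ahead_kb}


if __name__ == "__main__":
    # Pass one or more device names (e.g. the disk backing the cache partition).
    for device in sys.argv[1:]:
        print(device, read_queue_settings(device))
```

Running this before and after a change confirms that the scheduler and read-ahead values actually took effect on the device in question, which is worth ruling out before concluding that they make no difference.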
