How to enable caching on Apache / Ubuntu Linux?
- by Jim Mischel
I have a large (several megabytes) XML file that's updated rather frequently (every 10 minutes or less) and gets a lot of traffic.  I'd like to implement some caching to reduce bandwidth and server load.  Looking at the Apache documents, I see a dizzying array of configuration options that involve various combinations of mod_expires, mod_headers, and mod_cache (and variants).  I end up running in circles and the results aren't what I expect.
I'm comfortable editing the various configuration files if I have some idea what I'm supposed to change.  But at the moment I'm poking around in the dark and that's never a comfortable feeling.  So, perhaps if I describe what I want, somebody here can take me by the hand and say, "This is what you need to do."
Periodically, this file (call it "stuff.xml") is updated and a new version is copied to the directory.  The external URL would be, for example, http://example.com/stuff.xml.  Understand, this part works.  Whenever I request the file, I get the expected result.  But the file is big and I want to save bandwidth, so first I'd like to implement conditional GET semantics with the If-Modified-Since header.  How do I do this?  I've enabled mod_headers and mod_expires and added the <FilesMatch> section in my httpd.conf as recommended in countless examples I've seen online, but that didn't change the behavior when I made a conditional GET request.  I always get a status 200 with the entire document.  So how the heck do I implement this?
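To be clear about what I mean by conditional GET, here is a minimal Python sketch of the decision I expect the server to make. The function name conditional_get_status is made up for illustration, and this is only my understanding of the semantics, not how Apache implements it.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(last_modified, if_modified_since=None):
    """Return 304 if the client's copy is still current, otherwise 200.

    last_modified: timezone-aware datetime of the file's last change.
    if_modified_since: raw If-Modified-Since header value, or None.
    (Hypothetical helper, for illustration only.)
    """
    if if_modified_since is None:
        # Plain GET: always send the full document.
        return 200
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # An unparseable date is ignored, so the full document is sent.
        return 200
    if client_time.tzinfo is None:
        # Treat a date without zone information as UTC.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200


# Example: the client cached the file at noon and the file has not changed.
modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(modified, usegmt=True)
print(conditional_get_status(modified, header))
```

In other words, I'd expect a 304 with no body whenever the date in the request is at least as recent as the file's modification time, and a 200 with the full document otherwise.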
That'll cut down on needless transfers.  I'd also like to limit the amount of data transferred.  Seeing as this is XML, gzipping it should save me 50% or more.  My next step would be to somehow gzip the file and, if it's not too difficult, store it in memory.  That'll cut down on per-access data transfer, and also reduce disk transfers.  So how do I implement this type of caching?
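As a rough illustration of the "gzip it once and keep it in memory" idea, here is a small Python sketch. The gzip_in_memory helper and the sample XML are made up for this example. The sample is far more repetitive than my real file, so the sizes it prints say nothing about what stuff.xml would actually compress to.

```python
import gzip


def gzip_in_memory(xml_bytes, level=6):
    """Compress the document once and return the gzipped bytes.

    The result can be kept in memory and sent to every client that
    accepts gzip, instead of re-reading and re-compressing the file.
    (Hypothetical helper, for illustration only.)
    """
    return gzip.compress(xml_bytes, compresslevel=level)


# Made-up sample document; real data will be less repetitive.
sample = (
    b"<items>"
    + b"<item><name>example</name><value>42</value></item>" * 1000
    + b"</items>"
)
compressed = gzip_in_memory(sample)

# Print original and compressed sizes in bytes.
print(len(sample), len(compressed))
```

What I'm after is the equivalent of this inside Apache: compress the file when it changes, hold the compressed copy in memory, and serve that copy until the next update.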
Thanks in advance.