ASP.NET GZip Encoding Caveats

Posted by Rick Strahl on West-Wind See other posts from West-Wind or by Rick Strahl
Published on Mon, 02 May 2011 10:17:48 GMT Indexed on 2011/06/20 16:23 UTC
Read the original article Hit count: 974

Filed under:
|

GZip encoding in ASP.NET is pretty easy to accomplish using the built-in GZipStream and DeflateStream classes and applying them to the Response.Filter property.  While applying GZip and Deflate behavior is pretty easy there are a few caveats that you have watch out for as I found out today for myself with an application that was throwing up some garbage data. But before looking at caveats let’s review GZip implementation for ASP.NET.

ASP.NET GZip/Deflate Basics

Response filters basically are applied to the Response.OutputStream and transform it as data is written to it through the ASP.NET Response object. So a Response.Write eventually gets written into the output stream which if a filter is also written through the filter stream’s interface. To perform the actual GZip (and Deflate) encoding typically used by Web pages .NET includes the GZipStream and DeflateStream stream classes which can be readily assigned to the Repsonse.OutputStream.

With these two stream classes in place it’s almost trivially easy to create a couple of reusable methods that allow you to compress your HTTP output. In my standard WebUtils utility class (from the West Wind West Wind Web Toolkit) created two static utility methods – IsGZipSupported and GZipEncodePage – that check whether the client supports GZip encoding and then actually encodes the current output (note that although the method includes ‘Page’ in its name this code will work with any ASP.NET output).

/// <summary>
/// Determines if GZip is supported
/// </summary>
/// <returns></returns>
public static bool IsGZipSupported()
{
    string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"];
    if (!string.IsNullOrEmpty(AcceptEncoding) &&
            (AcceptEncoding.Contains("gzip") || AcceptEncoding.Contains("deflate")))
        return true;
    return false;
}

/// <summary>
/// Sets up the current page or handler to use GZip through a Response.Filter
/// IMPORTANT:  
/// You have to call this method before any output is generated!
/// </summary>
public static void GZipEncodePage()
{
    HttpResponse Response = HttpContext.Current.Response;

    if (IsGZipSupported())
    {
        string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"];

        if (AcceptEncoding.Contains("deflate"))
        {
            Response.Filter = new System.IO.Compression.DeflateStream(Response.Filter,
                                        System.IO.Compression.CompressionMode.Compress);
            Response.Headers.Remove("Content-Encoding");
            Response.AppendHeader("Content-Encoding", "deflate");
        }
        else
        {
            Response.Filter = new System.IO.Compression.GZipStream(Response.Filter,
                                        System.IO.Compression.CompressionMode.Compress);
            Response.Headers.Remove("Content-Encoding");
            Response.AppendHeader("Content-Encoding", "gzip");
        }

    }
}

As you can see the actual assignment of the Filter is as simple as:

Response.Filter = new DeflateStream(Response.Filter, System.IO.Compression.CompressionMode.Compress);

which applies the filter to the OutputStream. You also need to ensure that your response reflects the new GZip or Deflate encoding and ensure that any pages that are cached in Proxy servers can differentiate between pages that were encoded with the various different encodings (or no encoding).

To use this utility function now is trivially easy: In any ASP.NET code that wants to compress its Response output you simply use:

protected void Page_Load(object sender, EventArgs e)
{            
    WebUtils.GZipEncodePage();

    Entry = WebLogFactory.GetEntry();

    var entries = Entry.GetLastEntries(App.Configuration.ShowEntryCount, "pk,Title,SafeTitle,Body,Entered,Feedback,Location,ShowTopAd", "TEntries");
    if (entries == null)
        throw new ApplicationException("Couldn't load WebLog Entries: " + Entry.ErrorMessage);

    this.repEntries.DataSource = entries;
    this.repEntries.DataBind();

}

Here I use an ASP.NET page, but the above WebUtils.GZipEncode() method call will work in any ASP.NET application type including HTTP Handlers. The only requirement is that the filter needs to be applied before any other output is sent to the OutputStream. For example, in my CallbackHandler service implementation by default output over a certain size is GZip encoded. The output that is generated is JSON or XML and if the output is over 5k in size I apply WebUtils.GZipEncode():

if (sbOutput.Length > GZIP_ENCODE_TRESHOLD)
    WebUtils.GZipEncodePage();

Response.ContentType = ControlResources.STR_JsonContentType;
HttpContext.Current.Response.Write(sbOutput.ToString());

Ok, so you probably get the idea: Encoding GZip/Deflate content is pretty easy.

Hold on there Hoss –Watch your Caching

Or is it? There are a few caveats that you need to watch out for when dealing with GZip content. The fist issue is that you need to deal with the fact that some clients don’t support GZip or Deflate content. Most modern browsers support it, but if you have a programmatic Http client accessing your content GZip/Deflate support is by no means guaranteed. For example, WinInet Http clients don’t support GZip out of the box – it has to be explicitly implemented. Other low level HTTP clients on other platforms too don’t support GZip out of the box.

The problem is that your application, your Web Server and Proxy Servers on the Internet might be caching your generated content. If you return content with GZip once and then again without, either caching is not applied or worse the wrong type of content is returned back to the client from a cache or proxy. The result is an unreadable response for *some clients* which is also very hard to debug and fix once in production.

You already saw the issue of Proxy servers addressed in the GZipEncodePage() function:

// Allow proxy servers to cache encoded and unencoded versions separately
Response.AppendHeader("Vary", "Content-Encoding");

This ensures that any Proxy servers also check for the Content-Encoding HTTP Header to cache their content – not just the URL.

The same thing applies if you do OutputCaching in your own ASP.NET code. If you generate output for GZip on an OutputCached page the GZipped content will be cached (either by ASP.NET’s cache or in some cases by the IIS Kernel Cache). But what if the next client doesn’t support GZip? She’ll get served a cached GZip page that won’t decode and she’ll get a page full of garbage. Wholly undesirable. To fix this you need to add some custom OutputCache rules by way of the GetVaryByCustom() HttpApplication method in your global_ASAX file:

public override string GetVaryByCustomString(HttpContext context, string custom)
{
    // Override Caching for compression
    if (custom == "GZIP")
    {
        string acceptEncoding = HttpContext.Current.Response.Headers["Content-Encoding"];
        
        if (string.IsNullOrEmpty(acceptEncoding))
            return "";
        else if (acceptEncoding.Contains("gzip"))
            return "GZIP";
        else if (acceptEncoding.Contains("deflate"))
            return "DEFLATE";
        
        return "";
    }
    
    return base.GetVaryByCustomString(context, custom);
}

In a page that use Output caching you then specify:

<%@ OutputCache Duration="180" VaryByParam="none" VaryByCustom="GZIP" %>

To use that custom rule.

It’s all Fun and Games until ASP.NET throws an Error

Ok, so you’re up and running with GZip, you have your caching squared away and your pages that you are applying it to are jamming along. Then BOOM, something strange happens and you get a lovely garbled page that look like this:

GarbledOutput

Lovely isn’t it?

What’s happened here is that I have WebUtils.GZipEncode() applied to my page, but there’s an error in the page. The error falls back to the ASP.NET error handler and the error handler removes all existing output (good) and removes all the custom HTTP headers I’ve set manually (usually good, but very bad here). Since I applied the Response.Filter (via GZipEncode) the output is now GZip encoded, but ASP.NET has removed my Content-Encoding header, so the browser receives the GZip encoded content without a notification that it is encoded as GZip. The result is binary output. Here’s what Fiddler says about the raw HTTP header output when an error occurs when GZip encoding was applied:

HTTP/1.1 500 Internal Server Error
Cache-Control: private
Content-Type: text/html; charset=utf-8
Date: Sat, 30 Apr 2011 22:21:08 GMT
Content-Length: 2138
Connection: close

?`I?%&/m?{J?J??t??` … binary output striped here

Notice: no Content-Encoding header and that’s why we’re seeing this garbage. ASP.NET has stripped the Content-Encoding header but left our filter intact.

So how do we fix this? In my applications I typically have a global Application_Error handler set up and in this case I’ve been using that. One thing that you can do in the Application_Error handler is explicitly clear out the Response.Filter and set it to null at the top:

protected void Application_Error(object sender, EventArgs e)
{
        // Remove any special filtering especially GZip filtering
        Response.Filter = null;
}

And voila I get my Yellow Screen of Death or my custom generated error output back via uncompressed content. BTW, the same is true for Page level errors handled in Page_Error or ASP.NET MVC Error handling methods in a controller.

Another and possibly even better solution is to check whether a filter is attached just before the headers are sent to the client as pointed out by Adam Schroeder in the comments:

 protected void Application_PreSendRequestHeaders()
{
    // ensure that if GZip/Deflate Encoding is applied that headers are set
    // also works when error occurs if filters are still active
    HttpResponse response = HttpContext.Current.Response;
    if (response.Filter is GZipStream && response.Headers["Content-encoding"] != "gzip")
        response.AppendHeader("Content-encoding", "gzip");
    else if (response.Filter is DeflateStream && response.Headers["Content-encoding"] != "deflate")
        response.AppendHeader("Content-encoding", "deflate");
}

This uses the Application_PreSendRequestHeaders() pipeline event to check for compression encoding in a filter and adjusts the content accordingly. This is actually a better solution since this is generic – it’ll work regardless of how the content is cleaned up. For example, an error Response.Redirect() or short error display might get changed and the filter not cleared and this code actually handles that. Sweet, thanks Adam.

It’s unfortunate that ASP.NET doesn’t natively clear out Response.Filters when an error occurs just as it clears the Response and Headers. I can’t see where leaving a Filter in place in an error situation would make any sense, but hey - this is what it is and it’s easy enough to fix as long as you know where to look. Riiiight!

IIS and GZip

I should also mention that IIS 7 includes good support for compression natively. If you can defer encoding to let IIS perform it for you rather than doing it in your code by all means you should do it! Especially any static or semi-dynamic content that can be made static should be using IIS built-in compression. Dynamic caching is also supported but is a bit more tricky to judge in terms of performance and footprint. John Forsyth has a great article on the benefits and drawbacks of IIS 7 compression which gives some detailed performance comparisons and impact reviews. I’ll post another entry next with some more info on IIS compression since information on it seems to be a bit hard to come by.

Related Content

© Rick Strahl, West Wind Technologies, 2005-2011
Posted in ASP.NET   IIS7  
kick it on DotNetKicks.com

© West-Wind or respective owner

Related posts about ASP.NET

Related posts about iis7