Capturing and Transforming ASP.NET Output with Response.Filter

Posted by Rick Strahl on West-Wind
Published on Fri, 13 Nov 2009 17:47:39 GMT

During one of my Handlers and Modules sessions at DevConnections this week one of the attendees asked a question that I didn’t have an immediate answer for. Basically he wanted to capture response output completely and then apply some filtering to the output, effectively injecting some additional content into the page AFTER the page had completely rendered. Specifically, the output should be captured from anywhere, not just from a page, and have this code injected into it.

Some time ago I posted some code that allows you to capture ASP.NET Page output by overriding the Render() method, capturing the HtmlTextWriter() and reading its content, modifying the rendered data as text then writing it back out. I’ve actually used this approach on a few occasions and it works fine for ASP.NET pages. But this obviously won’t work outside of the Page class environment and it’s not really generic – you have to create a custom page class in order to handle the output capture.

[updated 11/16/2009 – updated ResponseFilterStream implementation and a few additional notes based on comments]

Enter Response.Filter

However, ASP.NET includes a Response.Filter property which can be used, well, to filter output. Basically Response.Filter is a stream through which the OutputStream is piped back to the Web Server (indirectly). As content is written into the Response object, the filter stream receives the appropriate Stream calls like Write, Flush and Close, as well as read operations, although for a Response.Filter those are rarely hit. The Response.Filter can be programmatically replaced at runtime, which allows you to effectively intercept all output generation that runs through ASP.NET.

A common Example: Dynamic GZip Encoding

A rather common use of Response.Filter is hooking up code-based, dynamic GZip compression for requests, which is dead simple: you apply a GZipStream (or DeflateStream) to Response.Filter. With a couple of library helper routines you can detect the GZip capability of the client and compress response output with a single line of code:

WebUtils.GZipEncodePage();

which is handled by a few lines of reusable code in two static helper methods:

/// <summary>
/// Sets up the current page or handler to use GZip through a Response.Filter
/// IMPORTANT:
/// You have to call this method before any output is generated!
/// </summary>
public static void GZipEncodePage()
{
    HttpResponse Response = HttpContext.Current.Response;

    if (IsGZipSupported())
    {
        string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"];
        if (AcceptEncoding.Contains("deflate"))
        {
            Response.Filter = new System.IO.Compression.DeflateStream(Response.Filter,
                                       System.IO.Compression.CompressionMode.Compress);
            Response.AppendHeader("Content-Encoding", "deflate");
        }
        else
        {
            Response.Filter = new System.IO.Compression.GZipStream(Response.Filter,
                                      System.IO.Compression.CompressionMode.Compress);
            Response.AppendHeader("Content-Encoding", "gzip");
        }
    }

    // Allow proxy servers to cache encoded and unencoded versions separately.
    // Vary names the request header that the response depends on.
    Response.AppendHeader("Vary", "Accept-Encoding");
}
/// <summary>
/// Determines if GZip is supported
/// </summary>
/// <returns></returns>
public static bool IsGZipSupported()
{
    string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"];
    if (!string.IsNullOrEmpty(AcceptEncoding) &&
         (AcceptEncoding.Contains("gzip") || AcceptEncoding.Contains("deflate")))
        return true;
    return false;
}

GZipStream and DeflateStream are streams that are assigned to Response.Filter and by doing so apply the appropriate compression on the active Response.
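
To make the wrapping behavior concrete outside of ASP.NET, here is a minimal sketch that uses a MemoryStream as a stand-in for the previous Response.Filter. The class and method names (GZipFilterDemo, Compress, Decompress) are made up for illustration; only GZipStream, MemoryStream and StreamReader come from the framework.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipFilterDemo
{
    // Compresses text by writing it through a GZipStream that wraps 'target',
    // the same shape as new GZipStream(Response.Filter, CompressionMode.Compress).
    public static byte[] Compress(string text)
    {
        var target = new MemoryStream();   // stand-in for the previous Response.Filter
        using (var gzip = new GZipStream(target, CompressionMode.Compress))
        {
            byte[] bytes = Encoding.UTF8.GetBytes(text);
            gzip.Write(bytes, 0, bytes.Length);
        }   // disposing writes the remaining compressed data into 'target'
        return target.ToArray();
    }

    // Reverses Compress, the way a browser would decode the response body.
    public static string Decompress(byte[] compressed)
    {
        using (var gzip = new GZipStream(new MemoryStream(compressed), CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
            return reader.ReadToEnd();
    }

    public static void Main()
    {
        string html = "<html><body>" + new string('x', 2000) + "</body></html>";
        byte[] compressed = Compress(html);

        Console.WriteLine(html.Length + " chars in, " + compressed.Length + " bytes out");
        Console.WriteLine(Decompress(compressed) == html);   // round trip is lossless
    }
}
```

The point to take away is that the compression stream never holds the final output itself: everything written to it is transformed and forwarded to the stream it wraps.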

Response.Filter content is chunked

So to implement a Response.Filter effectively requires only that you implement a custom stream and handle the Write() method to capture Response output as it’s written. At first blush this seems very simple – you capture the output in Write, transform it and write out the transformed content in one pass. And that indeed works for small amounts of content. But you see, the problem is that output is written in small buffer chunks (a little less than 16k it appears) rather than just a single Write() statement into the stream, which makes perfect sense for ASP.NET to stream data back to IIS in smaller chunks to minimize memory usage en route.

Unfortunately this also makes it more difficult to implement any filtering routines, since you don’t directly get access to all of the response content. That is especially problematic if those filtering routines need to look at the ENTIRE response in order to transform or capture the output, as is needed for the solution the gentleman in my session asked for.

So in order to address this a slightly different approach is required: capture all the Write() buffers into a cached stream, and make that stream available only when it’s complete and ready to be flushed.
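
A small sketch of why per-chunk processing can miss content, using plain strings instead of a real response. The names (ChunkSplitDemo, ReplacePerChunk, ReplaceBuffered) and the two hand-made chunks are assumptions for illustration; real chunks are much larger, but the boundary problem is the same.

```csharp
using System;
using System.Text;

public static class ChunkSplitDemo
{
    // Replaces the marker in each chunk independently, the way a
    // transform running inside Write() would see the data.
    public static string ReplacePerChunk(string[] chunks, string marker, string replacement)
    {
        var result = new StringBuilder();
        foreach (string chunk in chunks)
            result.Append(chunk.Replace(marker, replacement));
        return result.ToString();
    }

    // Buffers all chunks first and replaces once on the complete text.
    public static string ReplaceBuffered(string[] chunks, string marker, string replacement)
    {
        var buffer = new StringBuilder();
        foreach (string chunk in chunks)
            buffer.Append(chunk);
        return buffer.ToString().Replace(marker, replacement);
    }

    public static void Main()
    {
        // The marker "~/ is split across the chunk boundary.
        string[] chunks = { "<a href=\"~", "/default.aspx\">Home</a>" };

        Console.WriteLine(ReplacePerChunk(chunks, "\"~/", "\"/myVirtual/"));
        // prints <a href="~/default.aspx">Home</a>  (marker missed)

        Console.WriteLine(ReplaceBuffered(chunks, "\"~/", "\"/myVirtual/"));
        // prints <a href="/myVirtual/default.aspx">Home</a>
    }
}
```

Buffering costs memory, but it is the simplest way to guarantee that a search never straddles a boundary.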

As I was thinking about the implementation I also started thinking about the few instances when I’ve used Response.Filter implementations. Each time I had to create a new Stream subclass and create my custom functionality but in the end each implementation did the same thing – capturing output and transforming it. I thought there should be an easier way to do this by creating a re-usable Stream class that can handle stream transformations that are common to Response.Filter implementations.

Creating a semi-generic Response Filter Stream Class

What I ended up with is a ResponseFilterStream class that provides a handful of Events that allow you to capture and/or transform Response content. The class implements a subclass of Stream and then overrides Write() and Flush() to handle capturing and transformation operations. By exposing events it’s easy to hook up capture or transformation operations via single focused methods.

ResponseFilterStream exposes the following events:

  • CaptureStream, CaptureString
    Captures the output only and provides either a MemoryStream or String with the final page output. Capture is hooked to the Flush() operation of the stream.
  • TransformStream, TransformString
    Allows you to transform the complete response output with events that receive a MemoryStream or String respectively; you can modify the output and then return it as the return value. The transformed output is then written back out in a single chunk to the response output stream. These events capture all output internally first, then write the entire buffer into the response.
  • TransformWrite, TransformWriteString
    Allows you to transform the Response data as it is written in its original chunk size in the Stream’s Write() method. Unlike TransformStream/TransformString which operate on the complete output, these events only see the current chunk of data written. This is more efficient as there’s no caching involved, but can cause problems due to searched content splitting over multiple chunks.

Using this implementation, creating a custom Response.Filter transformation becomes as simple as the following code.

To hook up the Response.Filter using the MemoryStream version event:

ResponseFilterStream filter = new ResponseFilterStream(Response.Filter);
filter.TransformStream += filter_TransformStream;
Response.Filter = filter;  

and the event handler to do the transformation:

MemoryStream filter_TransformStream(MemoryStream ms)
{
    Encoding encoding = HttpContext.Current.Response.ContentEncoding;
    
    string output = encoding.GetString(ms.ToArray());

    output = FixPaths(output);
    
    ms = new MemoryStream(output.Length);

    byte[] buffer = encoding.GetBytes(output);
    ms.Write(buffer,0,buffer.Length);

    return ms;
}
private string FixPaths(string output)
{
    string path = HttpContext.Current.Request.ApplicationPath;

    // override root path wonkiness
    if (path == "/")
        path = "";

    output = output.Replace("\"~/", "\"" + path + "/").Replace("'~/", "'" + path + "/");
    return output;
}

The idea of the event handler is that you can do whatever you want to the stream and return a stream (either the same one, modified, or a brand new one), which is then sent back as the final response.

The above code can be simplified even more by using the string version events which handle the stream to string conversions for you:

ResponseFilterStream filter = new ResponseFilterStream(Response.Filter);
filter.TransformString += filter_TransformString;
Response.Filter = filter;                

and the event handler to do the transformation calling the same FixPaths method shown above:

string filter_TransformString(string output)
{
    return FixPaths(output);
}

The events for capturing output and for capturing and transforming chunks work in a very similar way. By using events to handle the transformations, ResponseFilterStream becomes a reusable component, and we don’t have to create a new stream class or subclass an existing Stream-based class.

By the way, the example used here is kind of a cool trick which transforms “~/” expressions inside of the final generated HTML output, even in plain HTML elements that aren’t server controls, into the appropriate application relative path in the same way that ResolveUrl would do.

So you can write plain old HTML like this:

<a href="~/default.aspx">Home</a>

and have it turned into:

<a href="/myVirtual/default.aspx">Home</a>

without having to use an ASP.NET control like Hyperlink or Image or having to constantly use:

<img src="<%= ResolveUrl("~/images/home.gif") %>" />

in MVC applications (which frankly is one of the most annoying things about MVC especially given the path hell that extension-less and endpoint-less URLs impose).

I can’t take credit for this idea. While I was discussing the Response.Filter issues on Twitter, a hint came from Dylan Beattie, who pointed me at one of his examples which does something similar. I thought the idea was cool enough to use as an example for future demos of Response.Filter functionality in ASP.NET the next time I do the Modules and Handlers talk (which was great fun BTW).

How practical this is is debatable, however, since there’s definitely some overhead to using a Response.Filter in general, and especially to one that caches the output and then rewrites it later. Make sure to test for performance anytime you hook up a Response.Filter and make sure it doesn’t end up killing perf on you. You’ve been warned :-}.

How does ResponseFilterStream work?

The big win of this implementation IMHO is that it’s a reusable component, so using it requires no new class and no subclassing: you simply attach an event handler method with a straightforward signature to retrieve the stream or string you’re interested in.

The implementation is based on a subclass of Stream, as is required in order to handle the Response.Filter requirements. What’s different from other implementations I’ve seen in various places is that it supports capturing output as a whole, to allow retrieving the full response output for capture or modification. The exceptions are the TransformWrite and TransformWriteString events, which operate only on the active chunk of data written by the Response.

For captured output, the Write() method captures output into an internal MemoryStream that is cached until writing is complete. So Write() is called when ASP.NET writes to the Response stream, but the filter doesn’t pass on the Write immediately to the filter’s internal stream. The data is cached and only when the Flush() method is called to finalize the Stream’s output do we actually send the cached stream off for transformation (if the events are hooked up) and THEN finally write out the returned content in one big chunk.

Here’s the implementation of ResponseFilterStream:

using System;
using System.IO;
using System.Text;
using System.Web;

/// <summary>
/// A semi-generic Stream implementation for Response.Filter with
/// an event interface for handling Content transformations via
/// Stream or String.
/// </summary>
/// <remarks>
/// Use with care for large output as this implementation copies
/// the output into a memory stream and so increases memory usage.
/// </remarks>
public class ResponseFilterStream : Stream
{
    /// <summary>
    /// The original stream
    /// </summary>
    Stream _stream;

    /// <summary>
    /// Current position in the original stream
    /// </summary>
    long _position;

    /// <summary>
    /// Stream that original content is read into
    /// and then passed to TransformStream function
    /// </summary>
    MemoryStream _cacheStream = new MemoryStream(5000);

    /// <summary>
    /// Internal pointer that keeps track of the size
    /// of the cacheStream
    /// </summary>
    int _cachePointer = 0;


    /// <summary>
    /// 
    /// </summary>
    /// <param name="responseStream"></param>
    public ResponseFilterStream(Stream responseStream)
    {
        _stream = responseStream;
    }


    /// <summary>
    /// Determines whether the stream is captured
    /// </summary>
    private bool IsCaptured
    {
        get 
        {

            if (CaptureStream != null || CaptureString != null ||
                TransformStream != null || TransformString != null)
                return true;

            return false;
        }
    }

    /// <summary>
    /// Determines whether the Write method is outputting data immediately
    /// or delaying output until Flush() is fired.
    /// </summary>
    private bool IsOutputDelayed
    {
        get 
        {
            if (TransformStream != null || TransformString != null)
                return true;

            return false;
        }        
    }
    

    /// <summary>
    /// Event that captures Response output and makes it available
    /// as a MemoryStream instance. Output is captured but won't 
    /// affect Response output.
    /// </summary>
    public event Action<MemoryStream> CaptureStream;

    /// <summary>
    /// Event that captures Response output and makes it available
    /// as a string. Output is captured but won't affect Response output.
    /// </summary>
    public event Action<string> CaptureString;

    

    /// <summary>
    /// Event that allows you transform the stream as each chunk of
    /// the output is written in the Write() operation of the stream.
    /// This means that it's possible/likely that the input
    /// buffer will not contain the full response output but only
    /// one of potentially many chunks.
    /// 
    /// This event is called as part of the filter stream's Write() 
    /// operation.
    /// </summary>
    public event Func<byte[], byte[]> TransformWrite;


    /// <summary>
    /// Event that allows you to transform the response stream as
    /// each chunk of byte[] output is written during the stream's write
    /// operation. This means it's possible/likely that the string
    /// passed to the handler only contains a portion of the full
    /// output. Typical buffer chunks are around 16k a piece.
    /// 
    /// This event is called as part of the stream's Write operation.
    /// </summary>
    public event Func<string, string> TransformWriteString;

    /// <summary>
    /// This event allows capturing and transformation of the entire 
    /// output stream by caching all write operations and delaying final
    /// response output until Flush() is called on the stream.
    /// </summary>
    public event Func<MemoryStream, MemoryStream> TransformStream;

    /// <summary>
    /// Event that can be hooked up to handle Response.Filter
    /// Transformation. Passed a string that you can modify and
    /// return back as a return value. The modified content
    /// will become the final output.
    /// </summary>
    public event Func<string, string> TransformString;


    protected virtual void OnCaptureStream(MemoryStream ms)
    {
        if (CaptureStream != null)
            CaptureStream(ms);
    }


    private void OnCaptureStringInternal(MemoryStream ms)
    {
        if (CaptureString != null)
        {
            string content = HttpContext.Current.Response.ContentEncoding.GetString(ms.ToArray());
            OnCaptureString(content);
        }
    }

    protected virtual void OnCaptureString(string output)
    {
        if (CaptureString != null)
            CaptureString(output);
    }

    protected virtual byte[] OnTransformWrite(byte[] buffer)
    {
        if (TransformWrite != null)
            return TransformWrite(buffer);
        return buffer;
    }

    private byte[] OnTransformWriteStringInternal(byte[] buffer)
    {
        Encoding encoding = HttpContext.Current.Response.ContentEncoding;
        string output = OnTransformWriteString(encoding.GetString(buffer));
        return encoding.GetBytes(output);
    }

    private string OnTransformWriteString(string value)
    {
        if (TransformWriteString != null)
            return TransformWriteString(value);
        return value;
    }


    protected virtual MemoryStream OnTransformCompleteStream(MemoryStream ms)
    {
        if (TransformStream != null)
            return TransformStream(ms);

        return ms;
    }




    /// <summary>
    /// Allows transforming of strings
    /// 
    /// Note this handler is internal and not meant to be overridden
    /// as the TransformString Event has to be hooked up in order
    /// for this handler to even fire to avoid the overhead of string
    /// conversion on every pass through.
    /// </summary>
    /// <param name="responseText"></param>
    /// <returns></returns>
    private string OnTransformCompleteString(string responseText)
    {
        if (TransformString != null)
            return TransformString(responseText);

        return responseText;
    }

    /// <summary>
    /// Wrapper method for OnTransformCompleteString that handles
    /// stream to string and vice versa conversions
    /// </summary>
    /// <param name="ms"></param>
    /// <returns></returns>
    internal MemoryStream OnTransformCompleteStringInternal(MemoryStream ms)
    {
        if (TransformString == null)
            return ms;

        //string content = ms.GetAsString();
        string content = HttpContext.Current.Response.ContentEncoding.GetString(ms.ToArray());

        content = TransformString(content);
        byte[] buffer = HttpContext.Current.Response.ContentEncoding.GetBytes(content);
        ms = new MemoryStream();
        ms.Write(buffer, 0, buffer.Length);
        //ms.WriteString(content);

        return ms;
    }

    /// <summary>
    /// 
    /// </summary>
    public override bool CanRead
    {
        get { return true; }
    }

    public override bool CanSeek
    {
        get { return true; }
    }
    /// <summary>
    /// 
    /// </summary>
    public override bool CanWrite
    {
        get { return true; }
    }

    /// <summary>
    /// 
    /// </summary>
    public override long Length
    {
        get { return 0; }
    }

    /// <summary>
    /// 
    /// </summary>
    public override long Position
    {
        get { return _position; }
        set { _position = value; }
    }

    /// <summary>
    /// 
    /// </summary>
    /// <param name="offset"></param>
    /// <param name="direction"></param>
    /// <returns></returns>
    public override long Seek(long offset, System.IO.SeekOrigin direction)
    {
        return _stream.Seek(offset, direction);
    }

    /// <summary>
    /// 
    /// </summary>
    /// <param name="length"></param>
    public override void SetLength(long length)
    {
        _stream.SetLength(length);
    }

    /// <summary>
    /// 
    /// </summary>
    public override void Close()
    {
        _stream.Close();
    }

    /// <summary>
    /// Override flush by writing out the cached stream data
    /// </summary>
    public override void Flush()
    {

        if (IsCaptured && _cacheStream.Length > 0)
        {
            // Check for transform implementations
            _cacheStream = OnTransformCompleteStream(_cacheStream);
            _cacheStream = OnTransformCompleteStringInternal(_cacheStream);
            
            OnCaptureStream(_cacheStream);
            OnCaptureStringInternal(_cacheStream);

            // write the stream back out if output was delayed
            if (IsOutputDelayed)
                _stream.Write(_cacheStream.ToArray(), 0, (int)_cacheStream.Length);

            // Clear the cache once we've written it out
            _cacheStream.SetLength(0);
        }

        // default flush behavior
        _stream.Flush();
    }

    /// <summary>
    /// 
    /// </summary>
    /// <param name="buffer"></param>
    /// <param name="offset"></param>
    /// <param name="count"></param>
    /// <returns></returns>
    public override int Read(byte[] buffer, int offset, int count)
    {
        return _stream.Read(buffer, offset, count);
    }


    /// <summary>
    /// Overridden to capture output written by ASP.NET into a
    /// cached stream that is written out later when Flush()
    /// is called.
    /// </summary>
    /// <param name="buffer"></param>
    /// <param name="offset"></param>
    /// <param name="count"></param>
    public override void Write(byte[] buffer, int offset, int count)
    {
        if (IsCaptured)
        {
            // copy to holding buffer only - we'll write out later
            _cacheStream.Write(buffer, offset, count);
            _cachePointer += count;
        }

        // just transform this buffer
        if (TransformWrite != null || TransformWriteString != null)
        {
            // hand the handlers only the bytes that are valid for this call
            byte[] chunk = new byte[count];
            Array.Copy(buffer, offset, chunk, 0, count);

            chunk = OnTransformWrite(chunk);
            if (TransformWriteString != null)
                chunk = OnTransformWriteStringInternal(chunk);

            // the transformed chunk may differ in size from the input
            buffer = chunk;
            offset = 0;
            count = chunk.Length;
        }

        if (!IsOutputDelayed)
            _stream.Write(buffer, offset, count);
    }

}

The key features are the events and corresponding OnXXX methods that handle the event hookups, and the Write() and Flush() methods of the stream implementation. All the rest of the members tend to be plain jane passthrough stream implementation code without much consequence.

I do love the way Action<T> and Func<T, TResult> make it so easy to create the event signatures for the various events. Sweet.

A few Things to consider

Performance

Response.Filter is not great for performance in general as it adds another layer of indirection to the ASP.NET output pipeline, and this implementation in particular adds a memory hit as it basically duplicates the response output into the cached memory stream which is necessary since you may have to look at the entire response. If you have large pages in particular this can cause potentially serious memory pressure in your server application. So be careful of wholesale adoption of this (or other) Response.Filters. Make sure to do some performance testing to ensure it’s not killing your app’s performance.

Response.Filter works everywhere

A few questions came up in comments and discussion as to capturing ALL output hitting the site and – yes you can definitely do that by assigning a Response.Filter inside of a module. If you do this however you’ll want to be very careful and decide which content you actually want to capture especially in IIS 7 which passes ALL content – including static images/CSS etc. through the ASP.NET pipeline. So it is important to filter only on what you’re looking for – like the page extension or maybe more effectively the Response.ContentType.
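
As a rough sketch of what such a module might look like: the module name PathFixModule, the ShouldFilter helper and the choice of the PostReleaseRequestState event are my assumptions here, not something prescribed by ASP.NET. It also assumes buffered output, so that nothing has been sent to the client by the time the filter is attached, and it reuses the ResponseFilterStream class from this post.

```csharp
using System;
using System.Web;

// Hypothetical module; ResponseFilterStream is the class shown in this post.
public class PathFixModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        // Assumption: by this stage the handler has set the final ContentType
        // and, with buffered output, nothing has reached the client yet.
        application.PostReleaseRequestState += OnPostReleaseRequestState;
    }

    private static void OnPostReleaseRequestState(object sender, EventArgs e)
    {
        HttpResponse response = ((HttpApplication)sender).Context.Response;

        if (!ShouldFilter(response.ContentType))
            return;

        var filter = new ResponseFilterStream(response.Filter);
        filter.TransformString += FixPaths;
        response.Filter = filter;
    }

    // Only HTML output is worth the buffering cost; images, CSS and
    // scripts pass through untouched.
    public static bool ShouldFilter(string contentType)
    {
        return contentType != null &&
               contentType.StartsWith("text/html", StringComparison.OrdinalIgnoreCase);
    }

    // Same "~/" replacement as the FixPaths method shown earlier.
    private static string FixPaths(string output)
    {
        string path = HttpContext.Current.Request.ApplicationPath;
        if (path == "/")
            path = "";

        return output.Replace("\"~/", "\"" + path + "/")
                     .Replace("'~/", "'" + path + "/");
    }

    public void Dispose() { }
}
```

The module still has to be registered in web.config like any other module. Checking the content type rather than the file extension means extension-less URLs are covered too, at the cost of having to wait until the handler has decided what it is returning.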

Response.Filter Chaining

Originally I thought that filter chaining doesn’t work at all due to a bug in the stream implementation code. But it’s quite possible to assign multiple filters to the Response.Filter property. So the following actually works to both compress the output and apply the transformed content:

WebUtils.GZipEncodePage();

ResponseFilterStream filter = new ResponseFilterStream(Response.Filter);
filter.TransformString += filter_TransformString;
Response.Filter = filter;                

However the following does not work resulting in invalid content encoding errors:

ResponseFilterStream filter = new ResponseFilterStream(Response.Filter);
filter.TransformString += filter_TransformString;
Response.Filter = filter;

WebUtils.GZipEncodePage();

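In other words multiple Response filters can work together, but whether and in which order they can be chained depends on the implementation. Each new filter wraps the one that was assigned before it, so ASP.NET writes to the most recently assigned filter first. When compression is attached first, ResponseFilterStream sits on the outside: it sees the plain text, transforms it, and hands the result to the GZip/Deflate stream. In the reverse order the compression stream receives the output first and ResponseFilterStream is handed compressed bytes, which it then tries to decode as text and modify, so the client ends up with corrupted encoded content. So attaching the compression first works fine, as unintuitive as that may seem.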
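
The effect of the order can be reproduced with plain streams. In this sketch UpperCaseFilter is a made-up, much simplified stand-in for ResponseFilterStream (it transforms on Dispose rather than on Flush), and a MemoryStream stands in for the response output; the chain is built the way Response.Filter assignments build it, with the last assigned filter on the outside.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical stand-in for ResponseFilterStream: buffers all writes and,
// when disposed, upper-cases the text and forwards it to the wrapped stream.
public class UpperCaseFilter : Stream
{
    private readonly Stream _inner;
    private readonly MemoryStream _cache = new MemoryStream();
    private bool _done;

    public UpperCaseFilter(Stream inner) { _inner = inner; }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _cache.Write(buffer, offset, count);
    }

    public override void Flush() { }

    protected override void Dispose(bool disposing)
    {
        if (disposing && !_done)
        {
            _done = true;
            string text = Encoding.UTF8.GetString(_cache.ToArray());
            byte[] bytes = Encoding.UTF8.GetBytes(text.ToUpperInvariant());
            _inner.Write(bytes, 0, bytes.Length);
            _inner.Dispose();
        }
        base.Dispose(disposing);
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

public static class ChainOrderDemo
{
    // Writes 'text' through both filters and returns what reaches the output.
    public static byte[] Run(bool compressionInstalledFirst, string text)
    {
        var output = new MemoryStream();   // stand-in for the response output
        Stream chain = compressionInstalledFirst
            ? (Stream)new UpperCaseFilter(new GZipStream(output, CompressionMode.Compress))
            : new GZipStream(new UpperCaseFilter(output), CompressionMode.Compress);

        byte[] bytes = Encoding.UTF8.GetBytes(text);
        chain.Write(bytes, 0, bytes.Length);
        chain.Dispose();
        return output.ToArray();
    }

    // A gzip stream always starts with the magic bytes 0x1F 0x8B.
    public static bool LooksLikeGZip(byte[] data)
    {
        return data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeGZip(Run(true, "hello")));    // True: text transformed, then compressed
        Console.WriteLine(LooksLikeGZip(Run(false, "hello")));   // False: compressed bytes were treated as text
    }
}
```

In the second case the text filter decodes the compressed bytes as UTF-8, which replaces invalid byte sequences, so even the gzip header does not survive.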

© Rick Strahl, West Wind Technologies, 2005-2010