Squid proxy not serving modified html content

Posted by Matthew on Stack Overflow See other posts from Stack Overflow or by Matthew
Published on 2010-03-24T08:29:48Z Indexed on 2010/03/24 8:33 UTC
Read the original article Hit count: 587

Filed under:
|
|
|
|

I'm trying to use squid to modify the page content of web page requests. I followed the upside-down-ternet tutorial which showed instructions for how to flip images on pages.

I need to change the actual html of the page. I've been trying to do the same thing as in the tutorial, but instead of editing the image I'm trying to edit the html page. Below is a php script I'm using to try to do it.

All jpg images get flipped, but the content on the page does not get edited. The edited index.html files written contain the edited content, but the pages the users receive don't contain the edited content.

#!/usr/bin/php
<?php
$temp = array();
while ( $input = fgets(STDIN) ) {
    $micro_time = microtime();

    // Split the output (space delimited) from squid into an array.
    $temp = split(' ', $input);

    //Flip jpg images, this works correctly
    if (preg_match("/.*\.jpg/i", $temp[0])) {
        system("/usr/bin/wget -q -O /var/www/cache/$micro_time.jpg ". $temp[0]);
        system("/usr/bin/mogrify -flip /var/www/cache/$micro_time.jpg");
        echo "http://127.0.0.1/cache/$micro_time.jpg\n";
    }

    //Don't edit files that are obviously not html. $temp[0] contains url of file to get
    elseif (preg_match("/(jpg|png|gif|css|js|\(|\))/i", $temp[0], $matches)) {
        echo $input;
    }   

    //Otherwise, could be html (e.g. `wget http://www.google.com` downloads index.html)
    else{ 
        $time = time() . microtime();       //For unique directory names
        $time = preg_replace("/ /", "", $time); //Simplify things by removing the spaces
        mkdir("/var/www/cache/". $time);    //Create unique folder
        system("/usr/bin/wget -q --directory-prefix=\"/var/www/cache/$time/\" ". $temp[0]);
        $filename = system("ls /var/www/cache/$time/");     //Get filename of downloaded file

        //File is html, edit the content (this does not work)
        if(preg_match("/.*\.html/", $filename)){

            //Get the html file contents  
            $contentfh = fopen("/var/www/cache/$time/". $filename, 'r');
            $content = fread($contentfh, filesize("/var/www/cache/$time/". $filename));
            fclose($contentfh);

            //Edit the html file contents
            $content = preg_replace("/<\/body>/i", "<!-- content served by proxy --></body>", $content);

            //Write the edited file
            $contentfh = fopen("/var/www/cache/$time/". $filename, 'w');
            fwrite($contentfh, $content);
            fclose($contentfh);

            //Return the edited page
            echo "http://127.0.0.1/cache/$time/$filename\n";
        }               
        //Otherwise file is not html, don't edit
        else{
            echo $input;
        }
    }
}
?>

© Stack Overflow or respective owner

Related posts about squid

Related posts about modify