Scraping a website that sets a cookie with JavaScript, using C#

Posted by erwin on Stack Overflow
Published on 2010-04-09T20:01:39Z Indexed on 2010/04/09 20:03 UTC

Hi all,

I want to scrape some data from the following site: http://www.conrad.nl/modelspoor

This is my function:

    public string SreenScrape(string urlBase, string urlPath)
    {
        CookieContainer cookieContainer = new CookieContainer();
        HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(urlBase + urlPath);
        httpWebRequest.CookieContainer = cookieContainer;
        httpWebRequest.UserAgent = "Mozilla/6.0 (Windows; U; Windows NT 7.0; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.9 (.NET CLR 3.5.30729)";
        WebResponse webResponse = httpWebRequest.GetResponse();
        string result = new System.IO.StreamReader(webResponse.GetResponseStream(), Encoding.Default).ReadToEnd();
        webResponse.Close();

        if (result.Contains("<frame src="))
        {
            Regex metaregex = new Regex("http:[a-z:/._0-9!?=A-Z&]*",RegexOptions.Multiline);
            result = result.Replace("\r\n", "");
            Match m = metaregex.Match(result);
            string key = m.Groups[0].Value;

            // Fetch the first URL found in the frameset page, reusing the same cookies.
            if (m.Success)
            {
                HttpWebRequest redirectHttpWebRequest = (HttpWebRequest)WebRequest.Create(key);
                redirectHttpWebRequest.CookieContainer = cookieContainer;
                webResponse = redirectHttpWebRequest.GetResponse();
                string redirectResponse = new System.IO.StreamReader(webResponse.GetResponseStream(), Encoding.Default).ReadToEnd();
                webResponse.Close();
                return redirectResponse;
            }

        }
        return result;
    }

But when I do this, the string I get back is an error page from the website saying that it requires JavaScript.

Does anybody know how to fix this?
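
One possible reading of the error: `HttpWebRequest` only downloads the page and never runs its scripts, so a cookie that the site sets from JavaScript never reaches the `CookieContainer`. The following is a minimal sketch of adding such a cookie by hand. It assumes the site really does gate its content on a script-set cookie, which the post does not confirm. `ManualCookie` and `WithCookie` are hypothetical names, and the cookie name and value are placeholders that would have to be read from the script in the error page.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of the original function.
public static class ManualCookie
{
    // Builds a container holding one cookie, as if the page's script had set it.
    // The name and value are assumptions supplied by the caller.
    public static CookieContainer WithCookie(Uri site, string name, string value)
    {
        var container = new CookieContainer();
        // Path "/" makes the cookie apply to every page on the host.
        container.Add(new Cookie(name, value, "/", site.Host));
        return container;
    }
}
```

The returned container could then be assigned to `httpWebRequest.CookieContainer` in place of the empty one, for example `ManualCookie.WithCookie(new Uri(urlBase), "jsenabled", "1")`, where `jsenabled` is a made-up name. If the page needs more than a cookie from its scripts, this approach will not be enough.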

Kind regards,
Erwin
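
On the frame-following step in the function above: the regular expression there matches any `http:` URL in the page, not only frame sources, and it cannot pick up relative paths. Below is a minimal sketch of a narrower extraction, assuming the frames are declared as plain `<frame ... src="...">` tags with quoted attribute values. `FrameSources` and `Extract` are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original function.
public static class FrameSources
{
    // Matches a <frame> tag (not <frameset>) and captures the quoted src value.
    private static readonly Regex FrameSrc = new Regex(
        @"<frame\b[^>]*?\bsrc\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    // Returns every frame source in the HTML as an absolute URI,
    // resolving relative paths against the page that contained them.
    public static List<Uri> Extract(string html, Uri pageUri)
    {
        var sources = new List<Uri>();
        foreach (Match match in FrameSrc.Matches(html))
        {
            sources.Add(new Uri(pageUri, match.Groups[1].Value));
        }
        return sources;
    }
}
```

Each returned `Uri` can be passed to `WebRequest.Create` with the same `CookieContainer`, so cookies from the first response are sent along with the frame requests.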

© Stack Overflow or respective owner
