Search Results

Search found 1 results on 1 pages for 'carmao'.

Page 1/1 | 1 

  • use proxy in python to fetch a webpage

    - by carmao
    I am trying to write a function in Python to use a public anonymous proxy and fetch a webpage, but I got a rather strange error. The code (I have Python 2.4): import urllib2 def get_source_html_proxy(url, pip, timeout): # timeout in seconds (maximum number of seconds willing for the code to wait in # case there is a proxy that is not working, then it gives up) proxy_handler = urllib2.ProxyHandler({'http': pip}) opener = urllib2.build_opener(proxy_handler) opener.addheaders = [('User-agent', 'Mozilla/5.0')] urllib2.install_opener(opener) req=urllib2.Request(url) sock=urllib2.urlopen(req) timp=0 # a counter that is going to measure the time until the result (webpage) is # returned while 1: data = sock.read(1024) timp=timp+1 if len(data) < 1024: break timpLimita=50000000 * timeout if timp==timpLimita: # 5 millions is about 1 second break if timp==timpLimita: print IPul + ": Connection is working, but the webpage is fetched in more than 50 seconds. This proxy returns the following IP: " + str(data) return str(data) else: print "This proxy " + IPul + "= good proxy. " + "It returns the following IP: " + str(data) return str(data) # Now, I call the function to test it for one single proxy (IP:port) that does not support user and password (a public high anonymity proxy) #(I put a proxy that I know is working - slow, but is working) rez=get_source_html_proxy("http://www.whatismyip.com/automation/n09230945.asp", "93.84.221.248:3128", 50) print rez The error: Traceback (most recent call last): File "./public_html/cgi-bin/teste5.py", line 43, in ? rez=get_source_html_proxy("http://www.whatismyip.com/automation/n09230945.asp", "93.84.221.248:3128", 50) File "./public_html/cgi-bin/teste5.py", line 18, in get_source_html_proxy sock=urllib2.urlopen(req) File "/usr/lib64/python2.4/urllib2.py", line 130, in urlopen return _opener.open(url, data) File "/usr/lib64/python2.4/urllib2.py", line 358, in open response = self._open(req, data) File "/usr/lib64/python2.4/urllib2.py", line 376, in _open '_open', req) File "/usr/lib64/python2.4/urllib2.py", line 337, in _call_chain result = func(*args) File "/usr/lib64/python2.4/urllib2.py", line 573, in lambda r, proxy=url, type=type, meth=self.proxy_open: \ File "/usr/lib64/python2.4/urllib2.py", line 580, in proxy_open if '@' in host: TypeError: iterable argument required I do not know why the character "@" is an issue (I have no such in my code. Should I have?) Thanks in advance for your valuable help.

    Read the article

1