Search Results

Search found 1 result on 1 page for 'badc0re'.


  • Url open encoding

    - by badc0re
    I have the following code for urllib and BeautifulSoup:

        getSite = urllib.urlopen(pageName)  # open current site
        getSitesoup = BeautifulSoup(getSite.read())  # reading the site content
        print getSitesoup.originalEncoding
        for value in getSitesoup.find_all('link'):  # extract all <a> tags
            defLinks.append(value.get('href'))

    The result of it:

        /usr/lib/python2.6/site-packages/bs4/dammit.py:231: UnicodeWarning: Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
          "Some characters could not be decoded, and were "

    And when I try to read the site I get:

        ?7?e????0*"I??G?H????F??????9-??????;??E?YÞBs????????????4i???)?????^W?????`w?Ke??%??*9?.'OQB???V??@?????]???(P??^??q?$?S5???tT*?Z
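
    Unreadable bytes like the output above are often what a compressed (gzip or deflate) response body looks like when it is parsed without being decompressed first. The post does not confirm that this is the cause here, so the following is only a minimal sketch under that assumption. The helper name `decode_body` and its parameters are hypothetical and are not part of urllib or BeautifulSoup.

```python
import gzip
import io
import zlib


def decode_body(raw, content_encoding=None):
    """Return the decompressed bytes of a raw HTTP response body.

    `raw` is the byte string returned by read(); `content_encoding` is the
    value of the Content-Encoding response header, if one was sent.
    """
    encoding = (content_encoding or "").lower()

    # Gzip streams start with the magic bytes 0x1f 0x8b, so also check for
    # them in case the server compressed the body without announcing it.
    if encoding == "gzip" or raw[:2] == b"\x1f\x8b":
        return gzip.GzipFile(fileobj=io.BytesIO(raw)).read()

    if encoding == "deflate":
        try:
            # Standard case: deflate data wrapped in a zlib header.
            return zlib.decompress(raw)
        except zlib.error:
            # Some servers send raw deflate data without the zlib header.
            return zlib.decompress(raw, -zlib.MAX_WBITS)

    # Not compressed (or an encoding this sketch does not handle).
    return raw
```

    In the code from the post, this would presumably be used as something like `decode_body(getSite.read(), getSite.info().get('Content-Encoding'))`, with the result passed to `BeautifulSoup` in place of the raw `getSite.read()`. If the body turns out not to be compressed, the helper returns it unchanged and the encoding problem lies elsewhere.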
