How to use a regular expression to extract information from an HTML webpage?

Posted by user569248 on Stack Overflow
Published on 2011-01-10T01:05:03Z Indexed on 2011/01/14 18:53 UTC

How can I use a regular expression to extract the answer "Here is the answer" from an HTML webpage like this?

  <b>Last Question:</b>
  <b>Here is the answer</b>
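
For a snippet this small, a regular expression can work. The following is only a minimal sketch: it assumes the label and the answer always sit in two adjacent <b> elements exactly as in the sample, and the names `SAMPLE`, `PATTERN` and `extract_with_regex` are made up for illustration. It will break if the markup gains attributes or nested tags.

```python
import re

# Sample markup copied from the question; a real page would be downloaded first.
SAMPLE = """
  <b>Last Question:</b>
  <b>Here is the answer</b>
"""

# Match the <b> holding the label, then capture the text of the <b> right after it.
PATTERN = re.compile(
    r"<b>\s*Last Question:\s*</b>\s*<b>(.*?)</b>",
    re.IGNORECASE | re.DOTALL,
)


def extract_with_regex(page):
    """Return the text of the <b> following the label, or None if not found."""
    match = PATTERN.search(page)
    return match.group(1).strip() if match else None


print(extract_with_regex(SAMPLE))  # prints: Here is the answer
```

The non-greedy `(.*?)` stops at the first closing </b>, so a later bold element on the page is not swallowed into the match.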

..:: Update ::..

Thanks everybody!

Here is my solution using BeautifulSoup, since I'm working in Python:

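  # Assumes BeautifulSoup is imported and `opener` and `url` are defined earlier.
  response = opener.open(url)
  the_page = response.read()

  soup = BeautifulSoup(the_page)
  # Locate the 'Last Question:' label on the page.
  paraText1 = soup.body.find('div', 'div_id', text = u'Last Question:')

  if paraText1:
    # .next is the element parsed immediately after the match.
    answer = paraText1.next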
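
To try the parser-based idea against the sample markup from the question without installing anything, here is a rough sketch that swaps BeautifulSoup for the standard library's `html.parser`. It is not the code above, just the same approach (find the label, then take the next element) under the assumption that both pieces of text live in <b> tags. `AnswerParser` and `extract_with_parser` are hypothetical names.

```python
from html.parser import HTMLParser


class AnswerParser(HTMLParser):
    """Record the text of the first <b> that follows the 'Last Question:' <b>."""

    def __init__(self):
        super().__init__()
        self.in_b = False        # True while inside a <b> element
        self.label_seen = False  # True once the label <b> has been closed
        self.answer = None       # text of the <b> after the label, if any
        self._buf = []           # text chunks of the current <b>

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.in_b = True
            self._buf = []

    def handle_data(self, data):
        if self.in_b:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != "b" or not self.in_b:
            return
        self.in_b = False
        text = "".join(self._buf).strip()
        if self.label_seen and self.answer is None:
            self.answer = text       # first <b> after the label
        elif text == "Last Question:":
            self.label_seen = True   # the next <b> holds the answer


def extract_with_parser(page):
    """Return the answer text, or None if the label or answer is missing."""
    parser = AnswerParser()
    parser.feed(page)
    parser.close()
    return parser.answer


sample = "<b>Last Question:</b>\n<b>Here is the answer</b>"
print(extract_with_parser(sample))  # prints: Here is the answer
```

Unlike a regular expression, this keeps working when the <b> tags carry attributes or when other markup sits between the label and the answer.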

© Stack Overflow or respective owner
