I'm trying to make a web scraper that will parse a web-page of publications and extract the authors. The skeletal structure of the web-page is the following:

    <html>
      <body>
        <div id="container">
          <div id="contents">
            <table>
              <tbody>
                <tr>
                  <td class="author">####I want whatever is located here ###</td>
                </tr>
              </tbody>
            </table>
          </div>
        </div>
      </body>
    </html>

I've been trying to use BeautifulSoup and lxml thus far to accomplish this task, but I'm not sure how to handle the two div tags and td tag because they have attributes. In addition to this, I'm not sure whether I should rely more on BeautifulSoup or lxml or a combination of both. What should I do?
At the moment, my code looks like this:
    import re
    import urllib2,sys
    import lxml
    from lxml import etree
    from lxml.html.soupparser import fromstring
    from lxml.etree import tostring
    from lxml.cssselect import CSSSelector
    from BeautifulSoup import BeautifulSoup, NavigableString
    address='http://www.example.com/'
    html = urllib2.urlopen(address).read()
    soup = BeautifulSoup(html)
    html=soup.prettify()
    html=html.replace('&nbsp;', ' ')
    html=html.replace('&iacute;','í')
    root=fromstring(html)
I realize that a lot of the import statements may be redundant, but I just copied whatever I currently had in my source file.
EDIT: I suppose that I didn't make this quite clear, but I have multiple tags in the page that I want to scrape.
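
To make the goal concrete, here is a minimal sketch of the result I'm after. It deliberately uses neither BeautifulSoup nor lxml, only Python 3's standard-library `html.parser`, so it is a stand-in for the technique rather than a fix for my code above (which is Python 2). `AuthorExtractor`, `extract_authors` and the `sample` markup are hypothetical names and data made up for illustration. It matches cells on the `class` attribute alone and does not check that they sit inside the two `div` tags.

```python
from html.parser import HTMLParser


class AuthorExtractor(HTMLParser):
    """Collect the text of every <td class="author"> cell (hypothetical helper)."""

    def __init__(self):
        super().__init__()       # convert_charrefs defaults to True
        self.authors = []        # one entry per matching cell
        self._in_author = False  # True while inside a matching <td>
        self._buffer = []        # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and "author" in classes:
            self._in_author = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_author:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_author:
            self.authors.append("".join(self._buffer).strip())
            self._in_author = False


def extract_authors(html):
    """Return the text of all author cells found in an HTML string."""
    parser = AuthorExtractor()
    parser.feed(html)
    parser.close()
    return parser.authors


# Made-up sample with two rows, mirroring the skeleton in the question.
sample = """
<html><body>
<div id="container"><div id="contents">
<table><tbody>
<tr><td class="author">Jane Doe</td></tr>
<tr><td class="author">Luis Garc&iacute;a</td></tr>
</tbody></table>
</div></div>
</body></html>
"""

print(extract_authors(sample))
```

Because the parser converts character references itself, entities such as `&iacute;` come back as plain characters, so in this sketch the manual `replace` calls are not needed. What I'd like to know is the equivalent, idiomatic way to do this with BeautifulSoup or lxml.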