Sorting HTML table (with anchor tags and data in cells) in Python

Posted by AJ on Stack Overflow
Published on 2010-04-05T14:49:08Z Indexed on 2010/04/05 14:53 UTC

I need to sort an HTML table with the following structure in Python.

<table>
    <tr>
        <td><a href="#">ABCD</a></td>
        <td>A23BND</td>
        <td><a title="ABCD">345345</a></td>
    </tr>
    <tr>
        <td><a href="#">EFG</a></td>
        <td>Add4D</td>
        <td><a title="EFG">3432</a></td>
    </tr>
    <tr>
        <td><a href="#">HG</a></td>
        <td>GJJ778</td>
        <td><a title="HG">2341333</a></td>
    </tr>

</table>

I am doing something like this:

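# tree is the parsed <table> element; colName is a tag name or path (e.g. "td")
container = tree.findall("tr")
strOut = ""
data = []
for elem in container:
    # findtext() returns only the text directly inside the first match,
    # not text nested in child elements such as <a>
    key = elem.findtext(colName)
    data.append((key, elem))

# Sorts the (key, row) tuples by key
data.sort()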

The problem is that it sorts by the text directly inside the <td>. I want to sort by the anchor text, not the href.

What can I do to achieve that? Thanks a lot.
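
One possible approach, sketched here under some assumptions: the markup is well-formed (every <a> has a closing </a>) so that xml.etree.ElementTree can parse it, and the column is chosen by position rather than by name. The helper names cell_text and sort_rows, and the parameters col_index and numeric, are made up for this illustration. The idea is to build the sort key from all text inside the cell, including text nested in an anchor, using itertext().

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the table in the question (rows deliberately unsorted).
SAMPLE = """<table>
    <tr>
        <td><a href="#">HG</a></td>
        <td>GJJ778</td>
        <td><a title="HG">2341333</a></td>
    </tr>
    <tr>
        <td><a href="#">ABCD</a></td>
        <td>A23BND</td>
        <td><a title="ABCD">345345</a></td>
    </tr>
    <tr>
        <td><a href="#">EFG</a></td>
        <td>Add4D</td>
        <td><a title="EFG">3432</a></td>
    </tr>
</table>"""


def cell_text(row, col_index):
    """Return all text in the col_index-th <td>, including text inside <a>."""
    cell = row.findall("td")[col_index]
    return "".join(cell.itertext()).strip()


def sort_rows(table, col_index, numeric=False):
    """Reorder the <tr> children of table in place by the given column."""
    rows = table.findall("tr")

    def key(row):
        text = cell_text(row, col_index)
        return int(text) if numeric else text

    # Sorting with key= avoids comparing Element objects when keys tie.
    rows.sort(key=key)

    # Detach every row, then re-attach them in sorted order.
    for row in rows:
        table.remove(row)
    for row in rows:
        table.append(row)
    return table


table = ET.fromstring(SAMPLE)

sort_rows(table, 0)  # sort by the anchor text in the first column
first_col = [cell_text(r, 0) for r in table.findall("tr")]

sort_rows(table, 2, numeric=True)  # sort the third column as integers
third_col = [cell_text(r, 2) for r in table.findall("tr")]

print(first_col)
print(third_col)
```

If the real HTML is not well-formed, ElementTree will refuse to parse it, and a tolerant HTML parser would be needed in front of the same key function.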

© Stack Overflow or respective owner