How to convert an HTML table to an array in Python

Posted by user345660 on Stack Overflow
Published on 2010-05-20T02:29:58Z Indexed on 2010/05/20 2:30 UTC

I have an HTML document, and I want to pull the tables out of it and return them as arrays. I'm picturing two functions: one that finds all the HTML tables in a document, and a second one that turns an HTML table into a 2-dimensional array.

Something like this:

    htmltables = get_tables(htmldocument)
    for table in htmltables:
        array = make_array(table)

There are two catches:

1. The number of tables varies from day to day.
2. The tables have all kinds of weird extra formatting, like bold and blink tags, randomly thrown in.
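For illustration, here is one way the two functions could look using only the standard library's html.parser module. This is a minimal sketch, not a definitive answer: the TableCollector class is a hypothetical helper, the function names are taken from the pseudocode in the question, and it assumes the tables are not nested inside one another. Formatting tags such as bold or blink are simply ignored, because only the text found inside td and th cells is kept, and the number of tables does not matter because every table encountered is collected.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Hypothetical helper: gathers the text of every cell of every table.

    Assumes tables are not nested inside one another.
    """

    def __init__(self):
        super().__init__()
        self.tables = []        # one entry per <table>: a list of rows
        self._in_table = False  # True between <table> and </table>
        self._in_cell = False   # True between <td>/<th> and its end tag

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
            self._in_table = True
        elif tag == "tr" and self._in_table:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self._in_table and self.tables[-1]:
            self.tables[-1][-1].append("")
            self._in_cell = True
        # Any other tag (b, blink, font, ...) is ignored on purpose.

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._in_cell = False
        if tag == "table":
            self._in_table = False

    def handle_data(self, data):
        # Keep text only while inside a cell; append to the current cell.
        if self._in_cell:
            self.tables[-1][-1][-1] += data


def get_tables(htmldocument):
    """Return every table in the document as rows of raw cell text."""
    parser = TableCollector()
    parser.feed(htmldocument)
    parser.close()
    return parser.tables


def make_array(table):
    """Turn one collected table into a 2-D list with tidy whitespace."""
    return [[" ".join(cell.split()) for cell in row] for row in table]


sample = (
    "<p>intro</p>"
    "<table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td><b>Apples</b></td><td> <blink>3</blink>\n</td></tr></table>"
)
for table in get_tables(sample):
    print(make_array(table))  # prints [['Name', 'Qty'], ['Apples', '3']]
```

Rows with differing numbers of cells are kept as they are, so the result can be ragged; colspan and rowspan attributes are not expanded in this sketch.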

Thanks!

© Stack Overflow or respective owner
