How to read a file with variable multi-row data in Python

Posted by dr.bunsen on Stack Overflow See other posts from Stack Overflow or by dr.bunsen
Published on 2011-11-13T17:36:04Z Indexed on 2011/11/13 17:51 UTC
Read the original article Hit count: 244

Filed under:
|

I have a file that is about 100Mb that looks like this:

#meta data 1    
skadjflaskdjfasljdfalskdjfl
sdkfjhasdlkgjhsdlkjghlaskdj
asdhfk
#meta data 2
jflaksdjflaksjdflkjasdlfjas
ldaksjflkdsajlkdfj
#meta data 3
alsdkjflasdjkfglalaskdjf

This file contains one row of meta data that corresponds to several, variable length data containing only alpha-numeric characters. What is the best way to read this data into a simple list like this:

data = [[#meta data 1, skadjflaskdjfasljdfalskdjflsdkfjhasdlkgjhsdlkjghlaskdjasdhfk],
       [#meta data 2, jflaksdjflaksjdflkjasdlfjasldaksjflkdsajlkdfj],
       [#meta data 3, alsdkjflasdjkfglalaskdjf]]

My initial idea was to use the read() method to read the whole file into memory and then use regular expressions to parse the data into the desired format. Is there a better more pythonic way? All metadata lines start with an octothorpe and all data lines are all alpha-numeric. Thanks!

© Stack Overflow or respective owner

Related posts about python

Related posts about read