Database Compression in Python
Posted by user551832 on Stack Overflow
Published on 2010-12-24T20:23:46Z
Indexed on 2010/12/24 21:54 UTC
python
I have hourly log files with lines like:
user1:joined
user2:log out
user1:added pic
user1:added comment
user3:joined
I want to compress all of these flat files down to a single file. The logs cover around 30 million users, and I only want to keep the most recent entry for each user across all the logs.
The end result should be a log that looks like:
user1:added comment
user2:log out
user3:joined
My first attempt, on a small scale, was to just use a dict, like:
log['user1'] = "added comment"
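Spelled out a little more, the dict approach might look like the sketch below. It assumes every line has the shape `user:action`, that the hourly files are passed in oldest first, and that the function names and file paths are purely illustrative.

```python
def latest_per_user(lines):
    """Return {user: last_action} for an iterable of 'user:action' lines."""
    latest = {}
    for line in lines:
        # Split on the first colon only, so actions may contain colons.
        user, sep, action = line.rstrip("\n").partition(":")
        if not sep:
            continue  # assumption: lines without a colon are skipped
        latest[user] = action  # later lines overwrite earlier ones
    return latest


def compress_logs(paths, out_path):
    """Merge several log files (oldest first) into one file of latest entries."""
    latest = {}
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            latest.update(latest_per_user(fh))
    with open(out_path, "w", encoding="utf-8") as out:
        for user, action in latest.items():
            out.write(f"{user}:{action}\n")
    return len(latest)
```

Everything ends up in one in-memory dict, so the peak memory use grows with the number of distinct users rather than with the number of log lines.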
Will a dict of 30 million key/value pairs have a giant memory footprint? Or should I use something like SQLite to store them, and then just write the contents of the SQLite table back out to a file?
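For comparison, the SQLite variant could be sketched roughly as follows, using the standard-library `sqlite3` module. The table name `latest`, the column names, and the function names are assumptions made for illustration. Passing a real file path instead of `:memory:` keeps the data on disk rather than in RAM.

```python
import sqlite3


def load_latest(lines, db_path=":memory:"):
    """Store the last action per user in a SQLite table and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS latest (user TEXT PRIMARY KEY, action TEXT)"
    )
    rows = (
        (user, action)
        for user, sep, action in (ln.rstrip("\n").partition(":") for ln in lines)
        if sep  # assumption: lines without a colon are skipped
    )
    with conn:  # one transaction for the whole batch, committed on success
        # The primary key on user makes a later row replace an earlier one.
        conn.executemany(
            "INSERT OR REPLACE INTO latest (user, action) VALUES (?, ?)", rows
        )
    return conn


def dump_latest(conn, out):
    """Write the table back out as 'user:action' lines, sorted by user."""
    for user, action in conn.execute(
        "SELECT user, action FROM latest ORDER BY user"
    ):
        out.write(f"{user}:{action}\n")
```

As with the dict, the lines have to be fed in chronological order for the last write to be the latest entry.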
© Stack Overflow or respective owner