Python json memory bloat
Posted by Anoop on Stack Overflow
Published on 2011-06-30T14:39:55Z
import json
import time
from itertools import count

def keygen(size):
    # Yield zero-padded keys: '00000000000000000001', '00000000000000000002', ...
    for i in count(1):
        s = str(i)
        yield '0' * (size - len(s)) + s

def jsontest(num):
    keys = keygen(20)
    kvjson = json.dumps(dict((keys.next(), '0' * 200) for i in range(num)))
    kvpairs = json.loads(kvjson)
    del kvpairs  # Not required. Just to check if it makes any difference.
    print 'load completed'

jsontest(500000)

while 1:
    time.sleep(1)
Linux top indicates that the Python process holds ~450 MB of RAM after the 'jsontest' function completes. If the call to 'json.loads' is omitted, the issue is not observed. A gc.collect() after the function runs does release the memory.
It looks like the memory is not being held in any caches or by Python's internal memory allocator, since an explicit call to gc.collect() does release it.
Is this happening because the threshold for garbage collection (700, 10, 10) was never reached?
I did put some code after jsontest to simulate crossing the threshold, but it didn't help.
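For reference, the collector's generation thresholds and the effect of a forced collection can be inspected like this. This is a minimal sketch using a much smaller payload than the original test (the key/value sizes are kept, the count is reduced); the ~450 MB figure above is from the original run and is not reproduced here:

```python
import gc
import json

# CPython's cyclic collector defaults to thresholds of (700, 10, 10).
print(gc.get_threshold())

# Build and parse a smaller version of the same payload, then drop the reference.
kvjson = json.dumps(dict((str(i).zfill(20), '0' * 200) for i in range(10000)))
kvpairs = json.loads(kvjson)
del kvpairs

# An explicit collection returns the number of unreachable objects found.
# Note that even after collection, freed memory may be retained by the
# process allocator rather than returned to the OS immediately.
collected = gc.collect()
print(collected)
```

This runs under both Python 2.7 and Python 3; it only shows how to query and trigger the collector, not the memory-growth behaviour itself.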