# how to implement a really efficient bitvector sorting in python

Posted
by xiao
on Stack Overflow
See other posts from Stack Overflow
or by xiao

Published on 2010-06-07T17:22:45Z
Indexed on
2010/06/07
17:42 UTC

Read the original article
Hit count: 110

Hello guys!

Actually this is an interesting topic from programming pearls, sorting 10 digits telephone numbers in a limited memory with an efficient algorithm. You can find the whole story here

What I am interested in is just how fast the implementation could be in python. I have done a naive implementation with the module bitvector. The code is as following:

```
from BitVector import BitVector
import timeit
import random
import time
import sys
def sort(input_li):
return sorted(input_li)
def vec_sort(input_li):
bv = BitVector( size = len(input_li) )
for i in input_li:
bv[i] = 1
res_li = []
for i in range(len(bv)):
if bv[i]:
res_li.append(i)
return res_li
if __name__ == "__main__":
test_data = range(int(sys.argv[1]))
print 'test_data size is:', sys.argv[1]
random.shuffle(test_data)
start = time.time()
sort(test_data)
elapsed = (time.time() - start)
print "sort function takes " + str(elapsed)
start = time.time()
vec_sort(test_data)
elapsed = (time.time() - start)
print "sort function takes " + str(elapsed)
start = time.time()
vec_sort(test_data)
elapsed = (time.time() - start)
print "vec_sort function takes " + str(elapsed)
```

I have tested from array size 100 to 10,000,000 in my macbook(2GHz Intel Core 2 Duo 2GB SDRAM), the result is as following:

- test_data size is: 1000
- sort function takes 0.000274896621704
vec_sort function takes 0.00383687019348

test_data size is: 10000

- sort function takes 0.00380706787109
vec_sort function takes 0.0371489524841

test_data size is: 100000

- sort function takes 0.0520560741425
vec_sort function takes 0.374383926392

test_data size is: 1000000

- sort function takes 0.867373943329
vec_sort function takes 3.80475401878

test_data size is: 10000000

- sort function takes 12.9204008579
- vec_sort function takes 38.8053860664

What disappoints me is that even when the test_data size is 100,000,000, the sort function is still faster than vec_sort. Is there any way to accelerate the vec_sort function?

© Stack Overflow or respective owner