Search Results

Search found 1 results on 1 pages for 'vkazanov'.

Page 1/1 | 1 

  • Unexpected performance curve from CPython merge sort

    - by vkazanov
    I have implemented a naive merge sorting algorithm in Python. Algorithm and test code is below: import time import random import matplotlib.pyplot as plt import math from collections import deque def sort(unsorted): if len(unsorted) <= 1: return unsorted to_merge = deque(deque([elem]) for elem in unsorted) while len(to_merge) > 1: left = to_merge.popleft() right = to_merge.popleft() to_merge.append(merge(left, right)) return to_merge.pop() def merge(left, right): result = deque() while left or right: if left and right: elem = left.popleft() if left[0] > right[0] else right.popleft() elif not left and right: elem = right.popleft() elif not right and left: elem = left.popleft() result.append(elem) return result LOOP_COUNT = 100 START_N = 1 END_N = 1000 def test(fun, test_data): start = time.clock() for _ in xrange(LOOP_COUNT): fun(test_data) return time.clock() - start def run_test(): timings, elem_nums = [], [] test_data = random.sample(xrange(100000), END_N) for i in xrange(START_N, END_N): loop_test_data = test_data[:i] elapsed = test(sort, loop_test_data) timings.append(elapsed) elem_nums.append(len(loop_test_data)) print "%f s --- %d elems" % (elapsed, len(loop_test_data)) plt.plot(elem_nums, timings) plt.show() run_test() As much as I can see everything is OK and I should get a nice N*logN curve as a result. But the picture differs a bit: Things I've tried to investigate the issue: PyPy. The curve is ok. Disabled the GC using the gc module. Wrong guess. Debug output showed that it doesn't even run until the end of the test. Memory profiling using meliae - nothing special or suspicious. ` I had another implementation (a recursive one using the same merge function), it acts the similar way. The more full test cycles I create - the more "jumps" there are in the curve. So how can this behaviour be explained and - hopefully - fixed? UPD: changed lists to collections.deque UPD2: added the full test code UPD3: I use Python 2.7.1 on a Ubuntu 11.04 OS, using a quad-core 2Hz notebook. I tried to turn of most of all other processes: the number of spikes went down but at least one of them was still there.

    Read the article

1