python embedding - Page 158

Python problem with resize animate GIF

- by gigimon

Hello! I'm want to resize animated GIF with save animate. I'm try use PIL and PythonMagickWand (ImageMagick) and with some GIF's get bad frame. When I'm use PIL, it mar frame in read frame. For test, I'm use this code: from PIL import Image im = Image.open('d:/box_opens_closes.gif') im.seek(im.tell()+1) im.seek(im.tell()+1) im.seek(im.tell()+1) im.show() When I'm use MagickWand with this code: wand = NewMagickWand() MagickReadImage(wand, 'd:/Box_opens_closes.gif') MagickSetLastIterator(wand) length = MagickGetIteratorIndex(wand) MagickSetFirstIterator(wand) for i in range(0, length+1): MagickSetIteratorIndex(wand,i) MagickScaleImage(wand, 87, 58) MagickWriteImages(wand, 'path', 1) My GIF where I'm get bad frame this: test gif In GIF editor software, all freme is ok. Where problem? Thx

Read the article

Efficient way in Python to remove an element from a comma-separated string

- by ensnare

I'm looking for the most efficient way to add an element to a comma-separated string while maintaining alphabetical order for the words: For example: string = 'Apples, Bananas, Grapes, Oranges' subtraction = 'Bananas' result = 'Apples, Grapes, Oranges' Also, a way to do this but while maintaining IDs: string = '1:Apples, 4:Bananas, 6:Grapes, 23:Oranges' subtraction = '4:Bananas' result = '1:Apples, 6:Grapes, 23:Oranges' Sample code is greatly appreciated. Thank you so much.

Read the article

[Tkinter/Python] Different line widths with canvas.create_line?

- by Sam

Does anyone have any idea why I get different line widths on the canvas in the following example? from Tkinter import * bigBoxSize = 150 class cFrame(Frame): def __init__(self, master, cwidth=450, cheight=450): Frame.__init__(self, master, relief=RAISED, height=550, width=600, bg = "grey") self.canvasWidth = cwidth self.canvasHeight = cheight self.canvas = Canvas(self, bg="white", width=cwidth, height=cheight, border =0) self.drawGridLines() self.canvas.pack(side=TOP, pady=20, padx=20) def drawGridLines(self, linewidth = 10): self.canvas.create_line(0, 0, self.canvasWidth, 0, width= linewidth ) self.canvas.create_line(0, 0, 0, self.canvasHeight, width= linewidth ) self.canvas.create_line(0, self.canvasHeight, self.canvasWidth + 2, self.canvasHeight, width= linewidth ) self.canvas.create_line(self.canvasWidth, self.canvasHeight, self.canvasWidth, 1, width= linewidth ) self.canvas.create_line(0, bigBoxSize, self.canvasWidth, bigBoxSize, width= linewidth ) self.canvas.create_line(0, bigBoxSize * 2, self.canvasWidth, bigBoxSize * 2, width= linewidth) root = Tk() C = cFrame(root) C.pack() root.mainloop() It's really frustrating me as I have no idea what's happening. If anyone can help me out then that'd be fantastic. Thanks!

Read the article

Python - werid behavior

- by orokusaki

I've done what I shouldn't have done and written 4 modules (6 hours or so) without running any tests along the way. I have a method inside of /mydir/__init__.py called get_hash(), and a class inside of /mydir/utils.py called SpamClass. /mydir/utils.py imports get_hash() from /mydir/__init__. /mydir/__init__.py imports SpamClass from /mydir/utils.py. Both the class and the method work fine on their own but for some reason if I try to import /mydir/, I get an import error saying "Cannot import name get_hash" from /mydir/__init__.py. The only stack trace is the line saying that __init__.py imported SpamClass. The next line is where the error occurs in in SpamClass when trying to import get_hash. Why is this?

Read the article

Python and classes

- by Artyom

Hello, i have 2 classes. How i call first.TQ in Second ? Without creating object First in Second. class First: def __init__(self): self.str = "" def TQ(self): pass def main(self): T = Second(self.str) # Called here class Second(): def __init__(self): list = {u"RANDINT":first.TQ} # List of funcs maybe called in first ..... ..... return data

Read the article

Python unicode search not giving correct answer

- by user1318912

I am trying to search hindi words contained one line per file in file-1 and find them in lines in file-2. I have to print the line numbers with the number of words found. This is the code: import codecs hypernyms = codecs.open("hindi_hypernym.txt", "r", "utf-8").readlines() words = codecs.open("hypernyms_en2hi.txt", "r", "utf-8").readlines() count_arr = [] for counter, line in enumerate(hypernyms): count_arr.append(0) for word in words: if line.find(word) >=0: count_arr[counter] +=1 for iterator, count in enumerate(count_arr): if count>0: print iterator, ' ', count This is finding some words, but ignoring some others The input files are: File-1: ???? ??????? File-2: ???????, ????-???? ?????-???, ?????-???, ?????_???, ?????_??? ????_????, ????-????, ???????_???? ????-???? This gives output: 0 1 3 1 Clearly, it is ignoring ??????? and searching for ???? only. I have tried with other inputs as well. It only searches for one word. Any idea how to correct this?

Read the article

Dynamic variable name in python

- by PhilGo20

I'd like to call a query with a field name filter that I wont know before run time... Not sure how to construct the variable name ...Or maybe I am tired. field_name = funct() locations = Locations.objects.filter(field_name__lte=arg1) where if funct() returns name would equal to locations = Locations.objects.filter(name__lte=arg1) Not sure how to do that ...

Read the article

tkinter python entry not being displayed

- by user1050619

I have created a Form with labels and entries..but for some reason the entries are not being created, peoplegui.py from tkinter import * from tkinter.messagebox import showerror import shelve shelvename = 'class-shelve' fieldnames = ('name','age','job','pay') def makewidgets(): global entries window = Tk() window.title('People Shelve') form = Frame(window) form.pack() entries = {} for (ix, label) in enumerate(('key',) + fieldnames): lab = Label(form, text=label) ent = Entry(form) lab.grid(row=ix, column=0) lab.grid(row=ix, column=1) entries[label] = ent Button(window, text="Fetch", command=fetchRecord).pack(side=LEFT) Button(window, text="Update", command=updateRecord).pack(side=LEFT) Button(window, text="Quit", command=window.quit).pack(side=RIGHT) return window def fetchRecord(): print('In fetch') def updateRecord(): print('In update') if __name__ == '__main__': window = makewidgets() window.mainloop() When I run it the labels are created but not the entries.

Read the article

Regular expressions in python unicode

- by Remy

I need to remove all the html tags from a given webpage data. I tried this using regular expressions: import urllib2 import re page = urllib2.urlopen("http://www.frugalrules.com") from bs4 import BeautifulSoup, NavigableString, Comment soup = BeautifulSoup(page) link = soup.find('link', type='application/rss+xml') print link['href'] rss = urllib2.urlopen(link['href']).read() souprss = BeautifulSoup(rss) description_tag = souprss.find_all('description') content_tag = souprss.find_all('content:encoded') print re.sub('<[^>]*>', '', content_tag) But the syntax of the re.sub is: re.sub(pattern, repl, string, count=0) So, I modified the code as (instead of the print statement above): for row in content_tag: print re.sub(ur"<[^>]*>",'',row,re.UNICODE But it gives the following error: Traceback (most recent call last): File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module> print re.sub(ur"<[^>]*>",'',row,re.UNICODE) File "C:\Python27\lib\re.py", line 151, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: expected string or buffer What am I doing wrong?

Read the article

Replacing python docstrings

- by tomaz

I have written a epytext to reST markup converter, and now I want to convert all the docstrings in my entire library from epytext to reST format. Is there a smart way to read the all the docstrings in a module and write back the replacements? ps: ast module perhaps?

Read the article

How to read pdf, ppt, xl, doc files content into a string in php/python

- by harsha

Pls suggest me any inbuilt command or package?

Read the article

Python - pickling fails for numpy.void objects

- by I82Much

>>> idmapfile = open("idmap", mode="w") >>> pickle.dump(idMap, idmapfile) >>> idmapfile.close() >>> idmapfile = open("idmap") >>> unpickled = pickle.load(idmapfile) >>> unpickled == idMap False idMap[1] {1537: (552, 1, 1537, 17.793827056884766, 3), 1540: (4220, 1, 1540, 19.31205940246582, 3), 1544: (592, 1, 1544, 18.129131317138672, 3), 1675: (529, 1, 1675, 18.347782135009766, 3), 1550: (4048, 1, 1550, 19.31205940246582, 3), 1424: (1528, 1, 1424, 19.744396209716797, 3), 1681: (1265, 1, 1681, 19.596025466918945, 3), 1560: (3457, 1, 1560, 20.530569076538086, 3), 1690: (477, 1, 1690, 17.395542144775391, 3), 1691: (554, 1, 1691, 13.446117401123047, 3), 1436: (3010, 1, 1436, 19.596025466918945, 3), 1434: (3183, 1, 1434, 19.744396209716797, 3), 1441: (3570, 1, 1441, 20.589576721191406, 3), 1435: (476, 1, 1435, 19.640911102294922, 3), 1444: (527, 1, 1444, 17.98480224609375, 3), 1478: (1897, 1, 1478, 19.596025466918945, 3), 1575: (614, 1, 1575, 19.371648788452148, 3), 1586: (2189, 1, 1586, 19.31205940246582, 3), 1716: (3470, 1, 1716, 19.158674240112305, 3), 1590: (2278, 1, 1590, 19.596025466918945, 3), 1463: (991, 1, 1463, 19.31205940246582, 3), 1594: (1890, 1, 1594, 19.596025466918945, 3), 1467: (1087, 1, 1467, 19.31205940246582, 3), 1596: (3759, 1, 1596, 19.744396209716797, 3), 1602: (3011, 1, 1602, 20.530569076538086, 3), 1547: (490, 1, 1547, 17.994071960449219, 3), 1605: (658, 1, 1605, 19.31205940246582, 3), 1606: (1794, 1, 1606, 16.964881896972656, 3), 1719: (1826, 1, 1719, 19.596025466918945, 3), 1617: (583, 1, 1617, 11.894925117492676, 3), 1492: (3441, 1, 1492, 20.500667572021484, 3), 1622: (3215, 1, 1622, 19.31205940246582, 3), 1628: (2761, 1, 1628, 19.744396209716797, 3), 1502: (1563, 1, 1502, 19.596025466918945, 3), 1632: (1108, 1, 1632, 15.457141876220703, 3), 1468: (3779, 1, 1468, 19.596025466918945, 3), 1642: (3970, 1, 1642, 19.744396209716797, 3), 1518: (612, 1, 1518, 18.570245742797852, 3), 1647: (854, 1, 1647, 16.964881896972656, 3), 1650: (2099, 1, 1650, 20.439058303833008, 3), 1651: (540, 1, 1651, 18.552841186523438, 3), 1653: (613, 1, 1653, 19.237197875976563, 3), 1532: (537, 1, 1532, 18.885730743408203, 3)} >>> unpickled[1] {1537: (64880, 1638, 56700, -1.0808743559293829e+18, 152), 1540: (64904, 1638, 0, 0.0, 0), 1544: (54472, 1490, 0, 0.0, 0), 1675: (6464, 1509, 0, 0.0, 0), 1550: (43592, 1510, 0, 0.0, 0), 1424: (43616, 1510, 0, 0.0, 0), 1681: (0, 0, 0, 0.0, 0), 1560: (400, 152, 400, 2.1299736657737219e-43, 0), 1690: (408, 152, 408, 2.7201111331839077e+26, 34), 1435: (424, 152, 61512, 1.0122952080313192e-39, 0), 1436: (400, 152, 400, 20.250289916992188, 3), 1434: (424, 152, 62080, 1.0122952080313192e-39, 0), 1441: (400, 152, 400, 12.250144958496094, 3), 1691: (424, 152, 42608, 15.813941955566406, 3), 1444: (400, 152, 400, 19.625289916992187, 3), 1606: (424, 152, 42432, 5.2947192852601414e-22, 41), 1575: (400, 152, 400, 6.2537390010262572e-36, 0), 1586: (424, 152, 42488, 1.0122601755697111e-39, 0), 1716: (400, 152, 400, 6.2537390010262572e-36, 0), 1590: (424, 152, 64144, 1.0126357235581501e-39, 0), 1463: (400, 152, 400, 6.2537390010262572e-36, 0), 1594: (424, 152, 32672, 17.002994537353516, 3), 1467: (400, 152, 400, 19.750289916992187, 3), 1596: (424, 152, 7176, 1.0124003054161436e-39, 0), 1602: (400, 152, 400, 18.500289916992188, 3), 1547: (424, 152, 7000, 1.0124003054161436e-39, 0), 1605: (400, 152, 400, 20.500289916992188, 3), 1478: (424, 152, 42256, -6.0222748507426518e+30, 222), 1719: (400, 152, 400, 6.2537390010262572e-36, 0), 1617: (424, 152, 16472, 1.0124283313854301e-39, 0), 1492: (400, 152, 400, 6.2537390010262572e-36, 0), 1622: (424, 152, 35304, 1.0123190301052127e-39, 0), 1628: (400, 152, 400, 6.2537390010262572e-36, 0), 1502: (424, 152, 63152, 19.627988815307617, 3), 1632: (400, 152, 400, 19.375289916992188, 3), 1468: (424, 152, 38088, 1.0124213248931084e-39, 0), 1642: (400, 152, 400, 6.2537390010262572e-36, 0), 1518: (424, 152, 63896, 1.0127436235399031e-39, 0), 1647: (400, 152, 400, 6.2537390010262572e-36, 0), 1650: (424, 152, 53424, 16.752857208251953, 3), 1651: (400, 152, 400, 19.250289916992188, 3), 1653: (424, 152, 50624, 1.0126497365427934e-39, 0), 1532: (400, 152, 400, 6.2537390010262572e-36, 0)} The keys come out fine, the values are screwed up. I tried same thing loading file in binary mode; didn't fix the problem. Any idea what I'm doing wrong? Edit: Here's the code with binary. Note that the values are different in the unpickled object. >>> idmapfile = open("idmap", mode="wb") >>> pickle.dump(idMap, idmapfile) >>> idmapfile.close() >>> idmapfile = open("idmap", mode="rb") >>> unpickled = pickle.load(idmapfile) >>> unpickled==idMap False >>> unpickled[1] {1537: (12176, 2281, 56700, -1.0808743559293829e+18, 152), 1540: (0, 0, 15934, 2.7457842047810522e+26, 108), 1544: (400, 152, 400, 4.9518498821046956e+27, 53), 1675: (408, 152, 408, 2.7201111331839077e+26, 34), 1550: (456, 152, 456, -1.1349175514578289e+18, 152), 1424: (432, 152, 432, 4.5939047815653343e-40, 11), 1681: (408, 152, 408, 2.1299736657737219e-43, 0), 1560: (376, 152, 376, 2.1299736657737219e-43, 0), 1690: (376, 152, 376, 2.1299736657737219e-43, 0), 1435: (376, 152, 376, 2.1299736657737219e-43, 0), 1436: (376, 152, 376, 2.1299736657737219e-43, 0), 1434: (376, 152, 376, 2.1299736657737219e-43, 0), 1441: (376, 152, 376, 2.1299736657737219e-43, 0), 1691: (376, 152, 376, 2.1299736657737219e-43, 0), 1444: (376, 152, 376, 2.1299736657737219e-43, 0), 1606: (25784, 2281, 376, -3.2883343074537754e+26, 34), 1575: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1586: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1716: (24240, 2281, 376, -3.0093091599657311e-35, 26), 1590: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1463: (24240, 2281, 376, 2.1299736657737219e-43, 0), 1594: (24240, 2281, 376, -4123208450048.0, 196), 1467: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1596: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1602: (25784, 2281, 376, -5.9963281433905448e+26, 76), 1547: (25784, 2281, 376, -218106240.0, 139), 1605: (25784, 2281, 376, -3.7138649803377281e+27, 56), 1478: (376, 152, 376, 2.1299736657737219e-43, 0), 1719: (25784, 2281, 376, 2.1299736657737219e-43, 0), 1617: (25784, 2281, 376, -1.4411779941597184e+17, 237), 1492: (25784, 2281, 376, 2.8596493694487798e-30, 80), 1622: (25784, 2281, 376, 184686084096.0, 93), 1628: (1336, 152, 1336, 3.1691839245470052e+29, 179), 1502: (1272, 152, 1272, -5.2042207205116645e-17, 99), 1632: (1208, 152, 1208, 2.1299736657737219e-43, 0), 1468: (1144, 152, 1144, 2.1299736657737219e-43, 0), 1642: (1080, 152, 1080, 2.1299736657737219e-43, 0), 1518: (1016, 152, 1016, 4.0240902787680023e+35, 145), 1647: (952, 152, 952, -985172619034624.0, 237), 1650: (888, 152, 888, 12094787289088.0, 66), 1651: (824, 152, 824, 2.1299736657737219e-43, 0), 1653: (760, 152, 760, 0.00018310768064111471, 238), 1532: (696, 152, 696, 8.8978061885676389e+26, 125)} OK I've isolated the problem, but don't know why it's so. First, apparently what I'm pickling are not tuples (though they look like it), but instead numpy.void types. Here is a series to illustrate the problem. first = run0.detections[0] >>> first (1, 19, 1578, 82.637763977050781, 1) >>> type(first) <type 'numpy.void'> >>> firstTuple = tuple(first) >>> theFile = open("pickleTest", "w") >>> pickle.dump(first, theFile) >>> theTupleFile = open("pickleTupleTest", "w") >>> pickle.dump(firstTuple, theTupleFile) >>> theFile.close() >>> theTupleFile.close() >>> first (1, 19, 1578, 82.637763977050781, 1) >>> firstTuple (1, 19, 1578, 82.637764, 1) >>> theFile = open("pickleTest", "r") >>> theTupleFile = open("pickleTupleTest", "r") >>> unpickledTuple = pickle.load(theTupleFile) >>> unpickledVoid = pickle.load(theFile) >>> type(unpickledVoid) <type 'numpy.void'> >>> type(unpickledTuple) <type 'tuple'> >>> unpickledTuple (1, 19, 1578, 82.637764, 1) >>> unpickledTuple == firstTuple True >>> unpickledVoid == first False >>> unpickledVoid (7936, 1705, 56700, -1.0808743559293829e+18, 152) >>> first (1, 19, 1578, 82.637763977050781, 1)

Read the article

python cairoplot store previous readings..

- by krisdigitx

hi, i am using cairoplot, to make graphs, however the file from where i am reading the data is growing huge and its taking a long time to process the graph is there any real-time way to produce cairo graph, or at least store the previous readings..like rrd. -krisdigitx

Read the article

Exposing boost::scoped_ptr in boost::python

- by Rupert Jones

Hello, I am getting a compile error, saying that the copy constructor of the scoped_ptr is private with the following code snippet: class a {}; struct s { boost::scoped_ptr<a> p; }; BOOST_PYTHON_MODULE( module ) { class_<s>( "s" ); } This example works with a shared_ptr though. It would be nice, if anyone knows the answer. Thanks

Read the article

Check if command had some trouble in Python?

- by Shady

I have this command h = urllib.urlopen("http://google.com/") How can I check if it retrieves me some error, internet down or something I don't expect? For example, something like if error print 'ERROR' Thank you

Read the article

Python, add trailing slash to directory string, os independently

- by Horace Ho

How can I add a trailing slash (/ for *nix, \ for win32) to a directory string, if the tailing slash is not already there? Thanks!

Read the article

Python: Unpack arbitary length bits for database storage

- by sberry2A

I have a binary data format consisting of 18,000+ packed int64s, ints, shorts, bytes and chars. The data is packed to minimize it's size, so they don't always use byte sized chunks. For example, a number whose min and max value are 31, 32 respectively might be stored with a single bit where the actual value is bitvalue + min, so 0 is 31 and 1 is 32. I am looking for the most efficient way to unpack all of these for subsequent processing and database storage. Right now I am able to read any value by using either struct.unpack, or BitBuffer. I use struct.unpack for any data that starts on a bit where (bit-offset % 8 == 0 and data-length % 8 == 0) and I use BitBuffer for anything else. I know the offset and size of every packed piece of data, so what is going to be the fasted way to completely unpack them? Many thanks.

Read the article

varargs in lambda functions in Python

- by brain_damage

Is it possible a lambda function to have variable number of arguments? For example, I want to write a metaclass, which creates a method for every method of some other class and this newly created method returns the opposite value of the original method and has the same number of arguments. And I want to do this with lambda function. How to pass the arguments? Is it possible? class Negate(type): def __new__(mcs, name, bases, _dict): extended_dict = _dict.copy() for (k, v) in _dict.items(): if hasattr(v, '__call__'): extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw) return type.__new__(mcs, name, bases, extended_dict) class P(metaclass=Negate): def __init__(self, a): self.a = a def yes(self): return True def maybe(self, you_can_chose): return you_can_chose But the result is totally wrong: >>>p = P(0) >>>p.yes() True >>>p.not_yes() # should be False Traceback (most recent call last): File "<pyshell#150>", line 1, in <module> p.not_yes() File "C:\Users\Nona\Desktop\p10.py", line 51, in <lambda> extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw) TypeError: __init__() takes exactly 2 positional arguments (1 given) >>>p.maybe(True) True >>>p.not_maybe(True) #should be False True

Read the article

Python RegExp exception

- by Jasie

How do I split on all nonalphanumeric characters, EXCEPT the apostrophe? re.split('\W+',text) works, but will also split on apostrophes. How do I add an exception to this rule? Thanks!

Read the article

Python string formatting too slow

- by wich

I use the following code to log a map, it is fast when it only contains zeroes, but as soon as there is actual data in the map it becomes unbearably slow... Is there any way to do this faster? log_file = open('testfile', 'w') for i, x in ((i, start + i * interval) for i in range(length)): log_file.write('%-5d %8.3f %13g %13g %13g %13g %13g %13g\n' % (i, x, map[0][i], map[1][i], map[2][i], map[3][i], map[4][i], map[5][i]))

Read the article

indexing error in list in python

- by mekasperasky

B=l.append((l[i]+A+B)) l is a list here and i am trying to append into it more value for it to act as an array . But its still giving me error like list index out of range . How to get rid of it ?

Read the article

strip spaces in python.

- by Richard

ok I know that this should be simple... anyways say: line = "$W5M5A,100527,142500,730301c44892fd1c,2,686.5 4,333.96,0,0,28.6,123,75,-0.4,1.4*49" I want to strip out the spaces. I thought you would just do this line = line.strip() but now line is still '$W5M5A,100527,142500,730301c44892fd1c,2,686.5 4,333.96,0,0,28.6,123,75,-0.4,1.4*49' instead of '$W5M5A,100527,142500,730301c44892fd1c,2,686.54,333.96,0,0,28.6,123,75,-0.4,1.4*49' any thoughts?

Read the article

strip tags python

- by user283405

i want the following functionality. input : this is test bold text normal text expected output: this is test normal text the tags can be changed.

Read the article

Dynamic Operator Overloading on dict classes in Python

- by Ishpeck

I have a class that dynamically overloads basic arithmetic operators like so... import operator class IshyNum: def __init__(self, n): self.num=n self.buildArith() def arithmetic(self, other, o): return o(self.num, other) def buildArith(self): map(lambda o: setattr(self, "__%s__"%o,lambda f: self.arithmetic(f, getattr(operator, o))), ["add", "sub", "mul", "div"]) if __name__=="__main__": number=IshyNum(5) print number+5 print number/2 print number*3 print number-3 But if I change the class to inherit from the dictionary (class IshyNum(dict):) it doesn't work. I need to explicitly def __add__(self, other) or whatever in order for this to work. Why?

Read the article

Efficient file buffering & scanning methods for large files in python

- by eblume

The description of the problem I am having is a bit complicated, and I will err on the side of providing more complete information. For the impatient, here is the briefest way I can summarize it: What is the fastest (least execution time) way to split a text file in to ALL (overlapping) substrings of size N (bound N, eg 36) while throwing out newline characters. I am writing a module which parses files in the FASTA ascii-based genome format. These files comprise what is known as the 'hg18' human reference genome, which you can download from the UCSC genome browser (go slugs!) if you like. As you will notice, the genome files are composed of chr[1..22].fa and chr[XY].fa, as well as a set of other small files which are not used in this module. Several modules already exist for parsing FASTA files, such as BioPython's SeqIO. (Sorry, I'd post a link, but I don't have the points to do so yet.) Unfortunately, every module I've been able to find doesn't do the specific operation I am trying to do. My module needs to split the genome data ('CAGTACGTCAGACTATACGGAGCTA' could be a line, for instance) in to every single overlapping N-length substring. Let me give an example using a very small file (the actual chromosome files are between 355 and 20 million characters long) and N=8 import cStringIO example_file = cStringIO.StringIO("""\ header CAGTcag TFgcACF """) for read in parse(example_file): ... print read ... CAGTCAGTF AGTCAGTFG GTCAGTFGC TCAGTFGCA CAGTFGCAC AGTFGCACF The function that I found had the absolute best performance from the methods I could think of is this: def parse(file): size = 8 # of course in my code this is a function argument file.readline() # skip past the header buffer = '' for line in file: buffer += line.rstrip().upper() while len(buffer) = size: yield buffer[:size] buffer = buffer[1:] This works, but unfortunately it still takes about 1.5 hours (see note below) to parse the human genome this way. Perhaps this is the very best I am going to see with this method (a complete code refactor might be in order, but I'd like to avoid it as this approach has some very specific advantages in other areas of the code), but I thought I would turn this over to the community. Thanks! Note, this time includes a lot of extra calculation, such as computing the opposing strand read and doing hashtable lookups on a hash of approximately 5G in size. Post-answer conclusion: It turns out that using fileobj.read() and then manipulating the resulting string (string.replace(), etc.) took relatively little time and memory compared to the remainder of the program, and so I used that approach. Thanks everyone!

Search Results

Search found 14007 results on 561 pages for 'python embedding'.

Page 158/561 | < Previous Page | 154 155 156 157 158 159 160 161 162 163 164 165 | Next Page >

- by gigimon

- by ensnare

- by Sam

- by orokusaki

- by Artyom

- by user1318912

- by PhilGo20

- by user1050619

- by Remy

- by tomaz

- by harsha

- by I82Much

- by krisdigitx

- by Rupert Jones

- by Shady

- by Horace Ho

- by sberry2A

- by brain_damage

- by Jasie

- by wich

- by mekasperasky

- by Richard

- by user283405

- by Ishpeck

- by eblume

< Previous Page | 154 155 156 157 158 159 160 161 162 163 164 165 | Next Page >