Search Results

Search found 15224 results on 609 pages for 'parallel python'.

Page 164/609 | < Previous Page | 160 161 162 163 164 165 166 167 168 169 170 171  | Next Page >

  • problem with hierarchical clustering in Python

    - by user248237
    I am doing a hierarchical clustering a 2 dimensional matrix by correlation distance metric (i.e. 1 - Pearson correlation). My code is the following (the data is in a variable called "data"): from hcluster import * Y = pdist(data, 'correlation') cluster_type = 'average' Z = linkage(Y, cluster_type) dendrogram(Z) The error I get is: ValueError: Linkage 'Z' contains negative distances. What causes this error? The matrix "data" that I use is simply: [[ 156.651968 2345.168618] [ 158.089968 2032.840106] [ 207.996413 2786.779081] [ 151.885804 2286.70533 ] [ 154.33665 1967.74431 ] [ 150.060182 1931.991169] [ 133.800787 1978.539644] [ 112.743217 1478.903191] [ 125.388905 1422.3247 ]] I don't see how pdist could ever produce negative numbers when taking 1 - pearson correlation. Any ideas on this? thank you.

    Read the article

  • searching a list of tuples in python

    - by hdx
    So I have a list of tuple like: [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")] I want this list for a tuple whose number value is equal to something. So that if I do search(53) it will return 2 Is is an easy way to do that?

    Read the article

  • XML - python prints extra lines

    - by horse
    `from xml import xpath from xml.dom import minidom xmldata = minidom.parse('model.xml').documentElement for maks in xpath.Evaluate('/cacti/results/maks/text()', xmldata): print maks.nodeValue ` and I get result: 85603399.14 398673062.66 95785523.81 But I needed to be: 85603399.14 NO SPACE 398673062.66 NO SPACE 95785523.81 Can somebody help me, i new at programing :( ?

    Read the article

  • python list directory , subdirectory and files

    - by thomytheyon
    Hi Alls, I'm trying to make a script to list all directory, subdirectory, and files in a given directory. I tried this : import sys,os root="/home/patate/directory/" path=os.path.join(root,"targetdirectory") for r,d,f in os.walk(path): for file in f: print os.path.join(root,file) Unfortunatly it doesn't work properly, i get all the files, but not them complete path. like : if the dir struct would be : /home/patate/directory/targetdirectory/123/456/789/file.txt It would print: /home/patate/directory/targetdirectory/file.txt What i need is the first result, any help would be greatly appreciated ! Thanks

    Read the article

  • How to remove commas etc from a matrix in python

    - by robert
    say ive got a matrix that looks like: [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] how can i make it on seperate lines: [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] and then remove commas etc: 0 0 0 0 0 And also to make it blank instead of 0's, so that numbers can be put in later, so in the end it will be like: _ 1 2 _ 1 _ 1 (spaces not underscores) thanks

    Read the article

  • Python meta programming question

    - by orokusaki
    I have a meta class MyClass which adds an attribute to a class based on some of the properties of the class's methods. When I subclass MyClass I want it to still have that attribute, and append to the attribute's value based on the sub-class's methods (ie, sub-classing extends the same attribute that the base's meta creates.). Can this be done via the bases argument passed to __new__(cls, name, bases, dct)?

    Read the article

  • Distributed and/or Parallel SSIS processing

    - by Jeff
    Background: Our company hosts SaaS DSS applications, where clients provide us data Daily and/or Weekly, which we process & merge into their existing database. During business hours, load in the servers are pretty minimal as it's mostly users running simple pre-defined queries via the website, or running drill-through reports that mostly hit the SSAS OLAP cube. I manage the IT Operations Team, and so far this has presented an interesting "scaling" issue for us. For our daily-refreshed clients, the server is only "busy" for about 4-6 hrs at night. For our weekly-refresh clients, the server is only "busy" for maybe 8-10 hrs per week! We've done our best to use some simple methods of distributing the load by spreading the daily clients evenly among the servers such that we're not trying to process daily clients back-to-back over night. But long-term this scaling strategy creates two notable issues. First, it's going to consume a pretty immense amount of hardware that sits idle for large periods of time. Second, it takes significant Production Support over-head to basically "schedule" the ETL such that they don't over-lap, and move clients/schedules around if they out-grow the resources on a particular server or allocated time-slot. As the title would imply, one option we've tried is running multiple SSIS packages in parallel, but in most cases this has yielded VERY inconsistent results. The most common failures are DTExec, SQL, and SSAS fighting for physical memory and throwing out-of-memory errors, and ETLs running 3,4,5x longer than expected. So from my practical experience thus far, it seems like running multiple ETL packages on the same hardware isn't a good idea, but I can't be the first person that doesn't want to scale multiple ETLs around manual scheduling, and sequential processing. One option we've considered is virtualizing the servers, which obviously doesn't give you any additional resources, but moves the resource contention onto the hypervisor, which (from my experience) seems to manage simultaneous CPU/RAM/Disk I/O a little more gracefully than letting DTExec, SQL, and SSAS battle it out within Windows. Question to the forum: So my question to the forum is, are we missing something obvious here? Are there tools out there that can help manage running multiple SSIS packages on the same hardware? Would it be more "efficient" in terms of parallel execution if instead of running DTExec, SQL, and SSAS same machine (with every machine running that configuration), we run in pairs of three machines with SSIS running on one machine, SQL on another, and SSAS on a third? Obviously that would only make sense if we could process more than the three ETL we were able to process on the machine independently. Another option we've considered is completely re-architecting our SSIS package to have one "master" package for all clients that attempts to intelligently chose a server based off how "busy" it already is in terms of CPU/Memory/Disk utilization, but that would be a herculean effort, and seems like we're trying to reinvent something that you would think someone would sell (although I haven't had any luck finding it). So in summary, are we missing an obvious solution for this, and does anyone know if any tools (for free or for purchase, doesn't matter) that facilitate running multiple SSIS ETL packages in parallel and on multiple servers? (What I would call a "queue & node based" system, but that's not an official term). Ultimately VMWare's Distributed Resource Scheduler addresses this as you simply run a consistent number of clients per VM that you know will never conflict scheduleing-wise, then leave it up to VMWare to move the VMs around to balance out hardware usage. I'm definitely not against using VMWare to do this, but since we're a 100% Microsoft app stack, it seems like -someone- out there would have solved this problem at the application layer instead of the hypervisor layer by checking on resource utilization at the OS, SQL, SSAS levels. I'm open to ANY discussion on this, and remember no suggestion is too crazy or radical! :-) Right now, VMWare is the only option we've found to get away from "manually" balancing our resources, so any suggestions that leave us on a pure Microsoft stack would be great. Thanks guys, Jeff

    Read the article

  • Python ZSI : error while serializing an object ?

    - by KaluSingh Gabbar
    this is the code, I get error that it can not serialize reference (sumReq) sumReqClass = GED("http://www.some-service.com/sample", "getSumRequest").pyclass sumReq = sumReqClass() rq = GetSumSoapIn() sum._sumReqObj = sumReq rs=proxy.GetSum(rq, soapheaders=[credentials]) I get error : TypeError: bad usage, failed to serialize element reference (http://www.some-service.com/sample, getSumRequest), in: /SOAP-ENV:Body

    Read the article

  • what does "from MODULE import _" do in python?

    - by Paul
    Hi all, In the Getting things gnome code base I stumbled upon this import statement from GTG import _ and have no idea what it means, never seen this in the documentation and a quick so / google search didn't turn anything up. Thank you all in advance Paul

    Read the article

  • Exporting dates properly formatted on Google Appengine in Python

    - by Chris M
    I think this is right but google appengine seems to get to a certain point and cop-out; Firstly is this code actually right; and secondly is there away to skip the record if it cant output (like an ignore errors and continue)? class TrackerExporter(bulkloader.Exporter): def __init__(self): bulkloader.Exporter.__init__(self, 'SearchRec', [('__key__', lambda key:key.name(), None), ('WebSite', str, None), ('DateStamp', lambda x: datetime.datetime.strptime(x, '%d-%m-%Y').date(), None), ('IP', str, None), ('UserAgent', str, None)]) Thanks

    Read the article

  • Generating Mouse-Keyboard combination events in python

    - by freakazo
    I want to be able to do a combination of keypresses and mouseclicks simultaneously, as in for example Control+LeftClick At the moment I am able to do Control and then a left click with the following code: import win32com, win32api, win32con def CopyBox( x, y): time.sleep(.2) wsh = win32com.client.Dispatch("WScript.Shell") wsh.SendKeys("^") win32api.SetCursorPos((x,y)) win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN, x, y, 0, 0) win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP, x, y, 0, 0) What this does is press control on the keyboard, then it clicks. I need it to keep the controll pressed longer and return while it's still pressed to continue running the code. Is there a maybe lower level way of saying press the key and then later in the code tell it to lift up the key such as like what the mouse is doing?

    Read the article

  • Optimizing python code performance when importing zipped csv to a mongo collection

    - by mark
    I need to import a zipped csv into a mongo collection, but there is a catch - every record contains a timestamp in Pacific Time, which must be converted to the local time corresponding to the (longitude,latitude) pair found in the same record. The code looks like so: def read_csv_zip(path, timezones): with ZipFile(path) as z, z.open(z.namelist()[0]) as input: csv_rows = csv.reader(input) header = csv_rows.next() check,converters = get_aux_stuff(header) for csv_row in csv_rows: if check(csv_row): row = { converter[0]:converter[1](value) for converter, value in zip(converters, csv_row) if allow_field(converter) } ts = row['ts'] lng, lat = row['loc'] found_tz_entry = timezones.find_one(SON({'loc': {'$within': {'$box': [[lng-tz_lookup_radius, lat-tz_lookup_radius],[lng+tz_lookup_radius, lat+tz_lookup_radius]]}}})) if found_tz_entry: tz_name = found_tz_entry['tz'] local_ts = ts.astimezone(timezone(tz_name)).replace(tzinfo=None) row['tz'] = tz_name else: local_ts = (ts.astimezone(utc) + timedelta(hours = int(lng/15))).replace(tzinfo = None) row['local_ts'] = local_ts yield row def insert_documents(collection, source, batch_size): while True: items = list(itertools.islice(source, batch_size)) if len(items) == 0: break; try: collection.insert(items) except: for item in items: try: collection.insert(item) except Exception as exc: print("Failed to insert record {0} - {1}".format(item['_id'], exc)) def main(zip_path): with Connection() as connection: data = connection.mydb.data timezones = connection.timezones.data insert_documents(data, read_csv_zip(zip_path, timezones), 1000) The code proceeds as follows: Every record read from the csv is checked and converted to a dictionary, where some fields may be skipped, some titles be renamed (from those appearing in the csv header), some values may be converted (to datetime, to integers, to floats. etc ...) For each record read from the csv, a lookup is made into the timezones collection to map the record location to the respective time zone. If the mapping is successful - that timezone is used to convert the record timestamp (pacific time) to the respective local timestamp. If no mapping is found - a rough approximation is calculated. The timezones collection is appropriately indexed, of course - calling explain() confirms it. The process is slow. Naturally, having to query the timezones collection for every record kills the performance. I am looking for advises on how to improve it. Thanks. EDIT The timezones collection contains 8176040 records, each containing four values: > db.data.findOne() { "_id" : 3038814, "loc" : [ 1.48333, 42.5 ], "tz" : "Europe/Andorra" } EDIT2 OK, I have compiled a release build of http://toblerity.github.com/rtree/ and configured the rtree package. Then I have created an rtree dat/idx pair of files corresponding to my timezones collection. So, instead of calling collection.find_one I call index.intersection. Surprisingly, not only there is no improvement, but it works even more slowly now! May be rtree could be fine tuned to load the entire dat/idx pair into RAM (704M), but I do not know how to do it. Until then, it is not an alternative. In general, I think the solution should involve parallelization of the task. EDIT3 Profile output when using collection.find_one: >>> p.sort_stats('cumulative').print_stats(10) Tue Apr 10 14:28:39 2012 ImportDataIntoMongo.profile 64549590 function calls (64549180 primitive calls) in 1231.257 seconds Ordered by: cumulative time List reduced from 730 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.012 0.012 1231.257 1231.257 ImportDataIntoMongo.py:1(<module>) 1 0.001 0.001 1230.959 1230.959 ImportDataIntoMongo.py:187(main) 1 853.558 853.558 853.558 853.558 {raw_input} 1 0.598 0.598 370.510 370.510 ImportDataIntoMongo.py:165(insert_documents) 343407 9.965 0.000 359.034 0.001 ImportDataIntoMongo.py:137(read_csv_zip) 343408 2.927 0.000 287.035 0.001 c:\python27\lib\site-packages\pymongo\collection.py:489(find_one) 343408 1.842 0.000 274.803 0.001 c:\python27\lib\site-packages\pymongo\cursor.py:699(next) 343408 2.542 0.000 271.212 0.001 c:\python27\lib\site-packages\pymongo\cursor.py:644(_refresh) 343408 4.512 0.000 253.673 0.001 c:\python27\lib\site-packages\pymongo\cursor.py:605(__send_message) 343408 0.971 0.000 242.078 0.001 c:\python27\lib\site-packages\pymongo\connection.py:871(_send_message_with_response) Profile output when using index.intersection: >>> p.sort_stats('cumulative').print_stats(10) Wed Apr 11 16:21:31 2012 ImportDataIntoMongo.profile 41542960 function calls (41542536 primitive calls) in 2889.164 seconds Ordered by: cumulative time List reduced from 778 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.028 0.028 2889.164 2889.164 ImportDataIntoMongo.py:1(<module>) 1 0.017 0.017 2888.679 2888.679 ImportDataIntoMongo.py:202(main) 1 2365.526 2365.526 2365.526 2365.526 {raw_input} 1 0.766 0.766 502.817 502.817 ImportDataIntoMongo.py:180(insert_documents) 343407 9.147 0.000 491.433 0.001 ImportDataIntoMongo.py:152(read_csv_zip) 343406 0.571 0.000 391.394 0.001 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:384(intersection) 343406 379.957 0.001 390.824 0.001 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:435(_intersection_obj) 686513 22.616 0.000 38.705 0.000 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:451(_get_objects) 343406 6.134 0.000 33.326 0.000 ImportDataIntoMongo.py:162(<dictcomp>) 346 0.396 0.001 30.665 0.089 c:\python27\lib\site-packages\pymongo\collection.py:240(insert) EDIT4 I have parallelized the code, but the results are still not very encouraging. I am convinced it could be done better. See my own answer to this question for details.

    Read the article

  • Python 2.6 + PIL + Google App Engine issue

    - by mswallace
    I am using OS X 1.6 snow leopard and I successfully got PIL installed. I am able to open terminal and type import Image without any errors. However, When using app engine I get Image error still saying that PIL is not installed. I am wondering if any of you have an thoughts as to how I can resolve this issue. -Matthew

    Read the article

  • Need help with re for matching and getting the value python

    - by laspal
    Hi, Need help regarding re. file = 'file No.WR79050107006 from files' So what I am trying to do is validate if file string contains WR + 11 digit. result = re.match('^(\S| )*(?P<sr>(\d){11})(\S| )*', file) Its validate only 11 digit but not WR before it. How can I do that? Using re after matching how can I get the match value ( WR79050107006) I can do string find index = file.find('file No.') and then get the value of next 13 char. thanks

    Read the article

  • Python: How do sets work

    - by Guy
    I have a list of objects which I want to turn into a set. My objects contain a few fields that some of which are o.id and o.area. I want two objects to be equal if these two fields are the same. ie: o1==o2 if and only if o1.area==o2.area and o1.id==o2.id. I tried over-writing __eq__ and __cmp__ but I get the error: TypeError: unhashable instance. What should I over-write?

    Read the article

  • Problem with dictionary key in Python

    - by Hossein
    Hi all, For some project I have to make a dictionary in which the keys are urls,among which I have this url: http://www.microsoft.com/isapi/redir.dll prd=windows&sbp=mediaplayer&ar=Media&sba=Guide&pver=6.2 the url is too long to fit in here I guess in one single line. there is a space between .dll and prd. I can build a dictionary without any errors this url is also a key. but for some reason when I want to extract the values associated to this key(url). I cannot, I get and error "error key:...." Does someone know what is wrong with this url? Are dictionary keys sensitive to some stuff? thanks

    Read the article

  • Python decorator question

    - by nsharish
    decorator 1: def dec(f): def wrap(obj, *args, **kwargs): f(obj, *args,**kwargs) return wrap decorator 2: class dec: def __init__(self, f): self.f = f def __call__(self, obj, *args, **kwargs): self.f(obj, *args, **kwargs) A sample class, class Test: @dec def disp(self, *args, **kwargs): print(*args,**kwargs) The follwing code works with decorator 1 but not with decorator 2. a = Test() a.disp("Message") I dont understand why decorator 2 is not working here. Can someone help me with this?

    Read the article

  • Python 3.1 - Memory Error during sampling of a large list

    - by jimy
    The input list can be more than 1 million numbers. When I run the following code with smaller 'repeats', its fine; def sample(x): length = 1000000 new_array = random.sample((list(x)),length) return (new_array) def repeat_sample(x): i = 0 repeats = 100 list_of_samples = [] for i in range(repeats): list_of_samples.append(sample(x)) return(list_of_samples) repeat_sample(large_array) However, using high repeats such as the 100 above, results in MemoryError. Traceback is as follows; Traceback (most recent call last): File "C:\Python31\rnd.py", line 221, in <module> STORED_REPEAT_SAMPLE = repeat_sample(STORED_ARRAY) File "C:\Python31\rnd.py", line 129, in repeat_sample list_of_samples.append(sample(x)) File "C:\Python31\rnd.py", line 121, in sample new_array = random.sample((list(x)),length) File "C:\Python31\lib\random.py", line 309, in sample result = [None] * k MemoryError I am assuming I'm running out of memory. I do not know how to get around this problem. Thank you for your time!

    Read the article

  • Exposing classes inside modules within a Python package directly in the package's namespace

    - by Richard Waite
    I have a wxPython application with the various GUI classes in their own modules in a package called gui. With this setup, importing the main window would be done as follows: from gui.mainwindow import MainWindow This looked messy to me so I changed the __init__.py file for the gui package to import the class directly into the package namespace: from mainwindow import MainWindow This allows me to import the main window like this: from gui import MainWindow This looks better to me aesthetically and I think it also more closely represents what I'm doing (importing the MainWindow class from the gui "namespace"). The reason I made the gui package was to keep all the GUI stuff together. I could have just as easily made a single gui module and stuffed all the GUI classes in it, but I think that would have been unmanageable. The package now appears to work like a module, but allows me to separate the classes into their own modules (along with helper functions, etc.). This whole thing strikes me as somewhat petty, I just thought I'd throw it out there to see what others think about the idea.

    Read the article

  • Python combinations no repeat by constraint

    - by user2758113
    I have a tuple of tuples (Name, val 1, val 2, Class) tuple = (("Jackson",10,12,"A"), ("Ryan",10,20,"A"), ("Michael",10,12,"B"), ("Andrew",10,20,"B"), ("McKensie",10,12,"C"), ("Alex",10,20,"D")) I need to return all combinations using itertools combinations that do not repeat classes. How can I return combinations that dont repeat classes. For example, the first returned statement would be: tuple0, tuple2, tuple4, tuple5 and so on.

    Read the article

  • filtering elements from list of lists in Python?

    - by user248237
    I want to filter elements from a list of lists, and iterate over the elements of each element using a lambda. For example, given the list: a = [[1,2,3],[4,5,6]] suppose that I want to keep only elements where the sum of the list is greater than N. I tried writing: filter(lambda x, y, z: x + y + z >= N, a) but I get the error: <lambda>() takes exactly 3 arguments (1 given) How can I iterate while assigning values of each element to x, y, and z? Something like zip, but for arbitrarily long lists. thanks, p.s. I know I can write this using: filter(lambda x: sum(x)..., a) but that's not the point, imagine that these were not numbers but arbitrary elements and I wanted to assign their values to variable names.

    Read the article

< Previous Page | 160 161 162 163 164 165 166 167 168 169 170 171  | Next Page >