Search Results

Search found 13534 results on 542 pages for 'python 3 3'.

Page 389/542 | < Previous Page | 385 386 387 388 389 390 391 392 393 394 395 396  | Next Page >

  • Optimization of Function with Dictionary and Zip()

    - by eWizardII
    Hello, I have the following function: def filetxt(): word_freq = {} lvl1 = [] lvl2 = [] total_t = 0 users = 0 text = [] for l in range(0,500): # Open File if os.path.exists("C:/Twitter/json/user_" + str(l) + ".json") == True: with open("C:/Twitter/json/user_" + str(l) + ".json", "r") as f: text_f = json.load(f) users = users + 1 for i in range(len(text_f)): text.append(text_f[str(i)]['text']) total_t = total_t + 1 else: pass # Filter occ = 0 import string for i in range(len(text)): s = text[i] # Sample string a = re.findall(r'(RT)',s) b = re.findall(r'(@)',s) occ = len(a) + len(b) + occ s = s.encode('utf-8') out = s.translate(string.maketrans("",""), string.punctuation) # Create Wordlist/Dictionary word_list = text[i].lower().split(None) for word in word_list: word_freq[word] = word_freq.get(word, 0) + 1 keys = word_freq.keys() numbo = range(1,len(keys)+1) WList = ', '.join(keys) NList = str(numbo).strip('[]') WList = WList.split(", ") NList = NList.split(", ") W2N = dict(zip(WList, NList)) for k in range (0,len(word_list)): word_list[k] = W2N[word_list[k]] for i in range (0,len(word_list)-1): lvl1.append(word_list[i]) lvl2.append(word_list[i+1]) I have used the profiler to find that it seems the greatest CPU time is spent on the zip() function and the join and split parts of the code, I'm looking to see if there is any way I have overlooked that I could potentially clean up the code to make it more optimized, since the greatest lag seems to be in how I am working with the dictionaries and the zip() function. Any help would be appreciated thanks!

    Read the article

  • Parse raw HTTP Headers

    - by Cev
    I have a string of raw HTTP and I would like to represent the fields in an object. Is there any way to parse the individual headers from an HTTP string? 'GET /search?sourceid=chrome&ie=UTF-8&q=ergterst HTTP/1.1\r\nHost: www.google.com\r\nConnection: keep-alive\r\nAccept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\nUser-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.45 Safari/534.13\r\nAccept-Encoding: gzip,deflate,sdch\r\nAvail-Dictionary: GeNLY2f-\r\nAccept-Language: en-US,en;q=0.8\r\n [...]'

    Read the article

  • What is an efficient way to erase substrings?

    - by Legend
    I have a long string and a set of <end-index, string> list like the following: long_sentence = "This is a long long long long sentence" indices = [[6, "is"], [8, "is a"], [18, "long"], [23, "long"]] An element 6, "is" indicates that 6 is the end index of the word "is" in the string. I want to get the following string in the end: >> print long_sentence This .... long ......... long sentence" I tried an approach like this: temp = long_sentence for i in indices: temp = temp[:int(i[0]) - len(i[1])] + '.'*(len(i[1])+1) + temp[i[0]+1:] While this seems to be working, it is taking exceptionally long time (more than 6 hours on 5000 strings inside a 300 MB file). Is there a way to speed this up?

    Read the article

  • how to set Content-Type automatically when i download the data that i uploaded.

    - by zjm1126
    this is my code : import os from google.appengine.ext import webapp from google.appengine.ext.webapp import template from google.appengine.ext.webapp.util import run_wsgi_app from google.appengine.ext import db #from login import htmlPrefix,get_current_user class MyModel(db.Model): blob = db.BlobProperty() class BaseRequestHandler(webapp.RequestHandler): def render_template(self, filename, template_args=None): if not template_args: template_args = {} path = os.path.join(os.path.dirname(__file__), 'templates', filename) self.response.out.write(template.render(path, template_args)) class upload(BaseRequestHandler): def get(self): self.render_template('index.html',) def post(self): file=self.request.get('file') obj = MyModel() obj.blob = db.Blob(file.encode('utf8')) obj.put() self.response.out.write('upload ok') class download(BaseRequestHandler): def get(self): #id=self.request.get('id') o = MyModel.all().get() #self.response.out.write(''.join('%s: %s <br/>' % (a, getattr(o, a)) for a in dir(o))) self.response.out.write(o) application = webapp.WSGIApplication( [ ('/?', upload), ('/download',download), ], debug=True ) def main(): run_wsgi_app(application) if __name__ == "__main__": main() my index.html is : <form action="/" method="post"> <input type="file" name="file" /> <input type="submit" /> </form> and it show : <__main__.MyModel object at 0x02506830> but ,i don't want to see this , i want to download it , how to change my code to run, thanks updated it is ok now : class upload(BaseRequestHandler): def get(self): self.render_template('index.html',) def post(self): file=self.request.get('file') obj = MyModel() obj.blob = db.Blob(file) obj.put() self.response.out.write('upload ok') class download(BaseRequestHandler): def get(self): #id=self.request.get('id') o = MyModel.all().order('-').get() #self.response.out.write(''.join('%s: %s <br/>' % (a, getattr(o, a)) for a in dir(o))) self.response.headers['Content-Type'] = "image/png" self.response.out.write(o.blob) and new question is : if you upload a 'png' file ,it will show successful , but ,when i upload a rar file ,i will run error , so how to set Content-Type automatically , and what is the Content-Type of the 'rar' file thanks

    Read the article

  • problem on strings, tuple strings

    - by suresh
    Write a function, called constrainedMatchPair which takes three arguments: a tuple representing starting points for the first substring, a tuple representing starting points for the second substring, and the length of the first substring. The function should return a tuple of all members (call it n) of the first tuple for which there is an element in the second tuple (call it k) such that n+m+1 = k, where m is the length of the first substring. Complete the definition def constrainedMatchPair(firstMatch,secondMatch,length):

    Read the article

  • Need to get pixel averages of a vector sitting on a bitmap...

    - by user346511
    I'm currently involved in a hardware project where I am mapping triangular shaped LED to traditional bitmap images. I'd like to overlay a triangle vector onto an image and get the average pixel data within the bounds of that vector. However, I'm unfamiliar with the math needed to calculate this. Does anyone have an algorithm or a link that could send me in the right direction? I'm not even clear what this type of math is called. I've created a basic image of what I'm trying to capture here: http://imgur.com/Isjip.gif

    Read the article

  • Paramiko ssh output stops at --more--

    - by Anesh
    The output stops printing at --more-- any idea how to get the end of the output >>> import paramiko >>> ssh = paramiko.SSHClient() >>> ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) >>> conn=ssh.connect("ipaddress",username="user", password="pass") >>> channel = ssh.invoke_shell() >>> channel.send("en\n") 3 >>> channel.send("password\n") 9 >>> channel.send("show security local-user-list\n") 30 >>> results = '' >>> channel.send("\n") 1 >>> results += channel.recv(5000) >>> print results bluecoat>en Password: bluecoat#show security local-user-list Default List: local_user_database Append users loaded from file to default list: false local_user_database Lockout parameters: Max failed attempts: 60 Lockout duration: 3600 Reset interval: 7200 Users: Groups: admin_local Lockout parameters: Max failed attempts: 60 Lockout duration: 3600 Reset interval: 7200 Users: <username> Hashed Password: Enabled: true Groups: <username> Hashed Password: Enabled: true **--More--** As you can see above the output stops printing at --more-- any idea how to get the output to print till the end.

    Read the article

  • scrapping blog contents

    - by goh
    Hi lads, After obtaining the urls for various blogspots, tumblr and wordpress pages, I faced some problems processing the html pages. The thing is, i wish to distinguish between the content,title and date for each blog post. I might be able to get the date through regex, but there are so many custom scripts people are using now that the html classes and structure is so different. Does anyone has a solution that may help?

    Read the article

  • Lucene: Fastest way to return the document occurance of a phrase?

    - by dont say the kid's name
    Hi Guys, I am trying to use Lucene (actually PyLucene!) to find out how many documents contain my exact phrase. My code currently looks like this... but it runs rather slow. Does anyone know a faster way to return document counts? phraseList = ["some phrase 1", "some phrase 2"] #etc, a list of phrases... countsearcher = IndexSearcher(SimpleFSDirectory(File(STORE_DIR)), True) analyzer = StandardAnalyzer(Version.LUCENE_CURRENT) for phrase in phraseList: query = QueryParser(Version.LUCENE_CURRENT, "contents", analyzer).parse("\"" + phrase + "\"") scoreDocs = countsearcher.search(query, 200).scoreDocs print "count is: " + str(len(scoreDocs))

    Read the article

  • Qt: How to autoexpand parents of a new QTreeView item when using a QSortFilterProxyModel

    - by taynaron
    I'm making an app wherein the user can add new data to a QTreeModel at any time. The parent under which it gets placed is automatically expanded to show the new item: self.tree = DiceModel(headers) self.treeView.setModel(self.tree) expand_node = self.tree.addRoll() #addRoll makes a node, adds it, and returns the (parent) note to be expanded self.treeView.expand(expand_node) This works as desired. If I add a QSortFilterProxyModel to the mix: self.tree = DiceModel(headers) self.sort = DiceSort(self.tree) self.treeView.setModel(self.sort) expand_node = self.tree.addRoll() #addRoll makes a node, adds it, and returns the (parent) note to be expanded self.treeView.expand(expand_node) the parent no longer expands. Any idea what I am doing wrong?

    Read the article

  • Interesting task using random numbers only

    - by psihodelia
    Given any number of the random real numbers from the interval [0,1] is there exist any method to construct a floating point number with zero decimal part? Your algorithm can use only random() function calls and no variables or constants. No constants and variables are allowed, no type casting is allowed. You can use for/while, if/else or any other programming language operands.

    Read the article

  • Removing the port number from URL

    - by DrewSSP
    I'm new to anything related to servers and am trying to deploy a django application. Today I bought a domain name for the app and am having trouble configuring it so that the base URL does not need the port number at the end of it. I have to type www.trackthecharts.com:8001 to see the website when I only want to use www.trackethecharts.com. I think the problem is somewhere in my nginx, gunicorn or supervisor configuration. gunicorn_config.py command = '/opt/myenv/bin/gunicorn' pythonpath = '/opt/myenv/top-chart-app/' bind = '162.243.76.202:8001' workers = 3 root@django-app:~# nginx config server { server_name 162.243.76.202; access_log off; location /static/ { alias /opt/myenv/static/; } location / { proxy_pass http://127.0.0.1:8001; proxy_set_header X-Forwarded-Host $server_name; proxy_set_header X-Real-IP $remote_addr; add_header P3P 'CP="ALL DSP COR PSAa PSDa OUR NOR ONL UNI COM NAV"'; } } supervisor config [program:top_chart_gunicorn] command=/opt/myenv/bin/gunicorn -c /opt/myenv/gunicorn_config.py djangoTopChartApp.wsgi autostart=true autorestart=true stderr_logfile=/var/log/supervisor_gunicorn.err.log stdout_logfile=/var/log/supervisor_gunicorn.out.log Thanks for taking a look.

    Read the article

  • How can I configure different worker pools using celery?

    - by Chris R
    I need to deploy a queued execution service with (generally) the following three classes of worker: A periodic, low-priority job class that takes a long time and can be processed serially; these jobs should only use 0..2 workers in the system at most. A periodic, deadline-sensitive job class that take a short to medium amount of time (say, topping out at 5 minutes) An ad-hoc job class, that is higher priority than #1, but can interleave with #2. Any workers from class #2 that are inactive when this type of job comes in should handle it, without ever starving the pool of workers for #2 All three job classes are the same task, the only difference between them is how they're requested; they'll take the same input and generate the same output, but each one has different performance guarantees. How can I implement this using celery?

    Read the article

  • SQLAlchemy sessions - DetachedInstanceError?

    - by benjaminhkaiser
    I have a function that attempts to take a list of usernames, look each one up in a user table, and then add them to a membership table. If even one username is invalid, I want the entire list to be rolled back, including any users that have already been processed. I thought that using sessions was the best way to do this but I'm running into a DetachedInstanceError: DetachedInstanceError: Instance <Organization at 0x7fc35cb5df90> is not bound to a Session; attribute refresh operation cannot proceed Full stack trace is here. The error seems to trigger when I attempt to access the user (model) object that is returned by the query. From my reading I understand that it has something to do with there being multiple sessions, but none of the suggestions I saw on other threads worked for me. Code is below: def add_members_in_bulk(organization_eid, users): """Add users to an organization in bulk - helper function for add_member()""" """Returns "success" on success and id of first failed student on failure""" session = query_session.get_session() session.begin_nested() users = users.split('\n') for u in users: try: user = user_lookup.by_student_id(u) except ObjectNotFoundError: session.rollback() return u if user: membership.add_user_to_organization( user.entity_id, organization_eid, '', [] ) session.flush() session.commit() return 'success' here's the membership.add_user_to_organization: def add_user_to_organization(user_eid, organization_eid, title, tag_ids): """Add a User to an Organization with the given title""" user = user_lookup.by_eid(user_eid) organization = organization_lookup.by_eid(organization_eid) new_membership = OrganizationMembership( organization_eid=organization.entity_id, user_eid=user.entity_id, title=title) new_membership.tags = [get_tag_by_id(tag_id) for tag_id in tag_ids] crud.add(new_membership) and here is the lookup by ID query: def by_student_id(student_id, include_disabled=False): """Get User by RIN""" try: return get_query_set(include_disabled).filter(User.student_id == student_id).one() except NoResultFound: raise ObjectNotFoundError("User with RIN %s does not exist." % student_id)

    Read the article

  • Strip text except from the contents of a tag

    - by myle
    The opposite may be achieved using pyparsing as follows: from pyparsing import Suppress, replaceWith, makeHTMLTags, SkipTo #... removeText = replaceWith("") scriptOpen, scriptClose = makeHTMLTags("script") scriptBody = scriptOpen + SkipTo(scriptClose) + scriptClose scriptBody.setParseAction(removeText) data = (scriptBody).transformString(data) How could I keep the contents of the tag "table"?

    Read the article

  • supervisord environment variables setting up application

    - by user1434844
    I'm running an application from supervisord and I have to set up an environment for it. There are about 30 environment variables that need to be set. I've tried putting all on one big environment= line and that doesn't seem to work. I've also tried multiple enviroment= lines, and that doesn't seem to work either. I've also tried both with and without ' around the env value. What's the best way to set up my environment such that it remains intact under supervisord control? Should I be calling my actual program (tornado, fwiw) from a shell script with the environment preloaded there? Ideally, I'd like to put all of the enviroment variables into an include file and load them with supervisor, but I'm open to doing it another way.

    Read the article

  • which file stored os.environ,and store where , disk c: or disk d:

    - by zjm1126
    my code is : os.environ['ss']='ssss' print os.environ and it show : {'TMP': 'C:\\DOCUME~1\\ADMINI~1\\LOCALS~1\\Temp', 'COMPUTERNAME': 'PC-200908062210', 'USERDOMAIN': 'PC-200908062210', 'COMMONPROGRAMFILES': 'C:\\Program Files\\Common Files', 'PROCESSOR_IDENTIFIER': 'x86 Family 6 Model 15 Stepping 2, GenuineIntel', 'PROGRAMFILES': 'C:\\Program Files', 'PROCESSOR_REVISION': '0f02', 'SYSTEMROOT': 'C:\\WINDOWS', 'PATH': 'C:\\WINDOWS\\system32;C:\\WINDOWS;C:\\WINDOWS\\System32\\Wbem;C:\\Program Files\\Hewlett-Packard\\IAM\\bin;C:\\Program Files\\Common Files\\Thunder Network\\KanKan\\Codecs;D:\\Program Files\\TortoiseSVN\\bin;d:\\Program Files\\Mercurial\\;D:\\Program Files\\Graphviz2.26.3\\bin;D:\\TDDOWNLOAD\\ok\\gettext\\bin;D:\\Python25;C:\\Program Files\\StormII\\Codec;C:\\Program Files\\StormII;D:\\zjm_code\\;D:\\Python25\\Scripts;D:\\MinGW\\bin;d:\\Program Files\\Google\\google_appengine\\', 'TEMP': 'C:\\DOCUME~1\\ADMINI~1\\LOCALS~1\\Temp', 'BID': '56727834-D5C3-4EBF-BFAA-FA0933E4E721', 'PROCESSOR_ARCHITECTURE': 'x86', 'ALLUSERSPROFILE': 'C:\\Documents and Settings\\All Users', 'SESSIONNAME': 'Console', 'HOMEPATH': '\\Documents and Settings\\Administrator', 'USERNAME': 'Administrator', 'LOGONSERVER': '\\\\PC-200908062210', 'COMSPEC': 'C:\\WINDOWS\\system32\\cmd.exe', 'PATHEXT': '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH', 'CLIENTNAME': 'Console', 'FP_NO_HOST_CHECK': 'NO', 'WINDIR': 'C:\\WINDOWS', 'APPDATA': 'C:\\Documents and Settings\\Administrator\\Application Data', 'HOMEDRIVE': 'C:', 'SS': 'ssss', 'SYSTEMDRIVE': 'C:', 'NUMBER_OF_PROCESSORS': '2', 'PROCESSOR_LEVEL': '6', 'OS': 'Windows_NT', 'USERPROFILE': 'C:\\Documents and Settings\\Administrator'} i find google-app-engine set user_id in os.version not in session,look here at line 96-100 and line 257 , and aeoid at line 177 , and i want to know : which file stored os.environ ,and store where , disk c: ,or disk d: ? thanks

    Read the article

  • Which style of return is "better" for a method that might return None?

    - by Daenyth
    I have a method that will either return an object or None if the lookup fails. Which style of the following is better? def get_foo(needle): haystack = object_dict() if needle not in haystack: return None return haystack[needle] or, def get_foo(needle): haystack = object_dict() try: return haystack[needle] except KeyError: # Needle not found return None I'm undecided as to which is more more desirable myself. Another choice would be return haystack[needle] if needle in haystack else None, but I'm not sure that's any better.

    Read the article

  • Online job-searching is tedious. Help me automate it.

    - by ehsanul
    Many job sites have broken searches that don't let you narrow down jobs by experience level. Even when they do, it's usually wrong. This requires you to wade through hundreds of postings that you can't apply for before finding a relevant one, quite tedious. Since I'd rather focus on writing cover letters etc., I want to write a program to look through a large number of postings, and save the URLs of just those jobs that don't require years of experience. I don't require help writing the scraper to get the html bodies of possibly relevant job posts. The issue is accurately detecting the level of experience required for the job. This should not be too difficult as job posts are usually very explicit about this ("must have 5 years experience in..."), but there may be some issues with overly simple solutions. In my case, I'm looking for entry-level positions. Often they don't say "entry-level", but inclusion of the words probably means the job should be saved. Next, I can safely exclude a job the says it requires "5 years" of experience in whatever, so a regex like /\d\syears/ seems reasonable to exclude jobs. But then, I realized some jobs say they'll take 0-2 years of experience, matches the exclusion regex but is clearly a job I want to take a look at. Hmmm, I can handle that with another regex. But some say "less than 2 years" or "fewer than 2 years". Can handle that too, but it makes me wonder what other patterns I'm not thinking of, and possibly excluding many jobs. That's what brings me here, to find a better way to do this than regexes, if there is one. I'd like to minimize the false negative rate and save all the jobs that seem like they might not require many years of experience. Does excluding anything that matches /[3-9]\syears|1\d\syears/ seem reasonable? Or is there a better way? Training a bayesian filter maybe?

    Read the article

  • Turbogears 2 vs Django - any advice on choosing replacement for Turbogears 1?

    - by michela
    I have been using Turbogears 1 for prototyping small sites for the last couple of years and it is getting a little long in the tooth. Any suggestions on making the call between upgrading to Turbogears 2 or switching to something like Django? I'm torn between the familiarity of the TG community who are pretty responsive and do pretty good documentation vs the far larger community using Django. I am quite tempted by the built-in CMS features and the Google AppEngine support. Any advice? Thanks .M.

    Read the article

  • Django context processor gets AnonymousUser

    - by myfreeweb
    instead of User. def myview(request): return render_to_response('tmpl.html', {'user': User.objects.get(id=1}) works fine and passes User to template. But def myview(request): return render_to_response('tmpl.html', {}, context_instance=RequestContext(request)) with a context processor def user(request): from django.contrib.auth.models import User return {'user': User.objects.get(id=1)} passes AnonymousUser, so I can't get the variables I need :( What's wrong?

    Read the article

< Previous Page | 385 386 387 388 389 390 391 392 393 394 395 396  | Next Page >