How do I split on all nonalphanumeric characters, EXCEPT the apostrophe?
re.split('\W+',text)
works, but will also split on apostrophes. How do I add an exception to this rule?
Thanks!
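One possible approach is to replace `\W` with a negated character class that also excludes the apostrophe. The sketch below assumes a plain ASCII apostrophe (`'`); the helper name `split_keep_apostrophes` is made up for illustration.

```python
import re

def split_keep_apostrophes(text):
    # [^\w']+ matches runs of characters that are neither word characters
    # nor apostrophes, so "don't" stays in one piece.
    # Empty strings at the edges (from leading/trailing separators) are dropped.
    return [part for part in re.split(r"[^\w']+", text) if part]

print(split_keep_apostrophes("Don't stop, it's fine!"))
```

If typographic apostrophes can occur in the input, they would have to be added to the character class as well.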
Hi,
I have some strings from which I want to delete some unwanted characters.
For example: Adam'sApple -> AdamsApple (case insensitive).
Can someone help me? I need the fastest way to do it, because I have a couple of million records that have to be cleaned up.
Thanks
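A minimal sketch of one common approach, assuming Python 3 and assuming the only unwanted character is the apostrophe (the names `UNWANTED`, `TABLE` and `clean` are hypothetical). The translation table is built once and reused for every record, which matters when there are millions of them.

```python
# Characters to delete; extend this string as needed (assumption: only "'").
UNWANTED = "'"

# The third argument of str.maketrans lists characters that map to None,
# so translate() removes them in a single pass over the string.
TABLE = str.maketrans("", "", UNWANTED)

def clean(text):
    return text.translate(TABLE)

print(clean("Adam'sApple"))
```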
I'm trying to integrate OpenMeetings with a Django website, but I can't work out how to configure ImportDoctor properly:
(here :// is replaced with __ because of spam protection)
print url
http://sovershenstvo.com.ua:5080/openmeetings/services/UserService?wsdl
imp = Import('http__schemas.xmlsoap.org/soap/encoding/')
imp.filter.add('http__services.axis.openmeetings.org')
imp.filter.add('http__basic.beans.hibernate.app.openmeetings.org/xsd')
imp.filter.add('http__basic.beans.data.app.openmeetings.org/xsd')
imp.filter.add('http__services.axis.openmeetings.org')
d = ImportDoctor(imp)
client = Client(url, doctor = d)
client.service.getSession()
Traceback (most recent call last):
File "", line 1, in
File "/usr/lib/python2.6/site-packages/suds/client.py", line 539, in call
return client.invoke(args, kwargs)
File "/usr/lib/python2.6/site-packages/suds/client.py", line 598, in invoke
result = self.send(msg)
File "/usr/lib/python2.6/site-packages/suds/client.py", line 627, in send
result = self.succeeded(binding, reply.message)
File "/usr/lib/python2.6/site-packages/suds/client.py", line 659, in succeeded
r, p = binding.get_reply(self.method, reply)
File "/usr/lib/python2.6/site-packages/suds/bindings/binding.py", line 159, in get_reply
resolved = rtypes[0].resolve(nobuiltin=True)
File "/usr/lib/python2.6/site-packages/suds/xsd/sxbasic.py", line 63, in resolve
raise TypeNotFound(qref)
suds.TypeNotFound: Type not found: '(Sessiondata, http__basic.beans.hibernate.app.openmeetings.org/xsd, )'
What am I doing wrong? Please help, and sorry for my English, but you are my last chance to save my position :(
I need the webinars by morning (it is 2:26 am now).
I am trying to take Hindi words, stored one per line in file-1, and find them in the lines of file-2. I have to print the line numbers along with the number of words found.
This is the code:
import codecs
hypernyms = codecs.open("hindi_hypernym.txt", "r", "utf-8").readlines()
words = codecs.open("hypernyms_en2hi.txt", "r", "utf-8").readlines()
count_arr = []
for counter, line in enumerate(hypernyms):
    count_arr.append(0)
    for word in words:
        if line.find(word) >=0:
            count_arr[counter] +=1
for iterator, count in enumerate(count_arr):
    if count>0:
        print iterator, ' ', count
This is finding some words, but ignoring some others
The input files are:
File-1:
????
???????
File-2:
???????, ????-????
?????-???, ?????-???, ?????_???, ?????_???
????_????, ????-????, ???????_????
????-????
This gives output:
0 1
3 1
Clearly, it is ignoring ??????? and searching for ???? only. I have tried with other inputs as well. It only searches for one word. Any idea how to correct this?
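One likely cause (an assumption, since the original files are not available) is that `readlines()` keeps the trailing newline on every word, so a word only matches when it happens to sit at the very end of a line. The sketch below strips each word before searching. It is written for Python 3, and the sample strings are made-up stand-ins for the real file contents.

```python
def count_matches(lines, words):
    # Strip whitespace (including the "\n" that readlines() keeps) and drop
    # empty entries, which would otherwise match every line.
    cleaned = [w.strip() for w in words if w.strip()]
    counts = []
    for line in lines:
        counts.append(sum(1 for w in cleaned if w in line))
    return counts

# Made-up sample data in the shape readlines() would return.
words = ["जल\n", "फल\n"]
lines = ["फल, जल-पान\n", "वन\n", "मीठा फल\n"]

for number, count in enumerate(count_matches(lines, words)):
    if count > 0:
        print(number, count)
```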
The input list can contain more than 1 million numbers. When I run the following code with a smaller 'repeats' value, it's fine:
def sample(x):
    length = 1000000
    new_array = random.sample((list(x)),length)
    return (new_array)
def repeat_sample(x):
    i = 0
    repeats = 100
    list_of_samples = []
    for i in range(repeats):
        list_of_samples.append(sample(x))
    return(list_of_samples)
repeat_sample(large_array)
However, using a high number of repeats, such as the 100 above, results in a MemoryError. The traceback is as follows:
Traceback (most recent call last):
File "C:\Python31\rnd.py", line 221, in <module>
STORED_REPEAT_SAMPLE = repeat_sample(STORED_ARRAY)
File "C:\Python31\rnd.py", line 129, in repeat_sample
list_of_samples.append(sample(x))
File "C:\Python31\rnd.py", line 121, in sample
new_array = random.sample((list(x)),length)
File "C:\Python31\lib\random.py", line 309, in sample
result = [None] * k
MemoryError
I am assuming I'm running out of memory. I do not know how to get around this problem.
Thank you for your time!
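Storing 100 samples of a million items each means holding 100 million objects at once. A minimal sketch of one way around that, assuming each sample can be reduced to a summary (here a mean, chosen only for illustration) before the next one is drawn; `iter_samples` is a hypothetical helper and the sizes are scaled down so the example runs quickly.

```python
import random

def iter_samples(population, length, repeats):
    # Yield one sample at a time instead of storing all of them in a list,
    # so only a single sample is alive in memory at any moment.
    for _ in range(repeats):
        yield random.sample(population, length)

population = list(range(10000))   # stand-in for the large input list

# Each sample is reduced to one number and then discarded.
means = [sum(s) / len(s) for s in iter_samples(population, 1000, 100)]
print(len(means))
```

Note that `random.sample` accepts any sequence, so copying the input with `list(x)` on every call is only needed when the input is not already a list or tuple.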
I'd like to run a query with a field name filter that I won't know before runtime. I'm not sure how to construct the variable name, or maybe I am just tired.
field_name = funct()
locations = Locations.objects.filter(field_name__lte=arg1)
where, if funct() returns 'name', this would be equivalent to
locations = Locations.objects.filter(name__lte=arg1)
Not sure how to do that ...
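Keyword arguments can be built at runtime by putting them in a dictionary and unpacking it with `**`. The sketch below shows only that mechanism: `fake_filter` is a hypothetical stand-in for `Locations.objects.filter` so the example runs without Django, and `build_filter` is a made-up helper.

```python
def build_filter(field_name, lookup, value):
    # Build the keyword name as a string, e.g. "name__lte".
    return {"%s__%s" % (field_name, lookup): value}

def fake_filter(**kwargs):
    # Hypothetical stand-in for Locations.objects.filter; it simply echoes
    # the keyword arguments it received.
    return kwargs

field_name = "name"   # would come from funct() in the question
result = fake_filter(**build_filter(field_name, "lte", 5))
print(result)
```

With Django itself, the same idea would read `Locations.objects.filter(**{field_name + '__lte': arg1})`.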
I have an array of files. I'd like to be able to break that array down into one array with multiple subarrays, each subarray contains files that were created on the same day. So right now if the array contains files from March 1 - March 31, I'd like to have an array with 31 subarrays (assuming there is at least 1 file for each day).
In the long run, I'm trying to find the file from each day with the latest creation/modification time. If there is a way to bundle that into the iterations that are required above to save some CPU cycles, that would be even more ideal. Then I'd have one flat array with 31 files, one for each day, for the latest file created on each individual day.
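If only the newest file per day is needed, the grouping and the "latest" selection can be done in a single pass with a dictionary keyed by date. A minimal sketch, assuming modification time is the relevant timestamp and days are taken in local time; `latest_per_day` and its `get_mtime` parameter are hypothetical names.

```python
import datetime
import os

def latest_per_day(paths, get_mtime=os.path.getmtime):
    # Map each calendar day to the (mtime, path) of the newest file seen so far.
    newest = {}
    for path in paths:
        mtime = get_mtime(path)
        day = datetime.date.fromtimestamp(mtime)
        if day not in newest or mtime > newest[day][0]:
            newest[day] = (mtime, path)
    # Flat list with one path per day, in chronological order.
    return [newest[day][1] for day in sorted(newest)]
```

Called as `latest_per_day(list_of_paths)`, it returns one path per day. The `get_mtime` parameter exists only so the timestamp source can be swapped, for example for `os.path.getctime`.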
I have code that uses the BeautifulSoup library for parsing, but it is very slow. The code is written in such a way that threads cannot be used.
Can anyone help me with this?
I am using BeautifulSoup for parsing and then saving into a DB. If I comment out the save statement, it still takes a long time, so the database is not the problem.
def parse(self,text):
    soup = BeautifulSoup(text)
    arr = soup.findAll('tbody')
    for i in range(0,len(arr)-1):
        data=Data()
        soup2 = BeautifulSoup(str(arr[i]))
        arr2 = soup2.findAll('td')
        c=0
        for j in arr2:
            if str(j).find("<a href=") > 0:
                data.sourceURL = self.getAttributeValue(str(j),'<a href="')
            else:
                if c == 2:
                    data.Hits=j.renderContents()
                #and few others...
            c = c+1
        data.save()
Any suggestions?
Note: I already asked this question here, but it was closed due to incomplete information.
Is it possible for a lambda function to take a variable number of arguments?
For example, I want to write a metaclass that creates a method for every method of some other class; each newly created method returns the opposite value of the original method and takes the same number of arguments.
I want to do this with a lambda function. How do I pass the arguments? Is it possible?
class Negate(type):
    def __new__(mcs, name, bases, _dict):
        extended_dict = _dict.copy()
        for (k, v) in _dict.items():
            if hasattr(v, '__call__'):
                extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw)
        return type.__new__(mcs, name, bases, extended_dict)
class P(metaclass=Negate):
    def __init__(self, a):
        self.a = a
    def yes(self):
        return True
    def maybe(self, you_can_chose):
        return you_can_chose
But the result is totally wrong:
>>>p = P(0)
>>>p.yes()
True
>>>p.not_yes() # should be False
Traceback (most recent call last):
File "<pyshell#150>", line 1, in <module>
p.not_yes()
File "C:\Users\Nona\Desktop\p10.py", line 51, in <lambda>
extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw)
TypeError: __init__() takes exactly 2 positional arguments (1 given)
>>>p.maybe(True)
True
>>>p.not_maybe(True) #should be False
True
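The `*args, **kw` part of the lambda is fine. The surprising results come from late binding: every lambda looks up `v` when it is called, by which time the loop has finished and `v` refers to whichever item was iterated last. A minimal sketch of one fix, using a factory function (`negate`, a made-up name) so each wrapper keeps its own function. Skipping double-underscore names such as `__init__` is an extra assumption, not something the question asked for.

```python
def negate(func):
    # The factory gives every wrapper its own 'func', avoiding the
    # late-binding problem of a lambda that closes over the loop variable.
    return lambda self, *args, **kw: not func(self, *args, **kw)

class Negate(type):
    def __new__(mcs, name, bases, namespace):
        extended = dict(namespace)
        for key, value in namespace.items():
            # Assumption: special methods like __init__ are left alone.
            if callable(value) and not key.startswith("__"):
                extended["not_" + key] = negate(value)
        return type.__new__(mcs, name, bases, extended)

class P(metaclass=Negate):
    def __init__(self, a):
        self.a = a
    def yes(self):
        return True
    def maybe(self, you_can_choose):
        return you_can_choose

p = P(0)
print(p.not_yes(), p.not_maybe(True), p.not_maybe(False))
```

Binding the value as a default argument (`lambda s, *args, v=v, **kw: ...`) is another common way to get the same effect.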
I'm looking for the most efficient way to remove an element from a comma-separated string while maintaining alphabetical order for the words:
For example:
string = 'Apples, Bananas, Grapes, Oranges'
subtraction = 'Bananas'
result = 'Apples, Grapes, Oranges'
Also, a way to do this but while maintaining IDs:
string = '1:Apples, 4:Bananas, 6:Grapes, 23:Oranges'
subtraction = '4:Bananas'
result = '1:Apples, 6:Grapes, 23:Oranges'
Sample code is greatly appreciated. Thank you so much.
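The examples above take an entry out of the string, so here is a minimal sketch for that case, assuming the separator is always a comma followed by one space (`remove_item` is a hypothetical helper). Because the ID is part of each entry, the same function covers both variants.

```python
def remove_item(csv_string, item, sep=", "):
    # Split into entries, drop the first exact match, and join again.
    # Removing an entry cannot disturb the existing alphabetical order.
    items = csv_string.split(sep)
    if item in items:
        items.remove(item)
    return sep.join(items)

print(remove_item('Apples, Bananas, Grapes, Oranges', 'Bananas'))
print(remove_item('1:Apples, 4:Bananas, 6:Grapes, 23:Oranges', '4:Bananas'))
```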
Any time I want to replace a piece of text that is part of a larger piece of text, I always have to do something like:
"(?P<start>some_pattern)(?P<replace>foo)(?P<end>end)"
And then concatenate the start group with the new data for replace and then the end group.
Is there a better method for this?
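Two options that avoid manual concatenation, sketched with made-up sample text: lookaround assertions, which check the context without consuming it, and group references in the replacement string. Note that Python's lookbehind only accepts fixed-width patterns, so the first form does not work for every `some_pattern`.

```python
import re

text = "some_pattern foo end"

# Lookbehind/lookahead assert the surrounding text without matching it,
# so only "foo" itself is replaced.
result_lookaround = re.sub(r"(?<=some_pattern )foo(?= end)", "bar", text)

# Alternatively, keep the named groups and refer back to them with \g<name>.
result_groups = re.sub(
    r"(?P<start>some_pattern )foo(?P<end> end)",
    r"\g<start>bar\g<end>",
    text,
)

print(result_lookaround)
print(result_groups)
```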
Is there a more Pythonic way to put this loop together?:
while True:
    children = tree.getChildren()
    if not children:
        break
    tree = children[0]
UPDATE:
I think this syntax is probably what I'm going to go with:
while tree.getChildren():
    tree = tree.getChildren()[0]
Here's the deal. I'm trying to write an Arkanoid clone, and I need a window menu like the one you get in pyGTK: for example File -> (Open/Save/Exit), plus an "About" entry that opens a dialog naming the author.
I'm already using pyGame for writing the game logic. I've tried pgu for the GUI, but that doesn't help me: although it has the menu elements I'm talking about, you can't include the game screen in its container.
Does anybody know how to include such window menus when using pyGame?
I have a large graph that I am generating in matplotlib. I'd like to add a number of icons to this graph at certain (x,y) coordinates. Is there any way to do that in matplotlib?
Thank you
First off, I'm relatively new to Google App Engine, so I'm probably doing something silly.
Say I've got a model Foo:
class Foo(db.Model):
    name = db.StringProperty()
I want to use name as a unique key for every Foo object. How is this done?
When I want to get a specific Foo object, I currently query the datastore for all Foo objects with the target unique name, but queries are slow (plus it's a pain to ensure that name is unique when each new Foo is created).
There's got to be a better way to do this!
Thanks.
I'm trying to write a script to import a database file. I wrote the script to export the file like so:
import sqlite3
con = sqlite3.connect('../sqlite.db')
with open('../dump.sql', 'w') as f:
    for line in con.iterdump():
        f.write('%s\n' % line)
Now I want to be able to import that database. I tried:
import sqlite3
con = sqlite3.connect('../sqlite.db')
f = open('../dump.sql','r')
str = f.read()
con.execute(str)
but I'm not allowed to execute more than one statement. Is there a way to get it to run a .sql script directly?
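The `sqlite3` module provides `executescript()` for running several statements at once. A minimal round-trip sketch, using in-memory databases and a made-up table `t` instead of the files from the question:

```python
import sqlite3

# Build a small source database to dump (hypothetical table and rows).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO t (name) VALUES (?)", [("a",), ("b",)])
source.commit()
dump = "\n".join(source.iterdump())

# executescript() accepts a string holding many SQL statements.
target = sqlite3.connect(":memory:")
target.executescript(dump)

rows = target.execute("SELECT name FROM t ORDER BY id").fetchall()
print(rows)
```

With the files from the question, the import would amount to reading `../dump.sql` into a string and passing it to `con.executescript(...)`.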
I'm looking for a way to compress an ASCII-based string. Any help?
I also need to decompress it. I tried zlib, but it did not help.
What can I do to compress the string to a shorter length?
code:
def compress(request):
    if request.POST:
        data = request.POST.get('input')
        if is_ascii(data):
            result = zlib.compress(data)
            return render_to_response('index.html', {'result': result, 'input':data}, context_instance = RequestContext(request))
        else:
            result = "Error, the string is not ascii-based"
            return render_to_response('index.html', {'result':result}, context_instance = RequestContext(request))
    else:
        return render_to_response('index.html', {}, context_instance = RequestContext(request))
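`zlib.compress` returns raw bytes, which do not display well in a template, and very short inputs can come out longer than they went in because of the format's fixed overhead. A minimal round-trip sketch in Python 3, assuming base64 is acceptable for making the result printable (`pack` and `unpack` are hypothetical names):

```python
import base64
import zlib

def pack(text):
    # zlib works on bytes; base64 turns the compressed bytes back into
    # plain ASCII text that is safe to render in HTML.
    return base64.b64encode(zlib.compress(text.encode("ascii"))).decode("ascii")

def unpack(packed):
    return zlib.decompress(base64.b64decode(packed)).decode("ascii")

sample = "abc" * 200          # repetitive input compresses well
packed = pack(sample)
print(len(sample), len(packed))
```

Base64 adds roughly a third to the compressed size, so the gain only shows up on inputs that are long or repetitive enough.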
I have a class that dynamically overloads basic arithmetic operators like so...
import operator
class IshyNum:
    def __init__(self, n):
        self.num=n
        self.buildArith()
    def arithmetic(self, other, o):
        return o(self.num, other)
    def buildArith(self):
        map(lambda o: setattr(self, "__%s__"%o,lambda f: self.arithmetic(f, getattr(operator, o))), ["add", "sub", "mul", "div"])
if __name__=="__main__":
    number=IshyNum(5)
    print number+5
    print number/2
    print number*3
    print number-3
But if I change the class to inherit from the dictionary (class IshyNum(dict):) it doesn't work. I need to explicitly def __add__(self, other) or whatever in order for this to work. Why?
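For new-style classes (anything that inherits from `object` or a built-in such as `dict`), operators look up special methods on the type, not on the instance, so attributes set with `setattr(self, "__add__", ...)` are ignored. A minimal sketch of attaching the methods to the class instead, written for Python 3 (hence `truediv` in place of `div`); `_make_op` is a hypothetical helper.

```python
import operator

class IshyNum(dict):
    def __init__(self, n):
        super().__init__()
        self.num = n

def _make_op(op):
    # Factory so each generated method keeps its own operator function.
    def method(self, other):
        return op(self.num, other)
    return method

# Attach the special methods to the class, not to individual instances.
for name in ("add", "sub", "mul", "truediv"):
    setattr(IshyNum, "__%s__" % name, _make_op(getattr(operator, name)))

number = IshyNum(5)
print(number + 5, number / 2, number * 3, number - 3)
```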
from google.appengine.ext import db
class Log(db.Model):
    content = db.StringProperty(multiline=True)
class MyThread(threading.Thread):
    def run(self,request):
        #logs_query = Log.all().order('-date')
        #logs = logs_query.fetch(3)
        log=Log()
        log.content=request.POST.get('content',None)
        log.put()
def Log(request):
    thr = MyThread()
    thr.start(request)
    return HttpResponse('')
The error is:
Exception in thread Thread-1:
Traceback (most recent call last):
File "D:\Python25\lib\threading.py", line 486, in __bootstrap_inner
self.run()
File "D:\zjm_code\helloworld\views.py", line 33, in run
log.content=request.POST.get('content',None)
NameError: global name 'request' is not defined
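`Thread.start()` takes no arguments and calls `run()` with none, so data for the thread is usually handed over through the constructor. A minimal sketch of that pattern, with a plain list standing in for the datastore (`LogThread` and `saved` are hypothetical names):

```python
import threading

class LogThread(threading.Thread):
    def __init__(self, content, sink):
        super().__init__()
        # Store what run() needs; run() itself receives no arguments.
        self.content = content
        self.sink = sink

    def run(self):
        # Stand-in for creating a Log entity and calling put().
        self.sink.append(self.content)

saved = []   # hypothetical stand-in for the datastore
worker = LogThread("hello", saved)
worker.start()
worker.join()
print(saved)
```

Separately, the view function in the question is also named `Log`, which rebinds the name of the model class defined above it; giving the view a different name would avoid that clash.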
I know that I can dynamically add an instance method to an object by doing something like:
import types
def my_method(self):
    # logic of method
    # ...
# instance is some instance of some class
instance.my_method = types.MethodType(my_method, instance)
Later on I can call instance.my_method() and self will be bound correctly and everything works.
Now, my question: how to do the exact same thing to obtain the behavior that decorating the new method with @property would give?
I would guess something like:
instance.my_method = types.MethodType(my_method, instance)
instance.my_method = property(instance.my_method)
But after doing that, instance.my_method returns a property object.
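Properties are descriptors, and descriptors only take effect when they are found on the class, not in the instance dictionary. One workaround, sketched here under the assumption that changing the instance's class is acceptable, is to give that single instance its own subclass carrying the property (`add_property` and `Thing` are made-up names):

```python
def add_property(instance, name, getter):
    # Create a one-off subclass that holds the property, then switch the
    # instance over to it. Other instances of the original class are untouched.
    cls = type(instance)
    new_cls = type(cls.__name__, (cls,), {name: property(getter)})
    instance.__class__ = new_cls

class Thing:
    def __init__(self, value):
        self.value = value

first, second = Thing(2), Thing(3)
add_property(first, "doubled", lambda self: self.value * 2)
print(first.doubled, hasattr(second, "doubled"))
```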
Hello everybody,
I have two nested lists of different sizes:
A = [[1, 7, 3, 5], [5, 5, 14, 10]]
B = [[1, 17, 3, 5], [1487, 34, 14, 74], [1487, 34, 3, 87], [141, 25, 14, 10]]
I'd like to gather all nested lists from list B if A[2:4] == B[2:4] and put it into list L:
L = [[1, 17, 3, 5], [141, 25, 14, 10]]
Would you help me with this?
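A minimal sketch, reading the condition as "the `[2:4]` slice of a sublist in B equals the `[2:4]` slice of some sublist in A". The slices from A are stored as tuples in a set so that each lookup is fast.

```python
A = [[1, 7, 3, 5], [5, 5, 14, 10]]
B = [[1, 17, 3, 5], [1487, 34, 14, 74], [1487, 34, 3, 87], [141, 25, 14, 10]]

# Lists are not hashable, so the slices are converted to tuples first.
wanted = {tuple(row[2:4]) for row in A}

# Keep every sublist of B whose last two elements appear in 'wanted'.
L = [row for row in B if tuple(row[2:4]) in wanted]
print(L)
```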
I'm trying to use reserved words in my grammar:
reserved = {
    'if' : 'IF',
    'then' : 'THEN',
    'else' : 'ELSE',
    'while' : 'WHILE',
}
tokens = [
    'DEPT_CODE',
    'COURSE_NUMBER',
    'OR_CONJ',
    'ID',
] + list(reserved.values())
t_DEPT_CODE = r'[A-Z]{2,}'
t_COURSE_NUMBER = r'[0-9]{4}'
t_OR_CONJ = r'or'
t_ignore = ' \t'
def t_ID(t):
    r'[a-zA-Z_][a-zA-Z_0-9]*'
    if t.value in reserved.values():
        t.type = reserved[t.value]
        return t
    return None
However, the t_ID rule somehow swallows up DEPT_CODE and OR_CONJ. How can I get around this? I'd like those two to take higher precedence than the reserved words.
I use the following code to log a map, it is fast when it only contains zeroes, but as soon as there is actual data in the map it becomes unbearably slow... Is there any way to do this faster?
log_file = open('testfile', 'w')
for i, x in ((i, start + i * interval) for i in range(length)):
    log_file.write('%-5d %8.3f %13g %13g %13g %13g %13g %13g\n' % (i, x,
        map[0][i], map[1][i], map[2][i], map[3][i], map[4][i], map[5][i]))