How do I split on all non-alphanumeric characters, EXCEPT the apostrophe?
re.split(r'\W+', text)
works, but will also split on apostrophes. How do I add an exception to this rule?
Thanks!
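One possible approach is to swap `\W` for a negated character class that lists the apostrophe next to `\w`. This is only a sketch, assuming plain ASCII apostrophes; the sample `text` is made up for illustration:

```python
import re

# Hypothetical sample input, not taken from the question.
text = "Don't split the user's words, please!"

# [^\w']+ matches runs of characters that are neither word characters
# nor apostrophes, so apostrophes stay inside the resulting tokens.
tokens = [t for t in re.split(r"[^\w']+", text) if t]
print(tokens)
```

The list comprehension drops the empty string that `re.split` produces when the text ends with a separator.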
Hi,
I have some strings from which I want to delete unwanted characters.
For example: Adam'sApple ---- AdamsApple (case insensitive).
Can someone help me? I need the fastest way to do it, because I have a couple of million records that have to be polished.
Thanks
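For a large number of records, `str.translate` with a deletion table built once is usually a reasonable candidate. A minimal sketch in Python 3, assuming the set of unwanted characters is known in advance; `UNWANTED`, `polish` and the sample records are illustrative names, not from the question:

```python
# Assumption: these are the characters to strip; adjust as needed.
UNWANTED = "'-."

# Build the deletion table once and reuse it for every record.
TABLE = str.maketrans("", "", UNWANTED)

def polish(s):
    """Return s with every character in UNWANTED removed."""
    return s.translate(TABLE)

records = ["Adam'sApple", "O'Brien-Smith"]
cleaned = [polish(r) for r in records]
print(cleaned)
```

Whether this is the fastest option for a given data set is worth measuring with `timeit` against alternatives such as chained `str.replace` calls.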
I have a tuple of tuples (Name, val 1, val 2, Class):
tuple = (("Jackson",10,12,"A"),
         ("Ryan",10,20,"A"),
         ("Michael",10,12,"B"),
         ("Andrew",10,20,"B"),
         ("McKensie",10,12,"C"),
         ("Alex",10,20,"D"))
I need to return all combinations, using itertools.combinations, that do not repeat classes. How can I do that? For example, the first combination returned would be: tuple0, tuple2, tuple4, tuple5, and so on.
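One way to read the requirement is "every combination that contains exactly one member per class". Under that assumption, a sketch that filters the output of `itertools.combinations` could look like this (the data is renamed to `people` here only to avoid shadowing the built-in `tuple`):

```python
from itertools import combinations

people = (("Jackson", 10, 12, "A"),
          ("Ryan", 10, 20, "A"),
          ("Michael", 10, 12, "B"),
          ("Andrew", 10, 20, "B"),
          ("McKensie", 10, 12, "C"),
          ("Alex", 10, 20, "D"))

# Assumption: each combination should hold one entry per distinct class.
n_classes = len({p[3] for p in people})

# Keep only the combinations whose classes are all different.
result = [combo for combo in combinations(people, n_classes)
          if len({p[3] for p in combo}) == n_classes]

for combo in result:
    print([p[0] for p in combo])
```

For larger inputs, grouping by class and using `itertools.product` over the groups avoids generating combinations that are thrown away.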
I'm trying to integrate OpenMeetings with a Django website, but I can't understand how to properly configure ImportDoctor:
(here :// is replaced with __ because of spam protection)
print url
http://sovershenstvo.com.ua:5080/openmeetings/services/UserService?wsdl
imp = Import('http__schemas.xmlsoap.org/soap/encoding/')
imp.filter.add('http__services.axis.openmeetings.org')
imp.filter.add('http__basic.beans.hibernate.app.openmeetings.org/xsd')
imp.filter.add('http__basic.beans.data.app.openmeetings.org/xsd')
imp.filter.add('http__services.axis.openmeetings.org')
d = ImportDoctor(imp)
client = Client(url, doctor = d)
client.service.getSession()
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/python2.6/site-packages/suds/client.py", line 539, in call
    return client.invoke(args, kwargs)
  File "/usr/lib/python2.6/site-packages/suds/client.py", line 598, in invoke
    result = self.send(msg)
  File "/usr/lib/python2.6/site-packages/suds/client.py", line 627, in send
    result = self.succeeded(binding, reply.message)
  File "/usr/lib/python2.6/site-packages/suds/client.py", line 659, in succeeded
    r, p = binding.get_reply(self.method, reply)
  File "/usr/lib/python2.6/site-packages/suds/bindings/binding.py", line 159, in get_reply
    resolved = rtypes[0].resolve(nobuiltin=True)
  File "/usr/lib/python2.6/site-packages/suds/xsd/sxbasic.py", line 63, in resolve
    raise TypeNotFound(qref)
suds.TypeNotFound: Type not found: '(Sessiondata, http__basic.beans.hibernate.app.openmeetings.org/xsd, )'
What am I doing wrong? Please help, and sorry for my English, but you are my last chance to save my position :(
I need the webinars working by morning (it is 2:26 am now).
The input list can contain more than 1 million numbers. When I run the following code with a smaller 'repeats' value, it's fine:
import random
def sample(x):
    length = 1000000
    new_array = random.sample(list(x), length)
    return new_array
def repeat_sample(x):
    repeats = 100
    list_of_samples = []
    for i in range(repeats):
        list_of_samples.append(sample(x))
    return list_of_samples
repeat_sample(large_array)
However, using a high repeat count, such as the 100 above, results in a MemoryError. The traceback is as follows:
Traceback (most recent call last):
  File "C:\Python31\rnd.py", line 221, in <module>
    STORED_REPEAT_SAMPLE = repeat_sample(STORED_ARRAY)
  File "C:\Python31\rnd.py", line 129, in repeat_sample
    list_of_samples.append(sample(x))
  File "C:\Python31\rnd.py", line 121, in sample
    new_array = random.sample((list(x)),length)
  File "C:\Python31\lib\random.py", line 309, in sample
    result = [None] * k
MemoryError
I am assuming I'm running out of memory. I do not know how to get around this problem.
Thank you for your time!
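Holding 100 samples of a million items each means keeping on the order of 100 million list slots alive at once. If each sample can be processed and then discarded, a generator keeps only one sample in memory at a time. A minimal sketch under that assumption; `iter_samples` is a hypothetical helper, and the sizes are shrunk so the example runs quickly:

```python
import random

def iter_samples(population, length, repeats):
    """Yield `repeats` random samples of size `length`, one at a time."""
    population = list(population)  # convert once, not on every repeat
    for _ in range(repeats):
        yield random.sample(population, length)

# Small stand-in for the large input array.
large_array = range(10000)

# Reduce each sample to the statistic of interest before the next one
# is generated, so earlier samples can be garbage collected.
means = [sum(s) / len(s) for s in iter_samples(large_array, 1000, 5)]
print(means)
```

If all samples really must be kept, a more compact container (for example the `array` module for numeric data) may be worth trying instead.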
Any time I want to replace a piece of text that is part of a larger piece of text, I always have to do something like:
"(?P<start>some_pattern)(?P<replace>foo)(?P<end>end)"
And then concatenate the start group with the new data for replace and then the end group.
Is there a better method for this?
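Two common alternatives are backreferences in the replacement string and lookaround assertions in the pattern. Both are sketched below with a made-up input string; note that Python's lookbehind only accepts fixed-width patterns, so the second form assumes the prefix has a fixed length:

```python
import re

# Hypothetical input, chosen to match the pattern in the question.
text = "some_pattern foo end"

# Option 1: capture the context and re-insert it with \g<name>.
with_backrefs = re.sub(
    r"(?P<start>some_pattern )foo(?P<end> end)",
    r"\g<start>bar\g<end>",
    text,
)

# Option 2: assert the context with lookarounds so that only "foo"
# is part of the match and therefore the only text replaced.
with_lookarounds = re.sub(r"(?<=some_pattern )foo(?= end)", "bar", text)

print(with_backrefs)
print(with_lookarounds)
```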
I'm looking for the most efficient way to remove an element from a comma-separated string while maintaining alphabetical order for the words:
For example:
string = 'Apples, Bananas, Grapes, Oranges'
subtraction = 'Bananas'
result = 'Apples, Grapes, Oranges'
Also, a way to do this but while maintaining IDs:
string = '1:Apples, 4:Bananas, 6:Grapes, 23:Oranges'
subtraction = '4:Bananas'
result = '1:Apples, 6:Grapes, 23:Oranges'
Sample code is greatly appreciated. Thank you so much.
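Since removing an item does not disturb the order of the remaining ones, a split, filter, and join may be enough. A minimal sketch, assuming the separator is always a comma followed by one space; `remove_item` is a hypothetical helper name:

```python
def remove_item(csv_string, item, sep=", "):
    """Return csv_string without any entries equal to item."""
    parts = csv_string.split(sep)
    return sep.join(p for p in parts if p != item)

print(remove_item('Apples, Bananas, Grapes, Oranges', 'Bananas'))
print(remove_item('1:Apples, 4:Bananas, 6:Grapes, 23:Oranges', '4:Bananas'))
```

The same function covers the ID variant because each `id:name` pair is compared as a whole string.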
Is it possible for a lambda function to have a variable number of arguments?
For example, I want to write a metaclass that creates a method for every method of some other class. Each newly created method should return the opposite value of the original method and take the same number of arguments.
I want to do this with a lambda function. How do I pass the arguments? Is it possible?
class Negate(type):
    def __new__(mcs, name, bases, _dict):
        extended_dict = _dict.copy()
        for (k, v) in _dict.items():
            if hasattr(v, '__call__'):
                extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw)
        return type.__new__(mcs, name, bases, extended_dict)
class P(metaclass=Negate):
    def __init__(self, a):
        self.a = a
    def yes(self):
        return True
    def maybe(self, you_can_chose):
        return you_can_chose
But the result is totally wrong:
>>> p = P(0)
>>> p.yes()
True
>>> p.not_yes() # should be False
Traceback (most recent call last):
  File "<pyshell#150>", line 1, in <module>
    p.not_yes()
  File "C:\Users\Nona\Desktop\p10.py", line 51, in <lambda>
    extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw)
TypeError: __init__() takes exactly 2 positional arguments (1 given)
>>> p.maybe(True)
True
>>> p.not_maybe(True) # should be False
True
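The `*args, **kw` part of the lambda is fine; the traceback suggests the issue is late binding. Each lambda looks up `v` when it is called, after the loop has finished, so every `not_` method ends up calling whichever function `v` held last (here `__init__`). One common fix is to bind the current function as a default argument. A sketch of that idea; skipping dunder methods is an added assumption, not something the question asked for:

```python
class Negate(type):
    def __new__(mcs, name, bases, namespace):
        extended = dict(namespace)
        for key, value in namespace.items():
            # Assumption: dunder methods such as __init__ are not negated.
            if callable(value) and not key.startswith("__"):
                # _func=value captures the function for this iteration,
                # instead of reading the loop variable at call time.
                extended["not_" + key] = (
                    lambda self, *args, _func=value, **kw:
                        not _func(self, *args, **kw)
                )
        return super().__new__(mcs, name, bases, extended)


class P(metaclass=Negate):
    def __init__(self, a):
        self.a = a

    def yes(self):
        return True

    def maybe(self, you_can_chose):
        return you_can_chose


p = P(0)
print(p.not_yes())
print(p.not_maybe(True))
```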
I have code that uses the BeautifulSoup library for parsing, but it is very slow. The code is written in such a way that threads cannot be used.
Can anyone help me with this?
I am using BeautifulSoup for parsing and then saving the results into a DB. If I comment out the save statement, it still takes a long time, so the database is not the problem.
def parse(self,text):
    soup = BeautifulSoup(text)
    arr = soup.findAll('tbody')
    for i in range(0,len(arr)-1):
        data=Data()
        soup2 = BeautifulSoup(str(arr[i]))
        arr2 = soup2.findAll('td')
        c=0
        for j in arr2:
            if str(j).find("<a href=") > 0:
                data.sourceURL = self.getAttributeValue(str(j),'<a href="')
            else:
                if c == 2:
                    data.Hits=j.renderContents()
                #and few others...
            c = c+1
        data.save()
Any suggestions?
Note: I already asked this question here, but it was closed due to incomplete information.
I'd like to run a query with a field-name filter that I won't know before run time. I'm not sure how to construct the keyword name... or maybe I am just tired.
field_name = funct()
locations = Locations.objects.filter(field_name__lte=arg1)
so that, if funct() returns 'name', the call would be equivalent to
locations = Locations.objects.filter(name__lte=arg1)
I'm not sure how to do that...
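Because filter lookups are ordinary keyword arguments, one way to do this is to build a dict at run time and unpack it with `**`. The sketch below only builds the dict so that it runs without Django; `build_lookup` is a hypothetical helper and `"M"` is a made-up value:

```python
def build_lookup(field_name, lookup, value):
    """Return keyword arguments such as {'name__lte': value}."""
    return {"%s__%s" % (field_name, lookup): value}

field_name = "name"          # stands in for funct()
kwargs = build_lookup(field_name, "lte", "M")
print(kwargs)

# In the view this would then be passed on as:
#   locations = Locations.objects.filter(**kwargs)
```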
I have an array of files. I'd like to be able to break that array down into one array with multiple subarrays, each subarray contains files that were created on the same day. So right now if the array contains files from March 1 - March 31, I'd like to have an array with 31 subarrays (assuming there is at least 1 file for each day).
In the long run, I'm trying to find the file from each day with the latest creation/modification time. If there is a way to bundle that into the iterations that are required above to save some CPU cycles, that would be even more ideal. Then I'd have one flat array with 31 files, one for each day, for the latest file created on each individual day.
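If only the newest file per day is needed, the grouping and the "latest" selection can be folded into a single pass with a dict keyed by date. A minimal sketch, assuming modification time (via `os.path.getmtime`) is an acceptable stand-in for creation time and that days are taken in local time; `latest_per_day` is a hypothetical name:

```python
import os
from datetime import date

def latest_per_day(paths):
    """Return one path per calendar day: the most recently modified."""
    latest = {}  # date -> (mtime, path)
    for path in paths:
        mtime = os.path.getmtime(path)
        day = date.fromtimestamp(mtime)
        # Keep this file only if it is newer than the current best for its day.
        if day not in latest or mtime > latest[day][0]:
            latest[day] = (mtime, path)
    # Flat list, ordered by day.
    return [latest[day][1] for day in sorted(latest)]
```

Each file is examined once, so the cost is linear in the number of files plus a sort over the number of distinct days.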
Here's the deal. I'm trying to write an Arkanoid clone, and I need a window menu like the one you get in pyGTK. For example, File (Open/Save/Exit), something like that, plus an "About" dialog where the author is credited.
I'm already using pyGame for writing the game logic. I've tried pgu to write the GUI, but that doesn't help me: although it has the menu elements I'm talking about, you can't include the game screen in its container.
Does anybody know how to include such window menus when using pyGame?
Is there a more Pythonic way to put this loop together?
while True:
    children = tree.getChildren()
    if not children:
        break
    tree = children[0]
UPDATE:
I think this syntax is probably what I'm going to go with:
while tree.getChildren():
    tree = tree.getChildren()[0]
First off, I'm relatively new to Google App Engine, so I'm probably doing something silly.
Say I've got a model Foo:
class Foo(db.Model):
    name = db.StringProperty()
I want to use name as a unique key for every Foo object. How is this done?
When I want to get a specific Foo object, I currently query the datastore for all Foo objects with the target unique name, but queries are slow (plus it's a pain to ensure that name is unique when each new Foo is created).
There's got to be a better way to do this!
Thanks.
I have a large graph that I am generating in matplotlib. I'd like to add a number of icons to this graph at certain (x,y) coordinates. Is there any way to do that in matplotlib?
Thank you.
I'm looking for a way to compress an ASCII-based string. Any help?
I also need to decompress it. I tried zlib, but it did not help.
What can I do to compress the string to a shorter length?
code:
def compress(request):
    if request.POST:
        data = request.POST.get('input')
        if is_ascii(data):
            result = zlib.compress(data)
            return render_to_response('index.html', {'result': result, 'input':data}, context_instance = RequestContext(request))
        else:
            result = "Error, the string is not ascii-based"
            return render_to_response('index.html', {'result':result}, context_instance = RequestContext(request))
    else:
        return render_to_response('index.html', {}, context_instance = RequestContext(request))
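`zlib.compress` returns raw bytes, which do not display well in an HTML template. One option is to wrap the compressed bytes in base64 so the result is printable and can be reversed. A minimal round-trip sketch in Python 3; `compress_text` and `decompress_text` are hypothetical helper names, and the sample input is made up:

```python
import base64
import zlib

def compress_text(text):
    """Compress an ASCII string and return a printable base64 token."""
    raw = zlib.compress(text.encode("ascii"))
    return base64.b64encode(raw).decode("ascii")

def decompress_text(token):
    """Reverse compress_text."""
    raw = base64.b64decode(token.encode("ascii"))
    return zlib.decompress(raw).decode("ascii")

original = "abc" * 200           # repetitive input compresses well
token = compress_text(original)
print(len(original), len(token))
print(decompress_text(token) == original)
```

Very short or non-repetitive strings can come out longer than the input, because of the zlib header and the base64 overhead.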
I'm trying to use reserved words in my grammar:
reserved = {
    'if' : 'IF',
    'then' : 'THEN',
    'else' : 'ELSE',
    'while' : 'WHILE',
}
tokens = [
    'DEPT_CODE',
    'COURSE_NUMBER',
    'OR_CONJ',
    'ID',
] + list(reserved.values())
t_DEPT_CODE = r'[A-Z]{2,}'
t_COURSE_NUMBER = r'[0-9]{4}'
t_OR_CONJ = r'or'
t_ignore = ' \t'
def t_ID(t):
    r'[a-zA-Z_][a-zA-Z_0-9]*'
    if t.value in reserved.values():
        t.type = reserved[t.value]
        return t
    return None
However, the t_ID rule somehow swallows up DEPT_CODE and OR_CONJ. How can I get around this? I'd like those two to take higher precedence than the reserved words.
I have a class that dynamically overloads basic arithmetic operators like so...
import operator
class IshyNum:
    def __init__(self, n):
        self.num=n
        self.buildArith()
    def arithmetic(self, other, o):
        return o(self.num, other)
    def buildArith(self):
        map(lambda o: setattr(self, "__%s__"%o,lambda f: self.arithmetic(f, getattr(operator, o))), ["add", "sub", "mul", "div"])
if __name__=="__main__":
    number=IshyNum(5)
    print number+5
    print number/2
    print number*3
    print number-3
But if I change the class to inherit from the dictionary (class IshyNum(dict):) it doesn't work. I need to explicitly def __add__(self, other) or whatever in order for this to work. Why?
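The likely explanation is that subclassing `dict` makes `IshyNum` a new-style class, and for new-style classes Python looks up special methods such as `__add__` on the type, not on the instance, so attributes set on `self` are ignored by the `+` operator. A sketch in Python 3 that attaches the generated methods to the class instead; `_make_op` is a hypothetical helper, and `truediv` replaces `div`, which does not exist in Python 3's `operator` module:

```python
import operator

class IshyNum(dict):
    def __init__(self, n):
        super().__init__()
        self.num = n

def _make_op(op):
    """Build a binary special method that applies op to self.num."""
    def method(self, other):
        return op(self.num, other)
    return method

# Attach the methods to the class, where operator lookup will find them.
for name in ("add", "sub", "mul", "truediv"):
    setattr(IshyNum, "__%s__" % name, _make_op(getattr(operator, name)))

number = IshyNum(5)
print(number + 5)
print(number / 2)
print(number * 3)
print(number - 3)
```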
from google.appengine.ext import db
class Log(db.Model):
    content = db.StringProperty(multiline=True)
class MyThread(threading.Thread):
    def run(self,request):
        #logs_query = Log.all().order('-date')
        #logs = logs_query.fetch(3)
        log=Log()
        log.content=request.POST.get('content',None)
        log.put()
def Log(request):
    thr = MyThread()
    thr.start(request)
    return HttpResponse('')
The error is:
Exception in thread Thread-1:
Traceback (most recent call last):
File "D:\Python25\lib\threading.py", line 486, in __bootstrap_inner
self.run()
File "D:\zjm_code\helloworld\views.py", line 33, in run
log.content=request.POST.get('content',None)
NameError: global name 'request' is not defined
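`Thread.start()` takes no arguments and calls `run()` without any, so data for the thread is usually handed over through the constructor. A minimal sketch of that pattern using only the standard library; the datastore write is replaced by a plain attribute assignment, and the `content` and `saved` names are illustrative. Note also that the view function `Log` in the question reuses the name of the `Log` model, so one of the two may need renaming:

```python
import threading

class MyThread(threading.Thread):
    def __init__(self, content):
        super().__init__()
        # Store what run() will need; run() itself receives no arguments.
        self.content = content
        self.saved = None

    def run(self):
        # Stand-in for: log = Log(); log.content = ...; log.put()
        self.saved = "logged: %s" % self.content

thr = MyThread("hello")
thr.start()
thr.join()   # wait for the thread so the result can be inspected
print(thr.saved)
```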
Hello everybody,
I have two nested lists of different sizes:
A = [[1, 7, 3, 5], [5, 5, 14, 10]]
B = [[1, 17, 3, 5], [1487, 34, 14, 74], [1487, 34, 3, 87], [141, 25, 14, 10]]
I'd like to gather every nested list from B whose elements [2:4] match the elements [2:4] of some nested list in A, and put them into list L:
L = [[1, 17, 3, 5], [141, 25, 14, 10]]
Would you help me with this?
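One possible approach is to collect the `[2:4]` slices of `A` in a set and keep the entries of `B` whose slice appears there. The slices are converted to tuples because lists are not hashable:

```python
A = [[1, 7, 3, 5], [5, 5, 14, 10]]
B = [[1, 17, 3, 5], [1487, 34, 14, 74], [1487, 34, 3, 87], [141, 25, 14, 10]]

# Slices of A that an entry of B has to match.
wanted = {tuple(a[2:4]) for a in A}

# Keep B's entries in their original order.
L = [b for b in B if tuple(b[2:4]) in wanted]
print(L)
```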
I'm using the following code to initialize logging in my application.
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
# log to a file
directory = '/reserved/DYPE/logfiles'
now = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = os.path.join(directory, 'dype_%s.log' % now)
file_handler = logging.FileHandler(filename)
file_handler.setLevel(logging.DEBUG)
formatter = logging.Formatter("%(asctime)s %(filename)s, %(lineno)d, %(funcName)s: %(message)s")
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)
# log to the console
console_handler = logging.StreamHandler()
level = logging.INFO
console_handler.setLevel(level)
logger.addHandler(console_handler)
logging.debug('logging initialized')
How can I close the current logging file and restart logging to a new file?
Note: I don't want to use RotatingFileHandler, because I want full control over all the filenames and the moment of rotation.
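One way to keep full control is to hold on to the `FileHandler`, and at rotation time attach a new one, then remove and close the old one. A minimal sketch; `switch_log_file` and the logger name `dype_demo` are hypothetical, and a temporary directory stands in for the real log directory:

```python
import logging
import os
import tempfile

def switch_log_file(logger, old_handler, new_filename):
    """Replace old_handler with a FileHandler writing to new_filename."""
    new_handler = logging.FileHandler(new_filename)
    new_handler.setLevel(old_handler.level)
    new_handler.setFormatter(old_handler.formatter)
    # Add the new handler first so no record falls into a gap.
    logger.addHandler(new_handler)
    logger.removeHandler(old_handler)
    old_handler.close()
    return new_handler

directory = tempfile.mkdtemp()   # stand-in for the real log directory
logger = logging.getLogger("dype_demo")
logger.setLevel(logging.DEBUG)

handler = logging.FileHandler(os.path.join(directory, "first.log"))
logger.addHandler(handler)
logger.debug("goes to first.log")

handler = switch_log_file(logger, handler, os.path.join(directory, "second.log"))
logger.debug("goes to second.log")
```

The caller chooses both the new file name and the moment of the switch, which is the control the question asks for.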
I'm trying to write a script to import a database file. I wrote the script to export the file like so:
import sqlite3
con = sqlite3.connect('../sqlite.db')
with open('../dump.sql', 'w') as f:
    for line in con.iterdump():
        f.write('%s\n' % line)
Now I want to be able to import that database. I tried:
import sqlite3
con = sqlite3.connect('../sqlite.db')
f = open('../dump.sql','r')
str = f.read()
con.execute(str)
but I'm not allowed to execute more than one statement. Is there a way to get it to run a .sql script directly?
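The `sqlite3` connection object has an `executescript()` method that accepts several statements in one string, which appears to fit this case. A self-contained sketch using in-memory databases in place of the files from the question; the `fruit` table is made up for illustration:

```python
import sqlite3

# Build a small source database and dump it to SQL text.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE fruit (name TEXT);
    INSERT INTO fruit VALUES ('apple');
    INSERT INTO fruit VALUES ('pear');
""")
dump = "\n".join(source.iterdump())

# Import: executescript() runs every statement in the dump.
target = sqlite3.connect(":memory:")
target.executescript(dump)

rows = target.execute("SELECT name FROM fruit ORDER BY name").fetchall()
print(rows)
```

With files, the import step would read the dump with `f.read()` and pass that string to `con.executescript(...)`.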
I use the following code to log a map, it is fast when it only contains zeroes, but as soon as there is actual data in the map it becomes unbearably slow... Is there any way to do this faster?
log_file = open('testfile', 'w')
for i, x in ((i, start + i * interval) for i in range(length)):
    log_file.write('%-5d %8.3f %13g %13g %13g %13g %13g %13g\n' % (i, x,
        map[0][i], map[1][i], map[2][i], map[3][i], map[4][i], map[5][i]))