How do I split on all nonalphanumeric characters, EXCEPT the apostrophe?
re.split('\W+',text)
works, but will also split on apostrophes. How do I add an exception to this rule?
Thanks!
I want to filter elements from a list of lists, and iterate over the elements of each element using a lambda. For example, given the list:
a = [[1,2,3],[4,5,6]]
suppose that I want to keep only elements where the sum of the list is greater than N. I tried writing:
filter(lambda x, y, z: x + y + z >= N, a)
but I get the error:
<lambda>() takes exactly 3 arguments (1 given)
How can I iterate while assigning values of each element to x, y, and z? Something like zip, but for arbitrarily long lists.
thanks,
p.s. I know I can write this using: filter(lambda x: sum(x)..., a) but that's not the point, imagine that these were not numbers but arbitrary elements and I wanted to assign their values to variable names.
say ive got a matrix that looks like:
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
how can i make it on seperate lines:
[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]
and then remove commas etc:
0 0 0 0 0
And also to make it blank instead of 0's, so that numbers can be put in later, so in the end it will be like:
_ 1 2 _ 1 _ 1
(spaces not underscores)
thanks
The input list can be more than 1 million numbers. When I run the following code with smaller 'repeats', its fine;
def sample(x):
length = 1000000
new_array = random.sample((list(x)),length)
return (new_array)
def repeat_sample(x):
i = 0
repeats = 100
list_of_samples = []
for i in range(repeats):
list_of_samples.append(sample(x))
return(list_of_samples)
repeat_sample(large_array)
However, using high repeats such as the 100 above, results in MemoryError. Traceback is as follows;
Traceback (most recent call last):
File "C:\Python31\rnd.py", line 221, in <module>
STORED_REPEAT_SAMPLE = repeat_sample(STORED_ARRAY)
File "C:\Python31\rnd.py", line 129, in repeat_sample
list_of_samples.append(sample(x))
File "C:\Python31\rnd.py", line 121, in sample
new_array = random.sample((list(x)),length)
File "C:\Python31\lib\random.py", line 309, in sample
result = [None] * k
MemoryError
I am assuming I'm running out of memory. I do not know how to get around this problem.
Thank you for your time!
i wrote the code like this
import smtplib
server=smtplib.SMTP('localhost')
then it raising an error like
error: [Errno 10061] No connection could be made because the target machine actively refused it
i am new to SMTP can you tell what exactly the problem is
Hi,
I have some strings that I want to delete some unwanted characters from them.
For example: Adam'sApple ---- AdamsApple.(case insensitive)
Can someone help me, I need the fastest way to do it, cause I have a couple of millions of records that have to be polished.
Thanks
Trying to integrate openmeetings with django website, but can't understand how properly configure ImportDoctor:
(here :// replaced with __ 'cause spam protection)
print url
http://sovershenstvo.com.ua:5080/openmeetings/services/UserService?wsdl
imp = Import('http__schemas.xmlsoap.org/soap/encoding/')
imp.filter.add('http__services.axis.openmeetings.org')
imp.filter.add('http__basic.beans.hibernate.app.openmeetings.org/xsd')
imp.filter.add('http__basic.beans.data.app.openmeetings.org/xsd')
imp.filter.add('http__services.axis.openmeetings.org')
d = ImportDoctor(imp)
client = Client(url, doctor = d)
client.service.getSession()
Traceback (most recent call last):
File "", line 1, in
File "/usr/lib/python2.6/site-packages/suds/client.py", line 539, in call
return client.invoke(args, kwargs)
File "/usr/lib/python2.6/site-packages/suds/client.py", line 598, in invoke
result = self.send(msg)
File "/usr/lib/python2.6/site-packages/suds/client.py", line 627, in send
result = self.succeeded(binding, reply.message)
File "/usr/lib/python2.6/site-packages/suds/client.py", line 659, in succeeded
r, p = binding.get_reply(self.method, reply)
File "/usr/lib/python2.6/site-packages/suds/bindings/binding.py", line 159, in get_reply
resolved = rtypes[0].resolve(nobuiltin=True)
File "/usr/lib/python2.6/site-packages/suds/xsd/sxbasic.py", line 63, in resolve
raise TypeNotFound(qref)
suds.TypeNotFound: Type not found: '(Sessiondata, http__basic.beans.hibernate.app.openmeetings.org/xsd, )'
what i'm doing wrong? please help and sorry for my english, but you are my last chance to save position :(
need webinars at morning (2.26 am now)
I am trying to search hindi words contained one line per file in file-1 and find them in lines in file-2. I have to print the line numbers with the number of words found.
This is the code:
import codecs
hypernyms = codecs.open("hindi_hypernym.txt", "r", "utf-8").readlines()
words = codecs.open("hypernyms_en2hi.txt", "r", "utf-8").readlines()
count_arr = []
for counter, line in enumerate(hypernyms):
count_arr.append(0)
for word in words:
if line.find(word) >=0:
count_arr[counter] +=1
for iterator, count in enumerate(count_arr):
if count>0:
print iterator, ' ', count
This is finding some words, but ignoring some others
The input files are:
File-1:
????
???????
File-2:
???????, ????-????
?????-???, ?????-???, ?????_???, ?????_???
????_????, ????-????, ???????_????
????-????
This gives output:
0 1
3 1
Clearly, it is ignoring ??????? and searching for ???? only. I have tried with other inputs as well. It only searches for one word. Any idea how to correct this?
I seem to have a hard time getting my head around this.
What's the difference between calendar.timegm() and time.mktime()?
Say I have a datetime.datetime with no tzinfo attached, shouldn't the two give the same output? Don't they both give the number of seconds between epoch and the date passed as a parameter? And since the date passed has no tzinfo, isn't that number of seconds the same?
>>> import calendar
>>> import time
>>> import datetime
>>> d = datetime.datetime(2010, 10, 10)
>>> calendar.timegm(d.timetuple())
1286668800
>>> time.mktime(d.timetuple())
1286640000.0
>>>
Is it possible a lambda function to have variable number of arguments?
For example, I want to write a metaclass, which creates a method for every method of some other class and this newly created method returns the opposite value of the original method and has the same number of arguments.
And I want to do this with lambda function. How to pass the arguments? Is it possible?
class Negate(type):
def __new__(mcs, name, bases, _dict):
extended_dict = _dict.copy()
for (k, v) in _dict.items():
if hasattr(v, '__call__'):
extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw)
return type.__new__(mcs, name, bases, extended_dict)
class P(metaclass=Negate):
def __init__(self, a):
self.a = a
def yes(self):
return True
def maybe(self, you_can_chose):
return you_can_chose
But the result is totally wrong:
>>>p = P(0)
>>>p.yes()
True
>>>p.not_yes() # should be False
Traceback (most recent call last):
File "<pyshell#150>", line 1, in <module>
p.not_yes()
File "C:\Users\Nona\Desktop\p10.py", line 51, in <lambda>
extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw)
TypeError: __init__() takes exactly 2 positional arguments (1 given)
>>>p.maybe(True)
True
>>>p.not_maybe(True) #should be False
True
I'd like to call a query with a field name filter that I wont know before run time... Not sure how to construct the variable name ...Or maybe I am tired.
field_name = funct()
locations = Locations.objects.filter(field_name__lte=arg1)
where if funct() returns name would equal to
locations = Locations.objects.filter(name__lte=arg1)
Not sure how to do that ...
Is there a more Pythonic way to put this loop together?:
while True:
children = tree.getChildren()
if not children:
break
tree = children[0]
UPDATE:
I think this syntax is probably what I'm going to go with:
while tree.getChildren():
tree = tree.getChildren()[0]
I have code that uses the BeautifulSoup library for parsing, but it is very slow. The code is written in such a way that threads cannot be used.
Can anyone help me with this?
I am using BeautifulSoup for parsing and than save into a DB. If I comment out the save statement, it still takes a long time, so there is no problem with the database.
def parse(self,text):
soup = BeautifulSoup(text)
arr = soup.findAll('tbody')
for i in range(0,len(arr)-1):
data=Data()
soup2 = BeautifulSoup(str(arr[i]))
arr2 = soup2.findAll('td')
c=0
for j in arr2:
if str(j).find("<a href=") > 0:
data.sourceURL = self.getAttributeValue(str(j),'<a href="')
else:
if c == 2:
data.Hits=j.renderContents()
#and few others...
c = c+1
data.save()
Any suggestions?
Note: I already ask this question here but that was closed due to incomplete information.
I have an array of files. I'd like to be able to break that array down into one array with multiple subarrays, each subarray contains files that were created on the same day. So right now if the array contains files from March 1 - March 31, I'd like to have an array with 31 subarrays (assuming there is at least 1 file for each day).
In the long run, I'm trying to find the file from each day with the latest creation/modification time. If there is a way to bundle that into the iterations that are required above to save some CPU cycles, that would be even more ideal. Then I'd have one flat array with 31 files, one for each day, for the latest file created on each individual day.
I'm looking for the most efficient way to add an element to a comma-separated string while maintaining alphabetical order for the words:
For example:
string = 'Apples, Bananas, Grapes, Oranges'
subtraction = 'Bananas'
result = 'Apples, Grapes, Oranges'
Also, a way to do this but while maintaining IDs:
string = '1:Apples, 4:Bananas, 6:Grapes, 23:Oranges'
subtraction = '4:Bananas'
result = '1:Apples, 6:Grapes, 23:Oranges'
Sample code is greatly appreciated. Thank you so much.
Any time I want to replace a piece of text that is part of a larger piece of text, I always have to do something like:
"(?P<start>some_pattern)(?P<replace>foo)(?P<end>end)"
And then concatenate the start group with the new data for replace and then the end group.
Is there a better method for this?
The tutorial on the django website shows this code for the models:
from django.db import models
class Poll(models.Model):
question = models.CharField(max_length=200)
pub_date = models.DateTimeField('date published')
class Choice(models.Model):
poll = models.ForeignKey(Poll)
choice = models.CharField(max_length=200)
votes = models.IntegerField()
Now, each of those attribute, is a class attribute, right? So, the same attribute should be shared by all instances of the class. A bit later, they present this code:
class Poll(models.Model):
# ...
def __unicode__(self):
return self.question
class Choice(models.Model):
# ...
def __unicode__(self):
return self.choice
How did they turn from class attributes into instance attributes? Did I get class attributes wrong?
I have a large graph that I am generating in matplotlib. I'd like to add a number of icons to this graph at certain (x,y) coordinates. I am wondering if there is any way to do that in matplotlib
Thank you
Here's the deal. I'm trying to write an arkanoid clone game and the thing is that I need a window menu like you get in pyGTK. For example File-(Open/Save/Exit) .. something like that and opening an "about" context where the author should be written.
I'm already using pyGame for writting the game logic. I've tried pgu to write the GUI but that doesn't help me, altough it has those menu elements I'm taking about, you can't include the screen of the game in it's container.
Does anybody know how to include such window menus with the usage of pyGame ?
First off, I'm relatively new to Google App Engine, so I'm probably doing something silly.
Say I've got a model Foo:
class Foo(db.Model):
name = db.StringProperty()
I want to use name as a unique key for every Foo object. How is this done?
When I want to get a specific Foo object, I currently query the datastore for all Foo objects with the target unique name, but queries are slow (plus it's a pain to ensure that name is unique when each new Foo is created).
There's got to be a better way to do this!
Thanks.
I'm trying to use reserved words in my grammar:
reserved = {
'if' : 'IF',
'then' : 'THEN',
'else' : 'ELSE',
'while' : 'WHILE',
}
tokens = [
'DEPT_CODE',
'COURSE_NUMBER',
'OR_CONJ',
'ID',
] + list(reserved.values())
t_DEPT_CODE = r'[A-Z]{2,}'
t_COURSE_NUMBER = r'[0-9]{4}'
t_OR_CONJ = r'or'
t_ignore = ' \t'
def t_ID(t):
r'[a-zA-Z_][a-zA-Z_0-9]*'
if t.value in reserved.values():
t.type = reserved[t.value]
return t
return None
However, the t_ID rule somehow swallows up DEPT_CODE and OR_CONJ. How can I get around this? I'd like those two to take higher precedence than the reserved words.
I know that I can dynamically add an instance method to an object by doing something like:
import types
def my_method(self):
# logic of method
# ...
# instance is some instance of some class
instance.my_method = types.MethodType(my_method, instance)
Later on I can call instance.my_method() and self will be bound correctly and everything works.
Now, my question: how to do the exact same thing to obtain the behavior that decorating the new method with @property would give?
I would guess something like:
instance.my_method = types.MethodType(my_method, instance)
instance.my_method = property(instance.my_method)
But, doing that instance.my_method returns a property object.
I'm looking for a way to compress an ascii-based string, any help?
I need also need to decompress it. I tried zlib but with no help.
What can I do to compress the string into lesser length?
code:
def compress(request):
if request.POST:
data = request.POST.get('input')
if is_ascii(data):
result = zlib.compress(data)
return render_to_response('index.html', {'result': result, 'input':data}, context_instance = RequestContext(request))
else:
result = "Error, the string is not ascii-based"
return render_to_response('index.html', {'result':result}, context_instance = RequestContext(request))
else:
return render_to_response('index.html', {}, context_instance = RequestContext(request))