regex in python, can this be improved upon?

Posted by tipu on Stack Overflow
Published on 2010-06-02T19:31:05Z Indexed on 2010/06/02 19:34 UTC

I have this piece of code that finds words that begin with @ or #:

p = re.findall(r'@\w+|#\w+', str)

Now what irks me about this is repeating \w+. I am sure there is a way to do something like

p = re.findall(r'(@|#)\w+', str)

That looks like it should produce the same result, but it doesn't; instead it returns only the # and @ characters. How can the regex be changed so that I am not repeating \w+? This code comes close:

p = re.findall(r'((@|#)\w+)', str)

But it returns [('@many', '@'), ('@this', '@'), ('#tweet', '#')] (notice the extra '@', '@', and '#').
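
For reference, here is a minimal sketch of what seems to be going on and two possible ways around it. When a pattern contains capturing groups, re.findall returns the group contents rather than the whole match, which would explain both results above. The sample text and the variable names (text, captured, non_capturing, char_class) are made up for illustration and are not from the original code; the sample was chosen only so that it yields the matches quoted above.

```python
import re

# Hypothetical sample text (an assumption, picked to reproduce the
# '@many', '@this', '#tweet' matches mentioned in the question).
text = "so @many people like @this #tweet"

# With one capturing group, findall returns only what the group matched.
captured = re.findall(r'(@|#)\w+', text)

# (?:...) groups without capturing, so findall returns the full match.
non_capturing = re.findall(r'(?:@|#)\w+', text)

# A character class avoids the group altogether for single characters.
char_class = re.findall(r'[@#]\w+', text)

print(captured)       # ['@', '@', '#']
print(non_capturing)  # ['@many', '@this', '#tweet']
print(char_class)     # ['@many', '@this', '#tweet']
```

The character class is probably the simpler choice when every alternative is a single character; the non-capturing group is the more general form if the prefixes were longer than one character.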

© Stack Overflow or respective owner