Regex to ensure group match doesn't end with a specific character

Posted by AJ on Stack Overflow
Published on 2010-05-19T15:11:30Z Indexed on 2010/05/19 15:20 UTC

I'm having trouble coming up with a regular expression to match a particular case. I have a list of TV shows in four formats:

  • Name.Of.Show.S01E01
  • Name.Of.Show.0101
  • Name.Of.Show.01x01
  • Name.Of.Show.101

What I want to match is the show name. My main problem is that my regex matches the name of the show with a trailing '.'. My regex is the following:

"^([0-9a-zA-Z\.]+)(S[0-9]{2}E[0-9]{2}|[0-9]{4}|[0-9]{2}x[0-9]{2}|[0-9]{3})"

Some Examples:

>>> import re

>>> SHOW_INFO = re.compile(r"^([0-9a-zA-Z\.]+)(S[0-9]{2}E[0-9]{2}|[0-9]{4}|[0-9]{2}x[0-9]{2}|[0-9]{3})")
>>> match = SHOW_INFO.match("Name.Of.Show.S01E01")
>>> match.groups()
('Name.Of.Show.', 'S01E01')
>>> match = SHOW_INFO.match("Name.Of.Show.0101")
>>> match.groups()
('Name.Of.Show.0', '101')
>>> match = SHOW_INFO.match("Name.Of.Show.01x01")
>>> match.groups()
('Name.Of.Show.', '01x01')
>>> match = SHOW_INFO.match("Name.Of.Show.101")
>>> match.groups()
('Name.Of.Show.', '101')

So the question is how do I avoid the first group ending with a period? I realize I could simply do:

var.strip(".")

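However, that doesn't handle the case of "Name.Of.Show.0101", where the first group swallows the leading "0" of the episode number and ends in ".0" rather than ".". Is there a way I could improve the regex to handle that case better?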

Thanks in advance.
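
One possible approach, offered as a minimal sketch rather than a definitive answer: make the first group lazy, match the separating dot explicitly outside both groups, and require the episode token to be followed by another dot or the end of the string. The names SHOW_INFO_FIXED and split_show are hypothetical, and the lookahead rests on the assumption that nothing but a dot (or nothing at all) follows the episode token.

```python
import re

# Hypothetical pattern name; a sketch, not the original poster's regex.
SHOW_INFO_FIXED = re.compile(
    r"^([0-9a-zA-Z.]+?)"  # show name, lazy so it stops at the first valid split
    r"\."                 # the separating dot, kept out of both groups
    r"(S[0-9]{2}E[0-9]{2}|[0-9]{4}|[0-9]{2}x[0-9]{2}|[0-9]{3})"
    r"(?=\.|$)"           # assumption: token is followed by a dot or end of string
)


def split_show(filename):
    """Return (show_name, episode_token), or None if nothing matches."""
    match = SHOW_INFO_FIXED.match(filename)
    return match.groups() if match else None


for name in (
    "Name.Of.Show.S01E01",
    "Name.Of.Show.0101",
    "Name.Of.Show.01x01",
    "Name.Of.Show.101",
):
    print(split_show(name))
```

With this pattern, "Name.Of.Show.0101" splits into ('Name.Of.Show', '0101') because the lazy first group stops at the earliest dot from which a complete episode token can be matched. A show name that itself contains a dot-delimited three- or four-digit number (a year, for example) would be split at that number instead, so this sketch only suits names without such components.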

© Stack Overflow or respective owner