Nested Groups in Regex

Posted by cryptic-star on Stack Overflow See other posts from Stack Overflow or by cryptic-star
Published on 2010-04-14T21:05:01Z Indexed on 2010/04/14 21:23 UTC
Read the original article Hit count: 360

Filed under:
|

I'm constructing a regex that is looking for dates. I would like to return the date found and the sentence it was found in. In the code below, the strings on either side of date_string should check for the conditions of a sentence. For your sake, I've omitted the regex for date_string - sufficed to say, it works for picking out dates. While the inside of date_string isn't important, it is grouped as one entire regex.

"((?:[^.|?|!]*)"+date_string+"(?:[^.|?|!]*[.|?|!]\s*))"

The problem is that date_string is only matching the last number of any given date, presumably because the regex in front of date_string is matching too far and overrunning the date regex. For example, if I say "Independence Day is July 4.", I will get the sentence and 4, even though it should match 'July 4'. In case you're wondering, my regex inside date_string are ordered in such a way that 'July 4' should match first. Is there any way to do this all in one regex? Or do I need to split it up somehow (i.e. split up all text into sentences, and then check each sentence)?

© Stack Overflow or respective owner

Related posts about regex

Related posts about nested