Converting regex statement for sentence extraction to Ruby

Posted by DavidP6 on Stack Overflow
Published on 2010-05-01T08:00:11Z Indexed on 2010/05/01 8:07 UTC

I found this regex on Wikipedia (http://en.wikipedia.org/wiki/Sentence_boundary_disambiguation) for sentence boundary disambiguation, but I am not able to use it in a Ruby split statement. I'm not too good with regex, so maybe I am missing something? This is the statement:

((?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]\"))(\s|\r\n)(?=\"?[A-Z])

and this is what I tried in Ruby, but no go:

text.split("((?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]\"))(\s|\r\n)(?=\"?[A-Z])")
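One likely reason the attempt above does nothing: when String#split is given a String, Ruby treats it as a literal delimiter rather than as a pattern, so the text would have to contain that exact character sequence. A minimal sketch of the same pattern as a Regexp literal follows. It assumes Ruby 1.9 or later (the 1.8 regex engine has no lookbehind), and the names SENTENCE_BOUNDARY and split_sentences are made up for illustration. The groups are written as non-capturing (?:...) because split also returns the text of any capture groups, which would scatter whitespace and empty strings through the result.

```ruby
# Sentence-boundary pattern from the question, written as a Regexp literal.
# Assumption: Ruby 1.9+ (lookbehind is not available in the 1.8 engine).
# SENTENCE_BOUNDARY and split_sentences are hypothetical names for this sketch.
SENTENCE_BOUNDARY = /
  (?:
    (?<=[a-z0-9)][.?!])    # preceded by a terminator after a letter, digit or ")"
    |
    (?<=[a-z0-9][.?!]")    # or by a terminator followed by a closing quote
  )
  (?:\r\n|\s)              # the whitespace that gets consumed by the split
  (?="?[A-Z])              # followed by an optional quote and a capital letter
/x

# Split text into sentences at each boundary match.
def split_sentences(text)
  text.split(SENTENCE_BOUNDARY)
end

sentences = split_sentences("Hello world. This is a test! Is it? Yes.")
puts sentences
```

With the sample string, this should give one sentence per element. Abbreviations such as "Dr. Smith" would still be split, since the pattern only looks at punctuation and capitalization.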

© Stack Overflow or respective owner
