Converting a regex statement for sentence extraction to Ruby
- by DavidP6
I found this regex statement on Wikipedia (http://en.wikipedia.org/wiki/Sentence_boundary_disambiguation) for sentence boundary disambiguation, but I am not able to use it in a Ruby split statement. I'm not too good with regex, so maybe I am missing something? This is the statement:
((?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]\"))(\s|\r\n)(?=\"?[A-Z])
and this is what I tried in Ruby, but no go:
text.split("((?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]\"))(\s|\r\n)(?=\"?[A-Z])")
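One likely cause is that String#split treats a String argument as a literal delimiter rather than as a pattern, so the quoted version above never matches anything. Below is a minimal sketch of the same pattern passed as a Regexp literal instead. It assumes Ruby 1.9 or later (needed for lookbehind support); the constant name SENTENCE_BOUNDARY and the sample text are made up for illustration. The capturing groups are switched to non-capturing (?:...) because split otherwise adds each group's match to the result array.

```ruby
# Assumption: Ruby 1.9+ (lookbehind support). SENTENCE_BOUNDARY is a
# hypothetical name chosen for this sketch.
#
# Same pattern as the Wikipedia one, with two changes:
#   * written as a Regexp literal, so split treats it as a pattern
#   * groups made non-capturing, so split does not return the separators
SENTENCE_BOUNDARY = /(?:(?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]"))(?:\r\n|\s)(?="?[A-Z])/

# Made-up sample text for illustration.
text = 'He said "Stop." Then he left. Did it work? Yes!'

sentences = text.split(SENTENCE_BOUNDARY)
# => ["He said \"Stop.\"", "Then he left.", "Did it work?", "Yes!"]

sentences.each { |s| puts s }
```

If the pattern has to stay in a String for some reason, Regexp.new(pattern_string) should also work, though the backslashes then need care inside double quotes (for example "\\s" rather than "\s").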