How to remove line breaks (or carriage returns) only from certain parts of a block of text?

Posted by Luke Allen on Super User
Published on 2012-09-02T06:16:05Z Indexed on 2012/09/02 9:41 UTC

Whenever I copy text from a PDF file, every visual line ends in a hard line break (or carriage return). I need a way to remove these line breaks without losing the paragraph breaks.

To do this I need a regular expression (regex) that removes only the line breaks that aren't preceded by a period.

For example, a line break right after a period is almost always legitimate and starts a new paragraph. A line break mid-word, or after a word with no period, is simply part of the bad formatting I need to get rid of.

My problem is that I don't know how to write a regex that removes the line breaks (^p marks in Word, CRLF, or whatever form they take) while skipping the ones that follow a period.
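
For illustration, here is a minimal sketch of the rule described above, written in Python rather than in Word's find-and-replace. The function name `unwrap_lines` and the sample string are hypothetical. It assumes each unwanted break should become a single space (which would leave a stray space inside a word that was split mid-word) and that no trailing whitespace sits between a period and its line break. The key piece is the negative lookbehind `(?<!\.)`, which matches a newline only when the character before it is not a period.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines, keeping the breaks that follow a period."""
    # Normalize Windows (CRLF) and bare CR line endings to LF first,
    # so the regex only has to deal with one kind of line break.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # (?<!\.) is a negative lookbehind: the newline matches only when
    # the character immediately before it is not a period.
    # Assumption: each removed break is replaced by a single space.
    return re.sub(r"(?<!\.)\n", " ", normalized)


# Hypothetical sample of text copied from a PDF with CRLF line endings.
sample = (
    "This sentence was wrapped\r\n"
    "by the PDF export.\r\n"
    "A new paragraph starts\r\n"
    "here."
)
print(unwrap_lines(sample))
```

The same lookbehind idea should carry over to any editor whose regex engine supports lookbehind. Sentences ending in `?` or `!` would need those characters added to the lookbehind, for example `(?<![.?!])`.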

© Super User or respective owner
