Cleaning strings in R: add punctuation without overwriting the last character

Posted by spearmint on Stack Overflow
Published on 2014-08-22T22:17:21Z

I'm new to R and unable to find other threads with a similar issue.

I'm cleaning data in which every line must end with punctuation. When I try to add, say, a period to a line that lacks one, the replacement overwrites the final character before the carriage return + line feed.

Sample code:

Data1 <- "%trn: dads sheep\r\n*MOT: hunn.\r\n%trn: yes.\r\n*MOT: ana mu\r\n%trn: where is it?"
Data2 <- gsub("[^[:punct:]]\r\n\\*", ".\r\n\\*", Data1)

The contents of Data2:

[1] "%trn: dads shee.\r\n*MOT: hunn.\r\n%trn: yes.\r\n*MOT: ana mu\r\n%trn: where is it?"

Notice the "p" of sheep was overwritten with the period. Any thoughts on how I could avoid this?
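
The likely cause is that [^[:punct:]] is part of the match, so the character it matches ("p") is replaced along with the line break. One possible fix, sketched below and not necessarily the only one, is to capture that character and write it back with a backreference. Data3 is an extra variant that assumes every unpunctuated line should get a period, not only lines followed by "*"; that intent is a guess, not something stated above.

```r
Data1 <- "%trn: dads sheep\r\n*MOT: hunn.\r\n%trn: yes.\r\n*MOT: ana mu\r\n%trn: where is it?"

# Capture the final non-punctuation character as group 1 and the line
# break plus "*" as group 2, then write both back around the new period.
Data2 <- gsub("([^[:punct:]])(\r\n\\*)", "\\1.\\2", Data1)

# Variant (assumed intent): punctuate every line that ends without
# punctuation, whatever the next line starts with. The lookahead (?=\r\n)
# checks for the line break without consuming it; it needs perl = TRUE.
Data3 <- gsub("([^[:punct:]])(?=\r\n)", "\\1.", Data1, perl = TRUE)
```

With the sample data, Data2 should keep "sheep" intact and add the period after it, while Data3 should also add a period after "ana mu". Neither pattern touches the last line, since no \r\n follows it.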
