Removing punctuation from text using R

Here’s how I read your question, along with an answer that is very close to @David Arenburg’s in the comment above.

 data <- '"I am a, new comer","to r,"please help","me:out","here"'
 gsub('[[:punct:] ]+',' ',data)
 [1] " I am a new comer to r please help me out here "

The extra space after [:punct:] adds the space character to the set being matched, and the + matches one or more consecutive characters from that set. Each such run is then replaced with the single space given as the replacement. This has the side effect, desirable in some cases, of shortening any sequence of spaces to a single space.
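
To see what the space inside the brackets changes, here is a small sketch that runs the same data through both patterns. The variable names (punct_only, punct_and_space, cleaned) are just mine for illustration, and the trimws() step is an extra I'm assuming you might want, since the result above keeps a leading and a trailing space.

```r
data <- '"I am a, new comer","to r,"please help","me:out","here"'

# Without the space in the bracket expression, only runs of punctuation are
# replaced, so a space that already follows punctuation survives and doubles up
# (note the two spaces between "a" and "new" in the result).
punct_only <- gsub('[[:punct:]]+', ' ', data)

# With the space included, punctuation and spaces are collapsed together
# into a single space.
punct_and_space <- gsub('[[:punct:] ]+', ' ', data)

# Optional extra step (an assumption about what you want): trimws() is base R
# (3.2.0 or later) and drops the leading and trailing space.
cleaned <- trimws(punct_and_space)
cleaned
```

If you need to keep some punctuation (apostrophes, say), the [[:punct:]] class would have to be swapped for an explicit list of characters, which I have not tried here.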
