Sanitize a string containing ASCII art

Posted by Toto on Stack Overflow
Published on 2010-03-28T10:07:32Z

I need to sanitize article titles when (creative) users try to "attract attention" with bad "ASCII art".

Examples:

  • Buy my product !!!!!!!!!!!!!!!!!!!!!!!!
  • Buy my product !? !? !? !? !? !?
  • Buy my product !!!!!!!!!.......!!!!!!!!
  • Buy my product <-----------

An acceptable solution would be to reduce any repetition of non-alphanumeric characters to 2 occurrences.

So I would get:

  • Buy my product !!
  • Buy my product !? !?
  • Buy my product !!..!!
  • Buy my product <--

This attempt did not work well:

preg_replace('/(\W{2,})(?=\1+)/', '', $title)

Any idea how to do this in PHP with a regex?

Other, better solutions are also welcome (I cannot strip all non-alphanumeric characters, since they can be meaningful).
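
One possible approach, offered as a sketch rather than a definitive answer: capture the shortest repeating unit of non-alphanumeric characters with a lazy group, require at least two further copies of it through a backreference, and replace the whole run with exactly two copies. The function name collapse_repeats is hypothetical, and the code assumes PHP 7 or later and UTF-8 input (hence the /u flag and the Unicode classes \p{L} and \p{N} instead of \W).

```php
<?php
// Hypothetical helper, not part of the original post.
// Assumes PHP 7+ and UTF-8 encoded titles.
function collapse_repeats(string $title): string
{
    // ([^\p{L}\p{N}]+?)  shortest run of characters that are neither letters nor digits
    // \1{2,}             followed by at least two more copies of that same run
    // '$1$1'             keeps exactly two copies of the repeated unit
    $result = preg_replace('/([^\p{L}\p{N}]+?)\1{2,}/u', '$1$1', $title);

    // preg_replace() returns null on error (for example invalid UTF-8 with /u),
    // so fall back to the untouched input in that case.
    return $result ?? $title;
}

$titles = [
    'Buy my product !!!!!!!!!!!!!!!!!!!!!!!!',
    'Buy my product !? !? !? !? !? !?',
    'Buy my product !!!!!!!!!.......!!!!!!!!',
    'Buy my product <-----------',
];

foreach ($titles as $t) {
    echo collapse_repeats($t), PHP_EOL;
}
// Prints:
// Buy my product !!
// Buy my product !? !?
// Buy my product !!..!!
// Buy my product <--
```

Because the unit is matched lazily, a multi-character pattern such as " !?" is treated as one unit and kept twice. One side effect of this sketch is that a legitimate three-dot ellipsis is also shortened to two dots, and three or more consecutive spaces become two.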

© Stack Overflow or respective owner
