Why can't I use accented characters next to a word boundary?

Posted by Rexxars on Stack Overflow
Published on 2010-03-15T19:15:48Z Indexed on 2010/03/15 19:19 UTC


I'm trying to build a dynamic regex that matches a person's name. It worked without problems on most names until I ran into accented characters at the end of a name.

Example: Some Fancy Namé

The regex I've used so far is:

/\b(Fancy Namé|Namé)\b/i

Used like this:

"Goal: Some Fancy Namé. Awesome.".replace(/\b(Fancy Namé|Namé)\b/i, '<a href="#">$1</a>');

This simply won't match. If I replace the é with an e, it matches just fine. If I try to match a name such as "Some Fancy Naméa", it works just fine. If I remove the last word boundary anchor, it also works.
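For anyone wanting to reproduce this in isolation, here is a minimal sketch. It also checks one likely explanation: JavaScript's `\w` covers only ASCII letters, digits and the underscore, and `\b` is defined in terms of `\w`, so there is no boundary between é and a following period. The names `isWordChar`, `matchesPeriod` and `matchesTrailingA` are mine, used only for illustration.

```javascript
// \w in JavaScript covers only [A-Za-z0-9_], so é is not a "word" character.
const isWordChar = (ch) => /\w/.test(ch);

console.log(isWordChar('e')); // true
console.log(isWordChar('é')); // false

const pattern = /\b(Fancy Namé|Namé)\b/i;

// é followed by "." is non-word next to non-word: no boundary, so no match.
const matchesPeriod = pattern.test('Goal: Some Fancy Namé. Awesome.');

// é followed by "a" is non-word next to word: \b succeeds there.
const matchesTrailingA = pattern.test('Some Fancy Naméa');

console.log(matchesPeriod);    // false
console.log(matchesTrailingA); // true
```

If that explanation holds, the trailing `\b` only ever succeeds after é when the next character is an ASCII word character, which is the opposite of what I want.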

Why doesn't the word boundary anchor work here? Any suggestions on how I could get around this problem?

I have considered using something like this, but I'm not sure what the performance penalty would be:

"Some fancy namé. Allow me to elaborate.".replace(/([\s.,!?])(fancy namé|namé)([\s.,!?]|$)/g, '$1<a href="#">$2</a>$3')

Suggestions? Ideas?
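One possible alternative to the delimiter-capturing replace above is to emulate a Unicode-aware word boundary with lookarounds. This is a sketch only: it assumes a runtime that supports ES2018 lookbehind and Unicode property escapes (the `u` flag), and `linkNames` is a hypothetical helper name, not an existing API.

```javascript
// Hypothetical helper: wrap each name in a link unless it touches
// a letter, combining mark, digit, or underscore on either side.
function linkNames(text, names) {
  // Escape regex metacharacters so each name is matched literally.
  const escaped = names.map((n) => n.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));

  // Characters treated as part of a word (an assumption; adjust as needed).
  const wordChar = '[\\p{L}\\p{M}\\p{N}_]';

  // Lookbehind and lookahead act as a Unicode-aware stand-in for \b.
  const pattern = new RegExp(
    `(?<!${wordChar})(${escaped.join('|')})(?!${wordChar})`,
    'giu'
  );

  return text.replace(pattern, '<a href="#">$1</a>');
}

console.log(linkNames('Some fancy namé. Allow me to elaborate.', ['Fancy Namé', 'Namé']));
```

Because the lookarounds consume nothing, there is no need to capture the surrounding delimiters and put them back, and a name at the very start or end of the string is handled without a special case. Longer names should come first in the list so that "Fancy Namé" wins over "Namé".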

© Stack Overflow or respective owner
