Unicode replacement characters for text matching

Posted by Christian Harms on Stack Overflow
Published on 2010-06-06T21:08:12Z Indexed on 2010/06/06 21:12 UTC

I am having some fun with Unicode text sources (all correctly encoded) and I want to match names. The classic problem: one source delivers the names correctly, while another has flattened them:

"Elbląg" vs. "Elblag" (see the character ą)

How can I "flatten" ą, á, â or à to a for better matching? Are there Unicode-to-ASCII mapping tables?
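
One common approach to this kind of flattening, sketched here in Python as an assumption (the question does not name a language), is to decompose each character with Unicode normalization and then drop the combining marks. The function name `flatten` is hypothetical; `unicodedata.normalize` and `unicodedata.combining` are from the Python standard library.

```python
import unicodedata


def flatten(text: str) -> str:
    """Strip diacritics by decomposing characters and dropping combining marks.

    Hypothetical helper for illustration; not a complete Unicode-to-ASCII table.
    """
    # NFKD splits a precomposed letter such as "ą" into a base letter "a"
    # followed by a combining mark (here U+0328, COMBINING OGONEK).
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero class for combining marks,
    # so keeping only class-0 characters removes the accents.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(flatten("Elbląg"))
    print(flatten("áâà"))
```

This only handles letters that have a canonical or compatibility decomposition. Characters such as "ł", "ø" or "ß" have none and pass through unchanged, so they would still need an explicit mapping table if the other source flattens them too.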
