Unicode replacement characters for text matching
Posted by Christian Harms on Stack Overflow
Published on 2010-06-06T21:08:12Z
unicode | special-characters
I am having some fun with Unicode text sources (all correctly encoded) and I want to match names. It is the classic problem: one source delivers the names correctly, while another delivers them with "flattened" characters:
"Elbląg" vs. "Elblag" (see the character ą)
How can I "flatten" ą, á, â or à to a for better matching? Are there Unicode-to-ASCII mapping tables?
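
One possible approach, sketched here in Python as an assumption (the question does not name a language): decompose each character with Unicode normalization form NFKD and then drop the combining marks. The function name `flatten` is made up for this illustration; `unicodedata.normalize` and `unicodedata.combining` are from the Python standard library.

```python
import unicodedata


def flatten(text: str) -> str:
    """Strip diacritics: decompose with NFKD, then drop combining marks.

    Hypothetical helper for illustration, not an official mapping table.
    """
    # NFKD splits a precomposed letter into a base letter plus combining marks,
    # e.g. U+0105 (a with ogonek) becomes "a" followed by U+0328.
    decomposed = unicodedata.normalize("NFKD", text)
    # combining() returns a non-zero class for combining marks, 0 for base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Compare both sources in their flattened form.
print(flatten("Elbl\u0105g") == flatten("Elblag"))
```

This only helps for letters that have a Unicode decomposition. Letters such as ł (U+0142) or ø (U+00F8) have none and pass through unchanged, so they would still need a small hand-written replacement table.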
© Stack Overflow or respective owner