How to diagnose, and reverse (not prevent) Unicode mangling

Posted by Steve Bennett on Stack Overflow
Published on 2010-06-02T05:40:39Z Indexed on 2010/06/02 5:43 UTC

Somewhere upstream of me, "something" happened that looks like Unicode mangling. One symptom is that a lowercase u-umlaut (ü) shows up as "ü" (i.e., the single byte FC has been turned into the two bytes C3 BC). Assuming that I have no control over this upstream process, how can I reverse-engineer what's going on? And if that is possible, can I crank the sausage machine backwards and get the original text back?

(If it helps to understand this case, the text I received was in the form of a MySQL dump. I think it got mangled somewhere in the dump/transport process.)
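
For illustration, the symptom described is consistent with text that was encoded as UTF-8 and then decoded as Latin-1 somewhere along the way: C3 BC is the UTF-8 encoding of ü (U+00FC), and those two bytes read as Latin-1 are "Ã" and "¼". The following is a minimal Python sketch of that hypothesis, not a confirmed diagnosis of this particular dump. The function names `mangle` and `unmangle` are hypothetical, and the sketch assumes exactly one round of mis-decoding and strict Latin-1 (if the upstream step actually used Windows-1252, the codec name would need to change).

```python
def mangle(text: str) -> str:
    """Reproduce the suspected fault: UTF-8 bytes misread as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def unmangle(text: str) -> str:
    """Try to undo one round of the suspected fault.

    Assumption: the text was UTF-8 that got decoded as Latin-1 exactly once.
    If the text does not fit that pattern, it is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin-1, or the bytes are not valid UTF-8,
        # so this hypothesis does not apply to this string.
        return text


if __name__ == "__main__":
    original = "\u00fc"            # lowercase u-umlaut
    broken = mangle(original)      # two characters: U+00C3, U+00BC
    print(broken.encode("latin-1").hex())  # the bytes behind the mangled text
    print(unmangle(broken) == original)
```

Because `unmangle` returns its input unchanged when the round trip fails, it leaves already-correct text such as a genuine "ü" alone. A string that happens to be valid under both readings would still be rewritten, so spot-checking the output is advisable.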

© Stack Overflow or respective owner