Best way to correct garbled data caused by incorrect encoding

Posted by ercan on Stack Overflow
Published on 2010-03-09T14:59:03Z, indexed on 2010/03/14 10:25 UTC

Filed under: encoding | sql-server-2005 | string-manipulation | regex

Hi all,

I have a set of data that contains garbled text fields because of encoding errors introduced during repeated imports and exports from one database to another. Most of the errors were caused by converting UTF-8 to ISO-8859-1. Strangely enough, the errors are not consistent: the word 'München' appears as 'MÃ¼nchen' in some places and as 'MÃœnchen' in others.

Is there a trick in SQL Server to correct this kind of crap? The first thing I can think of is to exploit the COLLATE clause so that Ã¼ is interpreted as ü, but I don't know exactly how. If it isn't possible to do this at the DB level, do you know of any tool that helps with bulk correction? (Not a manual find/replace tool, but one that somehow detects the garbled text and corrects it.)
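To make the kind of "guessing" fix concrete, here is a minimal sketch in Python, done outside the database. It assumes the damage is UTF-8 bytes that were decoded with a single-byte code page. The function name `fix_mojibake` and the list of candidate code pages are my own assumptions for illustration, not an existing tool or a SQL Server feature. The idea is to reverse the wrong step by encoding the text back with a candidate code page and checking whether the resulting bytes are valid UTF-8.

```python
def fix_mojibake(text, candidates=("cp1252", "iso-8859-1", "iso-8859-15")):
    """Try to undo 'UTF-8 bytes decoded with a single-byte code page'.

    Hypothetical helper: the candidate list is an assumption and should be
    adjusted to the code pages the data actually passed through.
    """
    # Pure ASCII is identical in all of these encodings; nothing to repair.
    if text.isascii():
        return text
    for codec in candidates:
        try:
            # Recover the original bytes, then read them as UTF-8.
            return text.encode(codec).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not representable in this code page, or not valid UTF-8:
            # try the next candidate.
            continue
    # No candidate produced valid UTF-8, so leave the value untouched.
    return text


# "M\u00c3\u00bcnchen" is the first garbled variant from the question.
print(fix_mojibake("M\u00c3\u00bcnchen"))  # München
print(fix_mojibake("M\u00fcnchen"))        # already correct, returned as-is
```

Correctly stored text with non-ASCII characters almost never re-encodes to valid UTF-8, which is why already-clean rows pass through unchanged. It is still a heuristic: a value that legitimately contains a sequence like 'Ã¼' would be altered, and rows that were garbled more than once, or changed in other ways along the way, would need extra handling. It would be worth running this on a copy of the data and reviewing the differences first.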

© Stack Overflow or respective owner
