charsets in MySQL replication

Posted by niklassaers on Stack Overflow
Published on 2010-06-15T08:20:26Z Indexed on 2010/06/15 8:22 UTC

Hi guys,

What can I do to ensure that replication will use latin1 instead of utf-8?

I'm migrating from a MySQL 5.1.22 server (master) on a Linux system to a MySQL 5.1.42 server (slave) on a FreeBSD system. My replication works well, but when my varchars contain non-ASCII characters, they turn "weird". The Linux/MySQL-5.1.22 server shows the following character set variables:

character_set_client=latin1
character_set_connection=latin1
character_set_database=latin1
character_set_filesystem=binary
character_set_results=latin1
character_set_server=latin1
character_set_system=utf8
character_sets_dir=/usr/share/mysql/charsets/
collation_connection=latin1_swedish_ci
collation_database=latin1_swedish_ci
collation_server=latin1_swedish_ci

The FreeBSD/MySQL-5.1.42 server, on the other hand, shows:

character_set_client=utf8
character_set_connection=utf8
character_set_database=utf8
character_set_filesystem=binary
character_set_results=utf8
character_set_server=utf8
character_set_system=utf8
character_sets_dir=/usr/local/share/mysql/charsets/
collation_connection=utf8_general_ci
collation_database=utf8_general_ci
collation_server=utf8_general_ci
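
To make the mismatch between the two servers easier to see, here is a minimal Python sketch that compares the two listings. The values are copied from the output above; the dictionary names and the `differing` helper are only illustrative and are not part of MySQL.

```python
# Values copied from the SHOW VARIABLES output of each server.
master = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "latin1",
    "character_set_filesystem": "binary",
    "character_set_results": "latin1",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
    "character_sets_dir": "/usr/share/mysql/charsets/",
    "collation_connection": "latin1_swedish_ci",
    "collation_database": "latin1_swedish_ci",
    "collation_server": "latin1_swedish_ci",
}
slave = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "utf8",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8",
    "character_set_server": "utf8",
    "character_set_system": "utf8",
    "character_sets_dir": "/usr/local/share/mysql/charsets/",
    "collation_connection": "utf8_general_ci",
    "collation_database": "utf8_general_ci",
    "collation_server": "utf8_general_ci",
}


def differing(a, b):
    """Return {name: (a_value, b_value)} for every variable whose value differs."""
    return {name: (a[name], b.get(name)) for name in sorted(a) if a[name] != b.get(name)}


diff = differing(master, slave)
for name, (m, s) in diff.items():
    print(f"{name}: master={m} slave={s}")
```

Only character_set_filesystem and character_set_system agree; every other character set and collation variable is latin1 on the master and utf8 on the slave.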

Setting any of these variables from the MySQL CLI has no effect, and setting them in my.cnf or on the command line prevents the server from starting.

Of course, both servers have the tables in question created the same way, in this case with DEFAULT CHARSET=latin1. Let me give you an example:

CREATE TABLE `test` (
  `test` varchar(5) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1

When I run "INSERT INTO test VALUES ('æøå')" on the master from a Latin1 terminal and then select the row on the slave from a Latin1-based terminal, I get:

+--------+
| test   |
+--------+
| Ã¦Ã¸Ã¥ |
+--------+

On a UTF-8 based terminal on the replication slave, test contains:

+--------+
| test   |
+--------+
| æøå    |
+--------+

So my conclusion is that it is converted to utf8, even though the table definition is latin1. Is this a correct conclusion?
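
One way to sanity-check this conclusion outside MySQL is to look at the bytes directly. The following is a minimal Python sketch, not a MySQL diagnostic: it only shows what the string 'æøå' looks like in each encoding and what a Latin1 terminal would display if it were handed the UTF-8 bytes. The variable names are illustrative.

```python
text = "æøå"

# The same three characters in each encoding.
latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # two bytes per character

# A Latin1 terminal interprets every byte as one character,
# so UTF-8 bytes show up as six characters instead of three.
seen_on_latin1_terminal = utf8_bytes.decode("latin-1")

print("latin1 bytes:", latin1_bytes.hex())
print("utf8 bytes:  ", utf8_bytes.hex())
print("utf8 bytes on a latin1 terminal:", seen_on_latin1_terminal)
```

If the slave really stores or returns UTF-8 bytes, a Latin1 terminal would show six characters for this value, while a UTF-8 terminal would show the expected three.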

Of course, on the master, in a latin1 terminal, it still says:

+------+
| test |
+------+
| æøå  | 
+------+

Since both system character sets are utf-8, I also set both terminals to utf-8 and ran "INSERT INTO test VALUES ('æøå')" again on the master. On the slave, in a utf-8 terminal, I get:

+------------+
| test       |
+------------+
| Ã¦Ã¸Ã      |
+------------+

If my conclusion is correct, all my replicated data is converted to utf8 (if it is utf8, it is treated as latin1 and converted to utf8), while all the old data in the table is, as the CREATE TABLE suggests, latin1. I'd love to convert it all to utf-8 if it weren't for the fact that legacy applications rely on it being latin1, so I need to keep it in latin1 while they still exist.
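
The "treated as latin1 and converted to utf8" case can be sketched the same way in Python. This is a model of the hypothesis, not of what MySQL does internally. It assumes that the UTF-8 bytes typed on the master are interpreted as latin1 by the connection, that varchar(5) silently truncates to five characters (non-strict SQL mode), and that the slave then converts the stored characters to utf8.

```python
# Bytes sent by a UTF-8 terminal for the three characters.
typed = "æøå".encode("utf-8")

# Assumption: the connection is declared latin1, so each byte is taken
# as one latin1 character, giving six characters instead of three.
misread = typed.decode("latin-1")

# Assumption: varchar(5) keeps only the first five characters.
stored = misread[:5]

# Hypothesised conversion of those five latin1 characters to utf8.
reencoded = stored.encode("utf-8")

print("characters after misreading:", len(misread))
print("characters stored:", len(stored))
print("bytes after conversion to utf8:", len(reencoded), reencoded.hex())
```

Under these assumptions the five stored characters grow to ten bytes, which would at least be consistent with the wider column in the last table. That is an inference from the output, not proof of what the replication thread does.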

What can I do to ensure that the replication reads latin1, treats it as latin1 and writes it on the slave as latin1?

Cheers

Nik
