How do browsers/PHP handle characters outside the page's character set?

Posted by Maarten on Stack Overflow
Published on 2010-03-30T13:21:07Z Indexed on 2010/03/30 13:23 UTC


I'm looking into how characters that fall outside a page's declared character set are handled.

In this case the page is set to ISO-8859-1, and the previous programmer decided to escape input using htmlentities($string,ENT_COMPAT). The result is then stored in Latin1 tables in MySQL.

Since the tables use the same character set as the page, I am wondering whether that htmlentities step is needed at all. I did some experiments on http://floris.workingweb.nl/experiments/characters.php, and it seems that some characters inside Latin1 are escaped, while the characters in, for example, a Czech name are not.

Is this because those characters are outside of Latin1? If so, the htmlentities call can be removed: it doesn't help for characters outside of Latin1 anyway, and for characters within Latin1 it is not needed, as far as I can see.
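To make the Latin1 half of this concrete, here is a minimal sketch of what the existing escaping step does to input that is already ISO-8859-1. The sample string and variable names are made up for illustration, and the encoding is passed explicitly as an assumption about the setup, since newer PHP versions default to UTF-8 rather than ISO-8859-1.

```php
<?php
// Hypothetical sample input: "café <b>" as ISO-8859-1 bytes.
// 0xE9 is the single-byte Latin1 encoding of "é".
$input = "caf\xE9 <b>";

// Same call as in the existing code, with the charset made explicit
// (an assumption; the original call relies on the PHP default).
$escaped = htmlentities($input, ENT_COMPAT, 'ISO-8859-1');

// The Latin1 byte becomes a named entity, and the markup
// characters are escaped as well.
echo $escaped, "\n"; // prints: caf&eacute; &lt;b&gt;
```

So within Latin1 the call turns every non-ASCII character into an entity, which is the part that looks redundant when both the page and the tables are already Latin1.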

© Stack Overflow or respective owner
