Encoding issue - 2nd band of ISO-8859-1 values do not get encoded?

Posted by bstack on Stack Overflow
Published on 2010-03-30T13:21:03Z

Hello,

I want to send the pound sign character, '£', encoded as ISO-8859-1 across the wire. I do this as follows:

// Requires: using System.Net; using System.Text;
var _encoding = Encoding.GetEncoding("iso-8859-1");
var _requestContent = _encoding.GetBytes(requestContent);
var _request = (HttpWebRequest)WebRequest.Create(target);

_request.Headers[HttpRequestHeader.ContentEncoding] = _encoding.WebName;
_request.Method = "POST";
_request.ContentType = "application/x-www-form-urlencoded; charset=iso-8859-1";
_request.ContentLength = _requestContent.Length;

// Write the ISO-8859-1 bytes as the request body; disposing the stream flushes and closes it.
using (var _requestStream = _request.GetRequestStream())
{
    _requestStream.Write(_requestContent, 0, _requestContent.Length);
}

When I put a breakpoint at the target, I expect to receive '%a3', but I receive '%u00a3' instead.

The printable characters of ISO-8859-1 fall into two ranges (ref: http://en.wikipedia.org/wiki/ISO_8859-1):

The lower range, 20 to 7E, where all characters seem to be encoded correctly.
The higher range, A0 to FF, where all characters seem to be encoded as their Unicode equivalent value.
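To make the expectation concrete, here is a minimal sketch of what the ISO-8859-1 byte values look like for characters in both ranges. The class and method names (Latin1Bytes, ToHex) are hypothetical and only for illustration; the only framework calls used are Encoding.GetEncoding and BitConverter.ToString.

```csharp
using System;
using System.Text;

public static class Latin1Bytes
{
    // Hypothetical helper: returns the ISO-8859-1 bytes of the text
    // as uppercase hex pairs separated by spaces.
    public static string ToHex(string text)
    {
        byte[] bytes = Encoding.GetEncoding("iso-8859-1").GetBytes(text);
        // BitConverter.ToString joins the hex pairs with '-'; swap that for a space.
        return BitConverter.ToString(bytes).Replace("-", " ");
    }
}
```

For example, Latin1Bytes.ToHex("\u00A3") gives "A3" and Latin1Bytes.ToHex("~") gives "7E", so in both ranges each character is a single byte equal to its code point. That suggests the bytes written to the request stream are not where the '%u00a3' form comes from.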

As '£' is in the higher range A0 to FF, it gets encoded to %u00a3. In fact, when I use the first few characters of the higher range, i.e. '¡¢£¤¥¦§¨©ª«¬®', I get '%u00a1%u00a2%u00a3%u00a4%u00a5%u00a6%u00a7%u00a8%u00a9%u00aa%u00ab%u00ac%u00ae'. This behaviour is consistent.
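The two forms being compared can be sketched side by side. This is an illustration only, not the code path that the framework or the target actually runs: PercentForms, Latin1Percent and UnicodePercent are hypothetical names, and the rule of escaping everything except ASCII letters and digits is a simplifying assumption.

```csharp
using System;
using System.Text;

public static class PercentForms
{
    // Hypothetical helper: percent-encodes each ISO-8859-1 byte that is not
    // an ASCII letter or digit, so U+00A3 becomes "%a3".
    // Characters outside ISO-8859-1 are replaced with '?' by the encoder.
    public static string Latin1Percent(string text)
    {
        var sb = new StringBuilder();
        foreach (byte b in Encoding.GetEncoding("iso-8859-1").GetBytes(text))
        {
            char c = (char)b;
            bool plain = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
            if (plain) sb.Append(c);
            else sb.Append('%').Append(b.ToString("x2"));
        }
        return sb.ToString();
    }

    // Hypothetical helper: the non-standard %uXXXX form, which escapes the
    // UTF-16 code unit instead of an encoded byte, so U+00A3 becomes "%u00a3".
    public static string UnicodePercent(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c < 0x80 && char.IsLetterOrDigit(c)) sb.Append(c);
            else if (c < 0x80) sb.Append('%').Append(((int)c).ToString("x2"));
            else sb.Append("%u").Append(((int)c).ToString("x4"));
        }
        return sb.ToString();
    }
}
```

The two helpers agree for everything below 80 and differ only above it, which matches the pattern described here: the lower range looks correct either way, while the higher range shows which form was used.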

My question is: why do characters in the higher range A0 to FF get encoded to their Unicode value, and not to their equivalent ISO-8859-1 value?

Help would be greatly appreciated...

Billy

© Stack Overflow or respective owner
