Efficient way to ASCII encode UTF-8
Posted by Andreas Gohr on Stack Overflow
Published on 2010-04-02T14:59:00Z
        
I'm looking for a simple and efficient way to store UTF-8 strings in 7-bit ASCII. By efficient I mean the following:
- all ASCII characters in the input should stay as they are in the output
- the resulting string should be as short as possible
- the operation needs to be reversible without any data loss
- there should be no restriction on the input length
- the whole UTF-8 range (every Unicode code point) should be allowed
My first idea was to use Punycode (IDNA), as it meets the first three requirements, but it fails the last two.
Can anyone recommend an alternative encoding scheme? Even better if there's some code available to look at.
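
To make the requirements above concrete, here is a minimal Python sketch of how a candidate scheme could be checked against them. The helper `check_candidate` and the sample strings are hypothetical names made up for illustration. The candidate plugged in is Python's built-in `utf-7` codec, chosen only as one possibility to test and not as a recommended answer: UTF-7 escapes a few ASCII characters such as `+`, so it meets the first requirement only approximately.

```python
def check_candidate(encode, decode, samples):
    """Report, per sample, how a candidate scheme behaves.

    encode: str -> bytes, decode: bytes -> str (both supplied by the caller).
    """
    report = []
    for text in samples:
        encoded = encode(text)
        report.append({
            "text": text,
            "encoded": encoded,
            # every output byte must fit in 7 bits
            "ascii_only": all(b < 0x80 for b in encoded),
            # decoding must give back exactly the input
            "lossless": decode(encoded) == text,
            # shorter is better; compare across candidates
            "length": len(encoded),
        })
    return report


# Candidate under test (an assumption, not the asker's choice): UTF-7.
def utf7_encode(text):
    return text.encode("utf-7")


def utf7_decode(data):
    return data.decode("utf-7")


# Hypothetical samples: plain ASCII, Latin-1 range, CJK,
# a code point outside the BMP, and a long input.
SAMPLES = [
    "plain ascii",
    "na\u00efve caf\u00e9",
    "\u65e5\u672c\u8a9e",
    "emoji \U0001F600",
    "x" * 10_000 + "\u00e9",
]

if __name__ == "__main__":
    for row in check_candidate(utf7_encode, utf7_decode, SAMPLES):
        print(row["ascii_only"], row["lossless"], row["length"],
              row["encoded"][:40])
```

Any other scheme can be compared the same way by passing a different `encode`/`decode` pair; the `length` column gives a rough handle on the "as short as possible" requirement.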