Why are there so many spaces and line breaks in Unicode?
        Posted by maaartinus on Programmers
        
        Published on 2011-01-30T01:12:49Z
        Indexed on 2011/01/30 7:31 UTC
        
        
unicode
Unicode has about two dozen space characters (the class below lists 26 code points):
[\u0009\u000A-\u000D\u0020\u0085\u00A0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]
and 6 line breaks: not only CRLF, LF, and CR, but also NEL (U+0085), PS (U+2029), and LS (U+2028).
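To make the two lists concrete, here is a minimal Python sketch (not part of the original question) that compiles the character class quoted above and splits text on the six line breaks. The names `WHITESPACE`, `LINE_BREAK`, `count_class_members`, and `split_lines` are made up for illustration.

```python
import re
import sys

# Character class as quoted in the question. Python's re module understands
# \uXXXX escapes in str patterns, so a raw string is enough here.
WHITESPACE = re.compile(
    r"[\u0009\u000A-\u000D\u0020\u0085\u00A0\u1680\u180E"
    r"\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]"
)

# The six line breaks named in the question. CRLF comes first so that it is
# consumed as one break instead of as CR followed by LF.
LINE_BREAK = re.compile(r"\r\n|[\n\r\u0085\u2028\u2029]")


def count_class_members():
    """Count the code points matched by the whitespace class (brute force)."""
    return sum(
        1 for cp in range(sys.maxunicode + 1) if WHITESPACE.fullmatch(chr(cp))
    )


def split_lines(text):
    """Split text on any of the six line breaks, dropping the breaks."""
    return LINE_BREAK.split(text)
```

Python's built-in `str.splitlines()` is a possible alternative, but it also splits on VT, FF, and the FS/GS/RS control characters, which is broader than the six breaks listed here; the explicit pattern keeps the behavior aligned with the question.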
Maybe I could understand most of the spaces and PS ("Paragraph separator"), but what are "Next Line" and "Line separator" good for?
It all looks as if it had been invented by a very big committee where everybody wanted their own space and the leaders were granted one line break each. But seriously, how do you deal with it when your programming language doesn't support it (or gets it wrong, as Java does, for example)?
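One possible way to cope, sketched here in Python as an assumption rather than a recommendation, is to normalize input at the boundary: map every line break to LF and every exotic space to U+0020 before the rest of the program sees the text. The function name `normalize` is hypothetical, and flattening U+00A0 and U+202F discards their no-break meaning, which may or may not be acceptable for a given application.

```python
import re

# Line breaks other than plain LF: CRLF (matched first, as a unit), CR, NEL, LS, PS.
_LINE_BREAK = re.compile(r"\r\n|[\r\u0085\u2028\u2029]")

# Space characters from the question's class, excluding U+0020 itself,
# TAB, and the line-break characters handled above.
_OTHER_SPACE = re.compile(
    r"[\u00A0\u1680\u180E\u2000-\u200A\u202F\u205F\u3000]"
)


def normalize(text):
    """Return text with all line breaks as LF and all listed spaces as U+0020."""
    text = _LINE_BREAK.sub("\n", text)
    return _OTHER_SPACE.sub(" ", text)
```

After this step, code that only knows about `"\n"` and `" "` sees consistent input; the trade-off is that the distinctions Unicode encodes (paragraph versus line separator, breaking versus no-break space) are lost.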