In UTF-8 collation, why is 11- less than 1-?
Posted
by
???
on Super User
Published on 2011-01-01T13:32:38Z
Indexed on
2011/01/01
13:55 UTC
I found that sorting in ASCII order gives:
1-
11-
while sorting in a UTF-8 locale gives:
11-
1-
This feels counter-intuitive to me, and it is not dictionary order.
Isn't the character '-' (U+002D) always less than [0-9] (U+0030 to U+0039)?
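To make the premise concrete, here is a minimal Python sketch that compares the two strings by raw code point and by UTF-8 bytes. It only illustrates the numeric values. It does not reproduce what a locale-aware `sort` does, and the variable names are made up for illustration.

```python
# Compare the two strings by numeric value only (no locale involved).
items = ["11-", "1-"]

# '-' is U+002D and '1' is U+0031, so '-' has the smaller code point.
print(hex(ord("-")), hex(ord("1")))

# Python's default string comparison is by code point.
by_codepoint = sorted(items)

# UTF-8 encoding preserves code point order, so byte order agrees.
by_utf8_bytes = sorted(items, key=lambda s: s.encode("utf-8"))

print(by_codepoint)
print(by_utf8_bytes)
```

Both orderings put 1- first, which suggests the UTF-8 encoding itself is not what changes the result. A locale-aware comparison consults the locale's LC_COLLATE rules rather than these numeric values.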
What is the general rule for UTF-8 collation?
And how can I bypass it on Linux, so that - sorts before [0-9] while the order of all other characters stays unchanged under UTF-8? (The change should affect the output of ls --sort, sort, etc.)
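One partial workaround I am aware of is to force the C locale for collation when invoking the tool. The sketch below does this from Python for the system `sort`. It assumes a `sort` executable is on the PATH, and `sort_bytewise` is a hypothetical helper name. Note that this switches every character to byte order, not only '-', so it does not fully meet the requirement above.

```python
import os
import subprocess


def sort_bytewise(lines):
    """Run the system `sort` with collation forced to the C locale."""
    # LC_ALL overrides LC_COLLATE and LANG for the child process only.
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["sort"],
        input="\n".join(lines) + "\n",
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    return result.stdout.splitlines()


print(sort_bytewise(["11-", "1-"]))
```

The same environment override can be applied to other locale-aware commands such as `ls`. Changing the weight of '-' alone would presumably require a custom locale definition, which this sketch does not attempt.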
© Super User or respective owner