In UTF-8 collation, why is 11- less than 1-?
Posted by ??? on Super User
Published on 2011-01-01T13:32:38Z
Indexed on 2011/01/01 13:55 UTC
I found that sorting in ASCII gives:
1-
11-
and in UTF-8:
11-
1-
This feels counter-intuitive to me, and it is not dictionary order.
Isn't the character '-' (U+002D) always less than [0-9] (U+0030-U+0039)?
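To show the ordering I expected, here is a minimal Python sketch that compares the two strings by raw code point. It only illustrates the plain code-point comparison (the ASCII result above), not whatever the UTF-8 locale does; the variable names are mine.

```python
# Sort the two strings by raw Unicode code point, with no locale involved.
names = ["11-", "1-"]

# '-' is U+002D and '1' is U+0031, so the strings first differ at the
# second character, where '-' < '1'. That puts "1-" before "11-".
by_code_point = sorted(names)
print(by_code_point)  # ['1-', '11-']

# The code points themselves, for reference.
print(hex(ord("-")), hex(ord("1")))  # 0x2d 0x31
```

This matches the ASCII result, which is why the UTF-8 result surprises me.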
What's the general rule in UTF-8 collation?
And how can I bypass it on Linux, so that - sorts before [0-9] while all other characters keep their UTF-8 ordering? (The change should affect the output of ls --sort, sort, etc.)
© Super User or respective owner