Where can I find an array of the unassigned Unicode code points for a particular block?

Posted by gitparade on Stack Overflow
Published on 2010-05-22T13:52:44Z Indexed on 2010/05/23 11:20 UTC

At the moment, I'm writing these arrays by hand.

For example, the Miscellaneous Mathematical Symbols-A block has an entry in the hash like this:

my %symbols = (
    ...
    miscellaneous_mathematical_symbols_a => [(0x27C0..0x27CA), 0x27CC,
        (0x27D0..0x27EF)],
    ...
);

The simpler, 'continuous' array

miscellaneous_mathematical_symbols_a => [0x27C0..0x27EF]

doesn't work because Unicode blocks have holes in them. For example, there's nothing at 0x27CB. Take a look at the code chart [PDF].

Writing these arrays by hand is tedious, error-prone and a bit fun. And I get the feeling that someone has already tackled this in Perl!
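
For illustration, here is a minimal sketch of how such an array might be generated instead of typed out. It assumes the core Unicode::UCD module's charblock() function to look up the block's range, and the \p{Assigned} property to filter out the holes. The helper name assigned_in_block is hypothetical. The result reflects the Unicode version bundled with the running perl, so the holes may differ from those in a particular edition of the code chart.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charblock);

# Hypothetical helper: return an array ref of the assigned code points in a
# named Unicode block, according to the Unicode version this perl ships with.
sub assigned_in_block {
    my ($block_name) = @_;

    # Given a block name, charblock() returns a range set of the form
    # [ [ $first, $last, ... ] ]; it returns nothing for an unknown name.
    my $ranges = charblock($block_name)
        or die "Unknown block: $block_name\n";

    my @assigned;
    for my $range (@$ranges) {
        my ($first, $last) = @$range;
        # \p{Assigned} matches every code point whose General_Category
        # is not Cn (Unassigned), so the holes are skipped.
        push @assigned, grep { chr($_) =~ /\p{Assigned}/ } $first .. $last;
    }
    return \@assigned;
}

my %symbols = (
    miscellaneous_mathematical_symbols_a =>
        assigned_in_block('Miscellaneous Mathematical Symbols-A'),
);

printf "%d assigned code points\n",
    scalar @{ $symbols{miscellaneous_mathematical_symbols_a} };
```

The block name is passed exactly as it appears in the Unicode Blocks.txt file. Inverting the grep (using \P{Assigned}) would list the holes rather than the assigned code points.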

© Stack Overflow or respective owner