Can this way of storing typed objects be improved?

Posted by Pindatjuh on Stack Overflow See other posts from Stack Overflow or by Pindatjuh
Published on 2010-04-04T20:11:56Z Indexed on 2010/04/08 18:23 UTC
Read the original article Hit count: 238

Filed under:
|

This is an "can it be improved"-question.

Topic: Storing typed objects in memory.

Background information: I'm building a compiler for the x86-32 Windows platform for my language. My goal includes typed objects.

Idea: Every primitive is a semi-class (it can be used as if it was a normal class, but it's stored more compact). Every class is represented by primitives and some meta-data (containing class-properties, inheritance stuff, etc.). The meta-data is complex: it doesn't use fields but instead context-switches. For primitives, the meta-data is very small, compared to a "real" class, which is alot bigger. This enables another idea that "primitives are objects", in my language, which I found nessecairy.

Example: If I have an array of 32 booleans, then the pure content of this array is exactly 4 byte (32 bits of booleans). The meta-data will contain flags that the type is an array of booleans, which contains 32 entries.

The meta-data is very compacted, on bit-level: using a sort of "packing" mechanism, which is read by a FSM at runtime, when doing inspection of the type (like when passing the object to methods for checking, etc.) For instance (read from left to right, top to bottom, remember vertical position when going to the right, and check nearest column header for meaning of switch):

Primitive? Array? Type-Meta 1 Byte?  || Size (1 byte)
1          1      [...]     1           [...] done
                            0        2 Bytes? || Size (2 bytes)
                                     1           [...] done
                                              || Size (4 bytes)
                                     0           [...] done
                  Integer?  1 Byte?  2 Bytes?
           0      1         0        1 done
                            1 done   0 done
                            Boolean? Byte?
                  0         1        0 done
                                     1 done
                                     More-Primitives
                            0        ....
           Class-Stuff (Huge)
 0         ...

(After reaching done the data is inserted. || = byte alignment. [...] is variable sized. ... is not described here, for simplicity. And let's call them cost-based-data-structures.)

For an array of 32 booleans containing all true values, the memory for this type would be (read top-down):

1                                   Primitive
1                                   Array
1                                   ArrayType: Primitive
0                                              Not-Array
0                                              Not-Integer
1                                              Boolean
0                                              Not-Byte (thus bit)
1                                   Integer Size: 1 Byte
00100000                            Array size
01010101 01010101 01010101 01010101 Data (user defined)

Thus, 8 bytes represent 32 booleans in an array:

11100101 00100000 01010101 01010101 01010101 01010101

How can I improve this? (Both performance- and memory-consumption wise)

© Stack Overflow or respective owner

Related posts about data-structure

Related posts about bit-packing