Optimal Serialization of Primitive Types
- by Greg Dean
We are beginning to roll out more and more WAN deployments of our product (a .NET fat client with an IIS-hosted Remoting back end). Because of this, we are trying to reduce the size of the data on the wire.
We have overridden the default serialization by implementing ISerializable (similar to this), and we are seeing size reductions of anywhere from 12% to 50%. Most of our efforts focus on optimizing arrays of primitive types. Does anyone know of a fancy way of serializing primitive types, beyond the obvious?
For example, today we serialize an array of ints as follows:
[4-bytes (array length)][4-bytes][4-bytes]
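To make the layout above concrete, here is a minimal sketch of it, written in Python only for brevity. The function names are hypothetical, and the little-endian byte order and signed 32-bit length prefix are assumptions; our actual code lives inside the ISerializable implementation and is not shown here.

```python
import struct

def pack_int32_array(values):
    # Layout: [4-byte length][4-byte value][4-byte value]...
    # Assumption: little-endian, signed 32-bit for both length and elements.
    return struct.pack(f"<i{len(values)}i", len(values), *values)

def unpack_int32_array(buf):
    # Read the 4-byte length prefix, then that many 4-byte values.
    (count,) = struct.unpack_from("<i", buf, 0)
    return list(struct.unpack_from(f"<{count}i", buf, 4))

payload = pack_int32_array([1, 2, 3])
print(len(payload))  # 4 bytes of length + 3 * 4 bytes of data
```

With this layout every element costs 4 bytes regardless of its value, which is the cost we would like to beat.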
Can anyone do significantly better?
The most obvious example of a significant improvement, for boolean arrays, is putting 8 bools in each byte, which we already do.
Note: Saving 7 bits per bool may seem like a waste of time, but when you are dealing with large volumes of data (which we are), it adds up very fast.
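For reference, the bool packing mentioned above looks roughly like the sketch below, again in Python for brevity. The function names are hypothetical, and the 4-byte length prefix and the least-significant-bit-first ordering within each byte are assumptions made for illustration, not necessarily what our serializer does.

```python
import struct

def pack_bools(flags):
    # Layout: [4-byte count][ceil(count / 8) bytes of packed bits]
    # Assumption: bit i of the array is stored in byte i // 8, bit i % 8 (LSB first).
    count = len(flags)
    out = bytearray(struct.pack("<i", count))
    out.extend(bytes((count + 7) // 8))  # zero-filled bit storage
    for i, flag in enumerate(flags):
        if flag:
            out[4 + i // 8] |= 1 << (i % 8)
    return bytes(out)

def unpack_bools(buf):
    # The count prefix tells us how many bits of the last byte are meaningful.
    (count,) = struct.unpack_from("<i", buf, 0)
    return [bool(buf[4 + i // 8] & (1 << (i % 8))) for i in range(count)]
```

The count prefix is needed because the array length is not always a multiple of 8, so the final byte may contain padding bits.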
Note: We want to avoid general compression algorithms because of the latency associated with them. Remoting only supports buffered requests/responses (no chunked encoding). I realize there is a fine line between compression and optimal serialization, but our tests indicate we can afford very specific serialization optimizations at very little cost in latency, whereas reprocessing the entire buffered response into a new compressed buffer is too expensive.