Arithmetic Coding Questions

Posted by Xophmeister on Programmers See other posts from Programmers or by Xophmeister
Published on 2012-12-17T12:31:43Z Indexed on 2012/12/17 17:13 UTC
Read the original article Hit count: 478

Filed under:
|
|

I have been reading up on arithmetic coding and, while I understand how it works, all the guides and instructions I've read start with something like:

Set up your intervals based upon the frequency of symbols in your data; i.e., more likely symbols get proportionally larger intervals.

My main query is, once I have encoded my data, presumably I also need to include this statistical model with the encoding, otherwise the compressed data can't be decoded. Is that correct? I don't see this mentioned anywhere -- the most I've seen is that you need to include the number of iterations (i.e., encoded symbols) -- but unless I'm missing something, this also seems necessary to me.

If this is true, that will obviously add an overhead to the final output. At what point does this outweigh the benefits of compression (e.g., say if I'm trying to compress just a few thousand bits)? Will the choice of symbol size also make a significant difference (e.g., if I'm looking at 2-bit words, rather than full octets/whatever)?

© Programmers or respective owner

Related posts about coding

Related posts about compression