Storing large numbers of varying size objects on disk

Posted by Foredecker on Stack Overflow
Published on 2008-11-04T04:30:30Z Indexed on 2010/04/06 21:33 UTC

I need to develop a system for storing a large number (tens to hundreds of thousands) of objects. Each object is email-like: there is a main text body and several ancillary text fields of limited size. A body will range from a few bytes to several KB in size.

Each item will have a single unique ID (probably a GUID) that identifies it.

The store will only be written to when an object is added to it. It will be read often. Deletions will be rare. The data is almost all human readable text so it will be readily compressible.
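One way to picture the write-once, read-often pattern described above is an append-only file of length-prefixed records. The following is only a minimal sketch under assumptions that are not in the question: the function names `append_record` and `read_record`, the 4-byte native-endian length prefix, and the single-file layout are all hypothetical, and compression and error handling are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <limits>
#include <string>

// Hypothetical helper: appends one length-prefixed record to the file at
// `path` and returns the byte offset at which the record starts.
std::uint64_t append_record(const std::string& path, const std::string& body) {
    // The sketch stores the length in 32 bits, so larger bodies are rejected.
    assert(body.size() <= std::numeric_limits<std::uint32_t>::max());

    std::ofstream out(path, std::ios::binary | std::ios::app);
    out.seekp(0, std::ios::end);  // make tellp() report the current file size
    const std::uint64_t offset = static_cast<std::uint64_t>(out.tellp());

    const std::uint32_t len = static_cast<std::uint32_t>(body.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(body.data(), static_cast<std::streamsize>(body.size()));
    return offset;
}

// Hypothetical helper: reads back the record that starts at `offset`.
std::string read_record(const std::string& path, std::uint64_t offset) {
    std::ifstream in(path, std::ios::binary);
    in.seekg(static_cast<std::streamoff>(offset));

    std::uint32_t len = 0;
    in.read(reinterpret_cast<char*>(&len), sizeof len);

    std::string body(len, '\0');
    in.read(&body[0], static_cast<std::streamsize>(len));
    return body;
}
```

Because records are never rewritten in place, the offset returned by `append_record` stays valid for the life of the file, which is what makes a simple offset-based lookup possible. Rare deletions could be handled by dropping the entry from whatever index refers to the record and compacting the file occasionally, though that is a design guess rather than a requirement stated here.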

A system that lets me issue the I/Os and manage the memory and caching would be ideal.

I'm going to keep the indexes in memory, using them to map index entries to the single (primary) key for each object. Once I have the key, I'll load the object from disk or from the cache.
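The lookup path described here (key, then cache, then disk) could be sketched roughly as follows. This is an illustration under assumptions, not a design taken from the question: the class name `CachedStore`, the use of a string ID as the key, the idea that the index maps each ID to a file offset, and the least-recently-used eviction policy are all hypothetical choices. The disk read is passed in as a callback so the sketch stays independent of any particular file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical in-memory index plus a small LRU cache of loaded bodies.
class CachedStore {
public:
    // Callback that reads one object body from disk given its offset.
    using Loader = std::function<std::string(std::uint64_t)>;

    CachedStore(Loader loader, std::size_t capacity)
        : loader_(std::move(loader)), capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Registers where the object with this ID lives on disk.
    void add(const std::string& id, std::uint64_t offset) { index_[id] = offset; }

    // Copies the body into `out`; returns false if the ID is unknown.
    bool get(const std::string& id, std::string& out) {
        auto hit = cache_pos_.find(id);
        if (hit != cache_pos_.end()) {
            // Cache hit: move the entry to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, hit->second);
            out = hit->second->second;
            return true;
        }
        auto where = index_.find(id);
        if (where == index_.end()) return false;

        // Cache miss: load from disk and remember the body.
        lru_.emplace_front(id, loader_(where->second));
        cache_pos_[id] = lru_.begin();
        if (lru_.size() > capacity_) {
            // Evict the least recently used entry.
            cache_pos_.erase(lru_.back().first);
            lru_.pop_back();
        }
        out = lru_.front().second;
        return true;
    }

private:
    using Entry = std::pair<std::string, std::string>;  // (id, body)
    Loader loader_;
    std::size_t capacity_;
    std::unordered_map<std::string, std::uint64_t> index_;
    std::list<Entry> lru_;
    std::unordered_map<std::string, std::list<Entry>::iterator> cache_pos_;
};
```

With a few hundred thousand objects, an index of this shape costs roughly one small map entry per object, so it should fit comfortably in memory, while the cache capacity bounds how many bodies are held at once.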

The data management system needs to be part of my application. I do not want to depend on OS services or separately installed packages. Native (C++) would be best, but a managed (C#) solution would be OK.

I believe that a database is an obvious choice, but this needs to be super-fast for looking up an object and loading it into memory. I am not experienced with database technology, and I'm concerned that general relational systems will not handle all this variable-sized data efficiently.

(Note: this has nothing to do with my job; it's a personal project.)

In your experience, what are the viable alternatives to a traditional relational DB? Or would a DB work well for this?

© Stack Overflow or respective owner
