Why use hashing to create pathnames for large collections of files?

Posted by Stephen on Stack Overflow
Published on 2008-12-03T21:56:48Z Indexed on 2010/03/29 14:53 UTC

Hi, I have noticed a number of cases where an application or database stores collections of files/blobs using a hash to determine the path and filename. I believe the intended outcome is that the path never gets too deep and the folders never get too full, since too many files (or folders) in a single folder makes for slower access.
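To make the question concrete, here is a minimal sketch of the kind of scheme I mean. It is only an illustration under my own assumptions: the function name `hashed_path`, the choice of SHA-1 over the file content, and the two-level layout with two hex characters per level are all hypothetical, not taken from any particular application.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(content: bytes, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a blob to a relative path derived from its SHA-1 hex digest.

    Hypothetical layout: the leading hex characters of the digest name
    the directories, and the full digest is the filename.
    """
    digest = hashlib.sha1(content).hexdigest()  # 40 hex characters
    # The first `levels` groups of `width` hex characters become directory names,
    # so the depth of every path is fixed at `levels`.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # Keep the whole digest as the filename so the file can be located from it alone.
    return PurePosixPath(*parts, digest)


if __name__ == "__main__":
    print(hashed_path(b"example blob"))
```

With these defaults every path is exactly two folders deep, each folder holds at most 256 subfolders, and there are 65,536 leaf folders in total. If the hash spreads files evenly, a million files would average about 15 per leaf folder.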

EDIT: Examples are often digital libraries or repositories, though the simplest example I can think of (one that can be installed in about 30 seconds) is the Zotero document/citation database.

Why do this?

EDIT: Thanks, Mat, for the answer. Does this technique of using a hash to create a file path have a name? Is it a pattern? I'd like to read more, but I have failed to find anything in the ACM Digital Library.

© Stack Overflow or respective owner
