Convert filenames to their checksum before saving to prevent duplicates. Is it a smart thing to do?

Posted by Xananax on Programmers
Published on 2011-06-25T01:10:04Z Indexed on 2011/06/25 8:30 UTC

TL;DR: what the title says


I am developing some sort of image board in PHP. I was thinking of changing each image's filename to its checksum prior to saving it. This way, I might be able to prevent duplicates.
I know this wouldn't work for two images that are the same but differ in size or level of compression or whatnot, but this method would allow for an early check.
What bugs me is that I have never seen this method implemented anywhere, so I was wondering if there is a catch to it. Maybe it is just more efficient to keep the original filename and store the hash in the database? Maybe the whole method is just not useful and my question is moot?
What do you think?
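
To make the idea concrete, here is a minimal sketch of checksum-based naming. It is written in Python rather than PHP purely for illustration (PHP's `hash_file('sha256', $path)` would play the role of `hashlib.sha256` here). The function name `save_by_checksum`, the choice of SHA-256, and the flat upload directory are all assumptions made for the example, not something the approach requires.

```python
import hashlib
from pathlib import Path


def save_by_checksum(data: bytes, upload_dir: Path) -> tuple[Path, bool]:
    """Store data under the hex SHA-256 of its bytes.

    Returns (path, was_new). Hypothetical helper for illustration only.
    """
    digest = hashlib.sha256(data).hexdigest()
    target = upload_dir / digest
    if target.exists():
        # A byte-identical upload is already stored: the early duplicate check.
        return target, False
    upload_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target, True
```

Calling it twice with the same bytes returns the same path and reports the second upload as a duplicate. As noted above, a resized or recompressed copy has different bytes, so it gets a different name and is not caught.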

On a side note, I don't really understand how hashes are calculated, so I was wondering, if my first question checks out, whether it would be possible to estimate how similar two images are by comparing their hashes (Levenshtein distance or something of the sort).
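
One way to test the side question empirically is to hash two inputs that differ by a single byte and count how many digest bits change. The sketch below does that in Python with SHA-256. The helper name `bit_difference` and the sample byte strings are made up for the example. With cryptographic checksums such as MD5 or SHA-256, a tiny change in the input is designed to flip roughly half of the output bits, so a distance between two digests should not be expected to say anything about how alike the images are.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between the SHA-256 digests of a and b."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")


original = b"pretend these are image bytes"
tweaked = b"pretend these are image bytez"  # last byte changed

print(bit_difference(original, original))  # 0: same input, same digest
print(bit_difference(original, tweaked))   # typically near half of the 256 bits
```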

