
Most people. I haven't been part of that group, for like, ever maybe?

If we're a group of devs with a not insignificant percentage of frontend/UI/UX types, then having the same image in multiple sizes, formats, etc. is going to be pretty common. Looking for multiples of the exact file is only going to reduce so much. If you know you have a library of images with a source and all of its derivatives, then running image-based sameness checks is much more beneficial: as long as you keep the source, you end up with far fewer files. Sure, this is niche territory, but yeah, and, so?

Maybe there's someone new(-ish) that hasn't really had to deal with cleaning up thousands of images to this extent. One would hope the same image in its various forms within a dev's env would be similarly named, but that's not guaranteed. If we could depend on filenames, we wouldn't need hashing, right?
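For anyone who hasn't done this before, here's a rough sketch of the "multiples of the exact file" approach: group files by a hash of their contents and ignore the names entirely. The function names (`sha256_of`, `find_exact_duplicates`) are made up for illustration, and this only catches byte-identical copies, not resized or re-encoded versions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files don't load into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root):
    """Return lists of paths under root whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            groups[sha256_of(p)].append(p)
    # Only groups with more than one path are duplicates.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Two files named `logo.png` and `final_v2_REAL.png` end up in the same group if their bytes match, which is exactly why the filename doesn't matter here.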



In some cases deduplication happens at the file system layer transparently without you even realizing it. E.g. there are tools like https://github.com/lakshmipathi/dduper

I agree that image editing workflows are a different use case more suited to perceptual hashes than cryptographic hashes.
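To make the difference concrete, here's a toy sketch of one simple perceptual hash (an "average hash") next to SHA-256. Assumptions on my part: the image is already decoded and downscaled to a small grayscale grid, represented as a plain list of lists; the 4x4 pixel data is invented for the example. Real tools would decode and resize actual image files first, and there are existing libraries for that (ImageHash for Python, for instance).

```python
import hashlib


def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image's mean, else 0."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits; a small distance means the images look alike."""
    return bin(a ^ b).count("1")


def sha256_pixels(pixels):
    """Cryptographic hash of the raw pixel bytes, for comparison."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()


# Invented 4x4 grayscale "images" (values 0-255).
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
# Same picture, slightly brighter, as if re-exported with different settings.
brighter = [[min(255, v + 20) for v in row] for row in original]
# A genuinely different picture: light and dark areas swapped.
inverted = [[255 - v for v in row] for row in original]

print(hamming(average_hash(original), average_hash(brighter)))  # 0
print(hamming(average_hash(original), average_hash(inverted)))  # 16
print(sha256_pixels(original) == sha256_pixels(brighter))       # False
```

The brightened copy has a completely different SHA-256 but an identical average hash, while the inverted image differs in every bit. In practice you'd pick a distance threshold rather than require an exact match.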



