
Git no longer uses SHA-1. It instead uses a variant called SHA-1DC that detects some known problems, and in those cases returns a different answer. More info: <https://github.com/cr-marcstevens/sha1collisiondetection>. Git switched to SHA-1DC in its version 2.13 release in 2017. It's a decent stopgap but not a great long-term solution.

There is also work to support SHA-256, though that seems to have stalled: https://lwn.net/Articles/898522/

The fundamental problem is that git developers assumed that hash algorithms would never be changed, and that was a ridiculous assumption. It's much wiser to implement crypto agility.



> The fundamental problem is that git developers assumed that hash algorithms would never be changed, and that was a ridiculous assumption. It's much wiser to implement crypto agility.

Cryptographic agility makes this problem worse, not better: instead of having a "flag day" (or release) where `git`'s digest choice reflects the State of the Art, agility ensures that every future version of `git` can be downgraded to a broken digest.
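
To make the downgrade concern concrete, here is a minimal sketch of a generic "pick the strongest common algorithm" negotiation. Every name in it (`STRENGTH`, `negotiate`, `strip_strong`) is hypothetical and it is not how `git` negotiates anything; it only shows why an agile protocol is exposed when an attacker can edit the list of offered algorithms.

```python
# Hypothetical sketch of an "agile" negotiation; not git's actual protocol.
STRENGTH = {"sha1": 0, "sha256": 1}  # higher number means stronger digest


def negotiate(client_offers, server_supports):
    """Pick the strongest algorithm that both sides claim to support."""
    common = set(client_offers) & set(server_supports)
    if not common:
        raise ValueError("no common hash algorithm")
    return max(common, key=STRENGTH.__getitem__)


def strip_strong(offers):
    """Model a man-in-the-middle that keeps only the weakest offer."""
    return [min(offers, key=STRENGTH.__getitem__)]


both = ["sha1", "sha256"]
honest = negotiate(both, both)                  # nobody tampers with the offers
attacked = negotiate(strip_strong(both), both)  # offers rewritten in transit
print(honest, attacked)
```

As long as the weak digest stays on the server's supported list, tampering with the offer is enough to select it, unless the negotiation itself is authenticated.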


That's the general anti-agility argument wielded against git, but note that git's use cases require it to process historic data.

E.g. you will want to be able to read some sha-1-only repo from disk that was last touched a decade ago. That's a different thing than some protocol which requires both parties to be on-line, say wireguard, in which instance it's easier to switch both to a new version that uses a different cryptographic algorithm.

Git has such protocols as well, and maybe it can deprecate sha-1 support there eventually, but even there it has to support both sha-1 and sha-2 for a while because not everyone is using the latest and greatest version of git, and no sysadmin wants the absolute horror of flag days.


It would be safer to forbid broken hashes after a certain date, and consider only those earlier hashes that have been counter-signed by new algorithms.


So then you can’t load an archived repo?


Assuming reasonable logic around hashes, like "a SHA-2 commit can't be a parent of a SHA-1 commit", there wouldn't be much in the way of downgrade attacks available.
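
A rule like that could be checked mechanically. The sketch below assumes a toy model in which each commit is tagged with exactly one hash algorithm; the `Commit` class and `violations` function are hypothetical and do not reflect git's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    name: str
    algo: str  # "sha1" or "sha256" (toy tag, assumed for illustration)
    parents: list = field(default_factory=list)


def violations(commits):
    """Return (child, parent) name pairs where a SHA-1 commit has a SHA-2 parent."""
    return [
        (c.name, p.name)
        for c in commits
        for p in c.parents
        if c.algo == "sha1" and p.algo == "sha256"
    ]


a = Commit("a", "sha1")
b = Commit("b", "sha256", [a])  # allowed: new commit on top of old history
c = Commit("c", "sha1", [b])    # forbidden: old-style commit on top of new history
print(violations([a, b, c]))
```

Under this rule old history stays readable, but once a branch has moved to SHA-2, nothing hashed with SHA-1 can be appended to it.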


Wow, smart! This would keep all the old history intact and at the same time force lots of people to upgrade through social pressure. I'd probably be angry as hell when that happened to me, but it would also work.


FTR the current plan for git's migration is that commits have both SHA-1 and SHA-2 addresses, and you can reference them by both. There is thus no concept of "SHA-2 commit", or "SHA-1 commit". The issue is more around pointers that are not directly managed by git, e.g. hashes inside commit messages to reference an earlier commit (and of course signatures). Those might require a git filter-repo-like step that breaks the SHA-1 hashes (and signatures) to migrate to SHA-2, if that is desired.
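
As a rough illustration of one object having two addresses, the sketch below hashes the same blob with both algorithms, using git's object framing of `<type> <size>\0<content>`. The helper name `git_object_id` is made up for this example. It only covers a standalone blob; trees and commits embed other object IDs, so their two names cannot be derived by rehashing the same bytes.

```python
import hashlib


def git_object_id(kind: str, data: bytes, algo: str = "sha1") -> str:
    """Hash an object the way git frames it: '<type> <size>\\0' followed by the content."""
    header = f"{kind} {len(data)}\0".encode()
    return hashlib.new(algo, header + data).hexdigest()


blob = b"hello\n"
sha1_id = git_object_id("blob", blob, "sha1")
sha256_id = git_object_id("blob", blob, "sha256")
print(sha1_id)    # 40 hex characters
print(sha256_id)  # 64 hex characters
```

The SHA-1 value should match what `echo hello | git hash-object --stdin` prints in a regular repository.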


SHA-1 was already known to be broken at the time Git chose it, but they chose it anyway. Choosing a non-broken algorithm like SHA-2 was an easy choice they could have made that would still hold up today. Implementing a crypto agility system is not without major trade-offs (consider how common downgrade attacks have been across protocols!).


> Choosing a non-broken algorithm like SHA-2 was an easy choice they could have made that would still hold up today.

Yet Git's requirement for its hashing algorithm is not broken: that requirement is not cryptographic but merely stochastic, and Linus knows this.

Why bother to produce a collision, when you have the power to get your changes pulled into a release branch? Your attack might be noticed, and your cover blown.

Instead, simply try to get a bug merged that results in a zero day. In case somebody discovers it, at least you have plausible deniability that it happened by accident.


>SHA-1 was already known to be broken at the time Git chose it

Please pardon my ignorance, but could you elaborate on what time (e.g. the year) you are referring to?


Since about 2005, collision attacks against SHA-1 have been known. In 2005 Linus dismissed these concerns as impractical, writing:

    > The basic attack goes like this:
    >
    > - I construct two .c files with identical hashes.
    
    Ok, I have a better plan.
    
    - you learn to fly by flapping your arms fast enough
    - you then learn to pee burning gasoline
    - then, you fly around New York, setting everybody you see on fire, until
    people make you emperor.
    
    Sounds like a good plan, no?
    
    But perhaps slightly impractical.
    
    Now, let's go back to your plan. Why do you think your plan is any better
    than mine?
https://git.vger.kernel.narkive.com/9lgv36un/zooko-zooko-com...


This is a really good example of Torvalds' toxic attitude and absolutely horrific attitude towards security. This is a recurring pattern, unfortunately.

Git not being prepared for this is going to cost a lot of time and money for a very large number of people, and it could have been trivially mitigated if security had been taken seriously in the first place, and if Torvalds were mature enough to understand that he is not an expert on cryptography topics.


I didn't know either. From Wikipedia [1], SHA-1 has been considered insecure to some degree since 2005. Following the citations, apparently it's been known since at least August 2004 [2] but maybe not demonstrated in SHA-1 until early 2005.

git's first release was in 2005, so I guess technically SHA-1 issues could've been known or suspected during development time.

More generously, it could've been somewhat simultaneous. It sounds like it was considered a state-sponsored level attack at the time, if collisions were even going to be possible. Don't know if the git devs knew this and intentionally chose it anyway, or just didn't know.

[1] https://en.wikipedia.org/wiki/SHA-1

[2] https://www.schneier.com/blog/archives/2005/02/cryptanalysis...

EDIT: sibling comment has evidence that Linus did in fact know about it and considered it an impractical vector at the time

https://git.vger.kernel.narkive.com/9lgv36un/zooko-zooko-com...


Why isn't Git using something else? Why go to the trouble of implementing something like that?

I don't mean that as some ridiculing criticism, I just am genuinely puzzled.


> Why isn't Git using something else?

Because switching to a different hash algorithm would break compatibility with all existing Git clients and repositories.


Changing out the hashing algorithm in Git is a very difficult thing to do.


not within the git project so much as all the other code that depends on it


It’s not called SHA-1DC. It’s called “some blob of C whose behaviour is never described anywhere”.


If there's a readily available blob of C code that does the operation, then by definition it must be described somewhere. Maybe you should get ChatGPT to describe what it does.



