Agreed. Blake3 is super promising as a cryptographic hash function family due to its software performance (not sure if anyone has tried making hardware designs for it), but SHA2 hardware acceleration is extremely common, which makes it good enough. And while SHA3 software performance is poor, it's quite good in hardware, so as more chips accelerate it, it'll become generally viable. So while Blake3 looks nice, there's not a whole ton of need for it right now, and I doubt it'll ever become SHA4.
Sha3 is performant, but I'll always give it the stink eye because of NIST selecting the winner, then modifying their solution before standardization without sufficient explanation. Read the room, this is cryptography; one does not simply add mystery padding and rush off to the printing press.
How much did your opinion change after reading the 10/5 update to Schneier’s post?
From your first link:
I misspoke when I wrote that NIST made “internal changes” to the algorithm. That was sloppy of me. The Keccak permutation remains unchanged. What NIST proposed was reducing the hash function’s capacity in the name of performance. One of Keccak’s nice features is that it’s highly tunable.
I do not believe that the NIST changes were suggested by the NSA. Nor do I believe that the changes make the algorithm easier to break by the NSA. I believe NIST made the changes in good faith, and the result is a better security/performance trade-off.
Can't speak for aliqot, but I am now somewhat more confident that the NIST changes were suggested by the NSA, and slightly more confident (or at least less unconfident) that SHA-3 is insecure.
I still think it's probably fine, but I feel better about insisting on SHA-2+mitigations or blake3 instead now, even if the main problem with SHA-3 is its being deliberately designed to encourage specialized hardware acceleration (cf AES and things like Intel's aes-ni).
(To be clear, the fact that Schneier claims to "believe NIST made the changes in good faith" is weak but nonzero evidence that they did not. I don't see any concrete evidence for a backdoor, although you obviously shouldn't trust me either.)
Does Schneier have some association to the NSA I don't know about? I'd normally consider that statement as weak evidence they did make the changes in good faith.
Making late changes (in this case, after the competition was already over) to a cryptographic primitive - without extensive documentation both of why that's necessary (not just helpful) and why it's not possible (not just promised) for that to weaken security or insert backdoors - is an act of either bad faith or sufficiently gross incompetence that it should be considered de facto bad faith.
Schneier claiming to believe that it's good faith implies that he either doesn't understand or (presumably more likely given his history?) doesn't care about keeping the standardization process secure against corruption by the NSA, which suggests either incompetence or bad faith on Schneier's part as well. (Or, in context, that someone was leaning on him after the earlier criticism.)
This is particularly inexcusable since reducing security parameters on the pretense that "56 bits ought to be enough for anybody" is a known NSA tactic dating back to fucking DES.
What does that have to do with anything? AFAIK Schneier's public actions are consistently pro-privacy and pro-encryption. Implying otherwise should be accompanied by evidence, don't you think?
It is fair to criticize NIST for enabling rumors about them weakening SHA3, but these are rumors only, nothing more. Please, everyone, stop spreading them. SHA3 is the same Keccak that has gone through extensive scrutiny during the SHA3 competition, and, as far as anyone can tell, it is rock solid. Don't trust NIST, don't trust me, trust the designers of Keccak, who tell you that NIST did not break it:
> NIST's current proposal for SHA-3, namely the one presented by John Kelsey at CHES 2013 in August, is a subset of the Keccak family. More concretely, one can generate the test vectors for that proposal using the Keccak reference code (version 3.0 and later, January 2011). This alone shows that the proposal cannot contain internal changes to the algorithm.
I've implemented SHA-3 and Keccak in hardware (FPGA) and software (CPU, GPU) countless times: there's zero scenario where this single byte change occurring before 24 rounds of massive permutation has any measurable effect on the security of this hash function.
The new padding appends either 01 (for the fixed-output SHA3-* functions), 11 (RawSHAKE), or 1111 (the SHAKE XOFs) to the message before the usual pad10*1 padding, depending on the variant of SHA-3. That way the different variants don't sometimes give the same [partial] hash.
It was weird to toss that in at the last second but there's no room for a backdoor there.
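If you want to see the domain separation in action, Python's hashlib exposes it directly. SHA3-256 and SHAKE256 happen to share the same rate and capacity (r=1088, c=512), so for a 32-byte output the appended suffix bits are the only difference between them, and that alone changes every digest:

```python
import hashlib

# SHA3-256 appends the suffix bits 01 before padding; SHAKE256 appends 1111.
# In byte terms the first padding byte becomes 0x06 vs 0x1f. Both then use
# the same pad10*1 rule and the same Keccak-f[1600] permutation.
msg = b"domain separation demo"
fixed = hashlib.sha3_256(msg).hexdigest()
xof = hashlib.shake_256(msg).hexdigest(32)  # 32 bytes = 256 bits of output

print(fixed != xof)  # True: the suffix bits alone force distinct digests
```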
As far as Blake3 in hardware for anything other than very low-power smart cards or similar:
Blake3 was designed from the ground up to be highly optimized for vector instructions operating on four vectors, each of 4 32-bit words. If you already have the usual 4x32 vector operations, plus a vector permute (to transform the operations across the diagonal of your 4x4 matrix into operations down columns) and the usual bypass network to reduce latency, I think it would rarely be worth the transistor budget to create dedicated Blake3 (or Blake2s/b) instructions.
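For the curious, the 4x4 column/diagonal structure is easy to see in a Python sketch of a single round of BLAKE3's permutation (the real compression function runs 7 rounds and permutes the message words between rounds; this shows just the round shape):

```python
MASK = 0xFFFFFFFF

def rotr32(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def g(v, a, b, c, d, mx, my):
    # BLAKE3's quarter-round: mix one column (or diagonal) of the 4x4 state,
    # folding in two message words mx, my (32-bit words, rotations 16/12/8/7)
    v[a] = (v[a] + v[b] + mx) & MASK
    v[d] = rotr32(v[d] ^ v[a], 16)
    v[c] = (v[c] + v[d]) & MASK
    v[b] = rotr32(v[b] ^ v[c], 12)
    v[a] = (v[a] + v[b] + my) & MASK
    v[d] = rotr32(v[d] ^ v[a], 8)
    v[c] = (v[c] + v[d]) & MASK
    v[b] = rotr32(v[b] ^ v[c], 7)

def round_fn(v, m):
    # first half: mix the four columns of the 4x4 state (row-major indices)
    g(v, 0, 4, 8, 12, m[0], m[1])
    g(v, 1, 5, 9, 13, m[2], m[3])
    g(v, 2, 6, 10, 14, m[4], m[5])
    g(v, 3, 7, 11, 15, m[6], m[7])
    # second half: mix the four diagonals; on SIMD hardware this is where the
    # vector permute ("diagonalization") rotates rows 1-3 so diagonals line up
    g(v, 0, 5, 10, 15, m[8], m[9])
    g(v, 1, 6, 11, 12, m[10], m[11])
    g(v, 2, 7, 8, 13, m[12], m[13])
    g(v, 3, 4, 9, 14, m[14], m[15])
```

With the usual 4x32 vector operations, each `g` call over four columns becomes a handful of vector instructions, which is why dedicated instructions buy so little.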
In contrast, SHA-3's state is conceptually five vectors each of five 64-bit lanes, which doesn't map as neatly onto most vector ISAs. As I remember, it has column and row operations rather than the column and diagonal operations that parallelize better on vector hardware.
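For example, Keccak's theta step is a column-parity operation over the whole 5x5 lane array; a rough numpy sketch (indexing the state as A[x, y]):

```python
import numpy as np

def rotl64(x, n):
    n = np.uint64(n)
    return (x << n) | (x >> (np.uint64(64) - n))

def theta(A):
    # A: 5x5 array of np.uint64 lanes, indexed A[x, y]
    C = np.bitwise_xor.reduce(A, axis=1)           # parity of each column x
    D = np.roll(C, 1) ^ rotl64(np.roll(C, -1), 1)  # D[x] = C[x-1] ^ rotl(C[x+1], 1)
    return A ^ D[:, None]                          # every lane in column x absorbs D[x]
```

Note the awkward fit: five lanes per column, so a 4-lane (or even 8-lane) vector unit always has a straggler.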
SHA-2 is a Merkle–Damgård construction where the round function is a Davies–Meyer construction where the internal block cipher is a highly unbalanced Feistel cipher. Conceptually, you have a queue of 8 words (32-bit or 64-bit, depending on the variant). Each operation pops the first word from the queue, combines it in a nonlinear way with 6 of the other words, adds one word from the "key schedule" derived from the message, and pushes the result onto the back of the queue. The one word that wasn't otherwise used is increased by the sum of the round key and a non-linear function of 3 of the other words. As you might imagine, this doesn't map very well onto general-purpose vector instructions.

This cipher is wrapped in a step (the Davies–Meyer construction) where you save a copy of the state, encrypt the state using the next block of the message as the key, and then add the saved copy to the encrypted result, making it non-invertible and making meet-in-the-middle attacks much more difficult.

The key schedule uses a variation on a lagged Fibonacci generator to expand each message block into a larger number of round keys.
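A structural Python sketch of one SHA-256 round plus the Davies-Meyer wrapper makes the queue shape concrete. This is not real SHA-256: the 64 published round constants and the real message schedule are omitted, and `k_plus_w` stands in for K[t] + W[t]:

```python
def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

def round256(state, k_plus_w):
    # state: the 8-word queue [a, b, c, d, e, f, g, h] (32-bit words).
    # k_plus_w: this round's key-schedule word (round constant + message word).
    a, b, c, d, e, f, g, h = state
    ch  = (e & f) ^ (~e & g)             # nonlinear "choose"
    maj = (a & b) ^ (a & c) ^ (b & c)    # nonlinear "majority"
    s1  = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
    s0  = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
    t1  = (h + s1 + ch + k_plus_w) & 0xFFFFFFFF
    t2  = (s0 + maj) & 0xFFFFFFFF
    # the queue step: h is consumed, everything else shifts, t1 + t2 enters,
    # and the otherwise-unused word d absorbs t1
    return [(t1 + t2) & 0xFFFFFFFF, a, b, c, (d + t1) & 0xFFFFFFFF, e, f, g]

def compress(state, schedule):
    # Davies-Meyer: save the chaining value, "encrypt" it under the expanded
    # message block, then add the saved copy back (non-invertible feedforward)
    v = list(state)
    for kw in schedule:
        v = round256(v, kw)
    return [(s + x) & 0xFFFFFFFF for s, x in zip(state, v)]
```

Note how each round touches nearly every word with different operations, which is exactly why this shift-register shape vectorizes so poorly.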
> Blake3 was designed from the ground up to be highly optimized for vector instructions operating on four vectors, each of 4 32-bit words.
This is true, and the BLAKE family inherits this structure from ChaCha, but there's also more to it than that. If you have enough input to fill many blocks, you can run multiple blocks in parallel. In this situation, rather than dividing up the 16 words of a block into four vectors, you put each word in a different vector, and the words of each vector represent the same position in different blocks. (I.e., rather than representing columns or rows, the vectors point "out of the page".) There are several benefits to this arrangement:
1. You don't need to do that diagonalization operation anymore.
2. If your CPU supports "instruction-level parallelism" for vector operations, working across the four words/vectors in a row gets to take advantage of that.
3. Best of all, you're no longer limited to 4-word vectors. If you have enough input to fill 8 blocks (AVX2) or 16 blocks (AVX-512), you can use those much wider vector registers.
This is all easy to take advantage of in a stream cipher like ChaCha, because each block is independent. With a hash function, things are more complicated, because you usually have data dependencies between different blocks. That's why the tree structure of BLAKE3 (or somewhat similarly, KangarooTwelve) is so important for performance. It's not just about multithreading; it's also about SIMD. See section 5.3 of the BLAKE3 paper for more on this.
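A rough numpy sketch of that transposed layout, assuming BLAKE3's quarter-round (rotations 16/12/8/7) and round indexing: each of the 16 state words becomes a length-N vector across N independent blocks, and the diagonal half-round becomes pure index bookkeeping, with no shuffle instructions needed:

```python
import numpy as np

MASK = np.uint64(0xFFFFFFFF)

def rotr32(x, n):
    n = np.uint64(n)
    return ((x >> n) | (x << (np.uint64(32) - n))) & MASK

def g(v, a, b, c, d, mx, my):
    # Identical quarter-round logic, but every operand is now a length-N
    # vector holding that word position across N independent blocks
    # (the vectors point "out of the page").
    v[a] = (v[a] + v[b] + mx) & MASK
    v[d] = rotr32(v[d] ^ v[a], 16)
    v[c] = (v[c] + v[d]) & MASK
    v[b] = rotr32(v[b] ^ v[c], 12)
    v[a] = (v[a] + v[b] + my) & MASK
    v[d] = rotr32(v[d] ^ v[a], 8)
    v[c] = (v[c] + v[d]) & MASK
    v[b] = rotr32(v[b] ^ v[c], 7)

# 16 state words x N blocks; toy data, uint64 lanes masked to 32 bits
N = 8  # 8 blocks at once maps onto AVX2's 8x32-bit lanes
v = [np.arange(N, dtype=np.uint64) + np.uint64(i) for i in range(16)]
m = [np.zeros(N, dtype=np.uint64) for _ in range(16)]

# one full round: columns, then diagonals - no diagonalization permute,
# just different indices into the same 16 vectors
for (a, b, c, d), (x, y) in zip(
        [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
         (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)],
        [(0, 1), (2, 3), (4, 5), (6, 7),
         (8, 9), (10, 11), (12, 13), (14, 15)]):
    g(v, a, b, c, d, m[x], m[y])
```

The catch, as above, is that you need N data-independent blocks in flight, which is what BLAKE3's tree mode guarantees for long inputs.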
I don't think so? BLAKE is an evolution of an earlier proposal for a family of hash functions called LAKE, but the paper for that does not explain the name at least.
Oh yeah Blake3 is just the best if it's accessible to you. I get that other algorithms are more ubiquitous. But if I need a crypto hash, I consider Blake first.