Discussion
You’re still signing data structures the wrong way
formerly_proven: This article claims that these are somewhat open questions, but they're not, and haven't been for a long time.

#1: You sign a blob and you don't touch it before verifying the signature (aka "The Cryptographic Doom Principle").

#2: Signatures are bound to a context which is _not_ transmitted but is used for deriving the key, or mixed into the MAC, or what have you. This is called the Horton Principle. It ensures that signer and verifier must cryptographically agree on which context the message is intended for. You essentially cannot implement this incorrectly, because if you do, all signatures will fail to verify.

The article actually proposes to violate principle #2 (by embedding some magic numbers into the protocol headers and presuming that someone will check them), which is an incorrect design and will result in bad things if history is any indication.

Principles #1 and #2 have been well-established cryptographic design principles for a handful of decades each.
ahtihn: Maybe I'm misunderstanding the article but I'm fairly sure the magic number is not transmitted.It's used exactly as you say: a shared context used as input for the signature that is not transmitted.
lokar: What if (and this is perhaps too big an if) you only ever serialize and deserialize with code generated from the IDL, which always checks the magic numbers (returning a typed object)?
lokar: No, I'm pretty sure they are saying you need to transmit it
jcalvinowens: I think not:> Note that the domain separator does not appear in the eventual serialization (which would waste bytes), since both signer and receiver agree on it via this shared protocol specification.But saying it's about wasting bytes is a little confusing, as you observe that isn't really the point.
nightpool: No, they propose just concatenating it with the data received from the network:> it makes a concatenation of the domain separator (@0x92880d38b74de9fb) and the serialization of the object, and then feeds the byte stream into the signing primitive. Similarly, verification of an object verifies this same reconstructed concatenation against the supplied signature.> Note that the domain separator does not appear in the eventual serialization (which would waste bytes), since both signer and receiver agree on it via this shared protocol specification. Encrypt, HMAC, and hash work the same way.
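(Editor's note: a minimal sketch of the scheme the quoted passage describes, using an HMAC as a stand-in for the signing primitive. The separator value is taken from the article's example; the key and messages are placeholders. The point is that the separator is prepended only inside the MAC computation and never appears on the wire.)

```python
import hashlib
import hmac

# Domain separator from the article's example (@0x92880d38b74de9fb).
# Both sides know it from the shared protocol spec; it is never transmitted.
DOMAIN_SEPARATOR = bytes.fromhex("92880d38b74de9fb")

KEY = b"placeholder-shared-key"  # illustrative only

def sign(serialized: bytes) -> bytes:
    # Prepend the separator for the tag computation only;
    # the wire format carries just `serialized` and the tag.
    return hmac.new(KEY, DOMAIN_SEPARATOR + serialized, hashlib.sha256).digest()

def verify(serialized: bytes, tag: bytes) -> bool:
    # The verifier reconstructs the same concatenation from its own
    # copy of the separator, so a message signed under a different
    # separator (i.e. a different message type) fails to verify.
    expected = hmac.new(KEY, DOMAIN_SEPARATOR + serialized, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

msg = b"\x00\x01example-payload"
tag = sign(msg)
assert verify(msg, tag)          # same context: verifies
assert not verify(msg, b"\x00" * 32)  # wrong tag: rejected
```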
Muromec: So another lesson has been relearned from ASN.1. I'm proud of working in this industry again! Next we'll figure out that we should always put versions into the data, too.
jbmsf: That was my first thought as well.
logicallee: Along the same lines, did you know that you can receive an authenticated email that the listed sender never sent to you? If a third party can get a server to send it to themselves (for example, Google Forms will send them an email with the contents they want), they can then forward it to you while spoofing the From: field as Google.com in this example, and it will appear in your inbox from the "sender" (Google.com) and appear fully authenticated, even though Google never actually sent it to you. This is another example where you would think that "who it's for" is something the sender would sign, but nope!
Retr0id: Putting domain separators in the IDL is interesting, but you can also avoid the problem by putting the domain separators in-band (e.g. in some kind of "type" field that is always present).Depending on what your input and data model look like, canonicalisation takes O(n log n) time (i.e. the cost of sorting your fields).Here I describe an alternative approach that produces deterministic hashes without a distinct canonicalization step, using multiset hashing: https://www.da.vidbuchanan.co.uk/blog/signing-json.html
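(Editor's note: a minimal sketch of the canonicalisation-by-sorting step mentioned above, assuming a JSON-like data model. Recursively sorting object keys and fixing the encoding makes the byte stream deterministic; the sort is what gives the O(n log n) cost. This is an illustration, not the linked article's multiset-hashing approach.)

```python
import json

def canonicalize(obj) -> bytes:
    # sort_keys recursively orders object fields; fixed separators
    # remove whitespace variation, so equal values always serialize
    # to identical bytes, which can then be hashed or signed.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")

a = {"b": 1, "a": [2, {"d": 4, "c": 3}]}
b = {"a": [2, {"c": 3, "d": 4}], "b": 1}
assert canonicalize(a) == canonicalize(b)  # field order no longer matters
```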
majormajor: I think a lot of people assume that the "name" of the type, for protos, will be preserved somewhere in the output, such that a TreeRoot couldn't be reused as a KeyRevoke. It makes sense that it isn't (you generally don't want to send that name every time), but it's non-obvious to people with an object-oriented-language background who just think "ah, different types are obviously different types." The serialization cost objection is also what I've often seen raised against in-band type fields, so having a unique identifier that gets used just for signature computation is clever.What's possibly over my head, from skimming it, about your multiset hashing is how it avoids the "these payloads have the same shape, so one could be re-sent as the other" issue? It seems like a solution to a different problem?
Retr0id: Multiset hashing is not related to the domain separation problem, but it is related to the broader "signing data structures" problem.(I realise my comment reads a bit unclearly, it's basically two separate comments, split after the first paragraph)
sillywabbit: They've reinvented protobuf headers.