Mapping a Formal Definition of Collision Resistance to Existing Implementations of Hash Functions

This is part 3 of a series on omission-tolerant integrity protection and related topics. See also part 1 and part 2. A technical report on the topic is available on this site and in the IACR ePrint Archive.

When a cryptographic hash is used as a checksum of a bit string, security depends on the collision resistance of the hash function. When the root label of a typed hash tree is used as an omission-tolerant checksum of a set of key-value pairs, security also depends on the collision resistance of a hash function, in this case of the hash function that is used to construct the typed hash tree. (It may also depend on the preimage resistance of the hash function depending on how the set is encoded, as we shall see in another post of this series.) The formal proof of security in the technical report is therefore based on a formal definition of collision resistance. But, surprising as this may seem, there is no standard, widely used, formal definition of collision resistance for the keyless, or unkeyed hash functions such as SHA256 that are used in practice.

Collision resistance is hard to define

The concept of a hash function collision is simple and clear enough: it is a pair of distinct elements of the domain of the hash function that have the same image. The concept of collision resistance, on the other hand, is difficult to define. The reason for this, as explained for example by Katz and Lindell in their Introduction to Modern Cryptography (Chapman & Hall/CRC, 2nd edition, 2014, bottom of page 155), is that collisions exist, and for any collision there exists a trivial algorithm that hardcodes the collision and outputs it in constant time.

Continue reading “Mapping a Formal Definition of Collision Resistance to Existing Implementations of Hash Functions”

Using an Omission-Tolerant Checksum for Selective Disclosure of Attributes Asserted by a Public Key Certificate

This is part 2 of a series on omission-tolerant integrity protection and related topics. A technical report on the topic is available on this site and in the IACR ePrint Archive.

The first post of this series introduced a new technical report that describes the concept of an omission-tolerant checksum in a broader context than the rich credentials paper and includes a formal proof of security in an asymptotic security setting. In this post I give an example showing how an omission-tolerant checksum implemented using a typed hash tree can provide selective disclosure of attributes asserted by a public key certificate.

Typed hash trees vs. Merkle trees

A typed hash tree is a hash tree where every node has a type in addition to a label, and the label of an internal node is a cryptographic hash of an encoding of the types and labels of its children. A distinguished type is assigned to all the internal nodes, and may also be assigned to some of the leaf nodes.

Hash trees were first proposed by Merkle. In a Merkle tree, each internal node is labeled by a hash of the concatenation of the labels of its children, and each leaf node is labeled by a hash of a data block. In both a typed hash tree and a Merkle tree, a subtree can be pruned without modifying the root label of the tree. (By “pruning a subtree” we mean removing the descendants of an internal node.) But in the case of a Merkle tree this is a “bug” to be mitigated if the tree is to provide integrity protection, while in the case of a typed hash tree it is a “feature” that allows the root label of the tree to be used as an omission-tolerant checksum.

Continue reading “Using an Omission-Tolerant Checksum for Selective Disclosure of Attributes Asserted by a Public Key Certificate”

An Omission-Tolerant Cryptographic Checksum

This is part 1 of a series on omission-tolerant integrity protection and related topics. A technical report on the topic is available on this site and in the IACR ePrint Archive.

Broadly speaking, an omission-tolerant cryptographic checksum is a checksum on data that does not change when items are removed from the data but makes it infeasible for an adversary to modify the data in other ways without invalidating the checksum.

We discovered the concept of omission-tolerant integrity protection while working on rich credentials. A rich credential includes subject attributes and verification data stored in a typed hash tree. We noted in an interim report that the root label of the tree could be viewed as an “omission-tolerant cryptographic checksum”. Prof. Phil Windley, who read the report, told us that he had not seen the concept before, and asked if we had invented it. We then added a section on typed hash trees and omission-tolerant integrity protection to the final report.

We’ve now written a new technical report that discusses omission-tolerant checksums and omission-tolerant integrity protection in a broader context than rich credentials. The main contributions of the new paper are a formal definition of omission-tolerant integrity protection, a method of computing an omission-tolerant checksum on a bit-string encoding of a set of key-value pairs, and a formal proof of security in an asymptotic security setting that uses the system parameterization concept introduced by Boneh and Shoup in their online book.

I have not said much in this blog about omission-tolerant integrity protection, and there is a lot to say: how an omission-tolerant checksum can be used to implement selective disclosure of subject attributes in public key certificates; how public key certificates with selective disclosure could easily provide security and privacy for client authentication in TLS; what’s special about Boneh and Shoup’s system parameterization concept and how we use it in our definitions and proofs; how can a typed hash tree provide omission-tolerant integrity protection whereas a Merkle tree cannot; and a number of narrower but no less interesting topics. This is the first of a series of posts on these topics.