Omission-Tolerant Checksum

Identity Verification: A Coronavirus Challenge to the Financial World

Updated April 1st, 2020

This blog post has been coauthored with Karen Lewison

The coronavirus pandemic is causing unprecedented disruption throughout the business world. Businesses that are not able to cope with public health orders and new customer behaviors are going out of business, while businesses that are able to adapt are thriving and expanding their market share. Disruption will be temporary in sectors of the economy where face-to-face interaction adds value to the business-to-customer relationship and a physical presence on the street is an essential requirement of the business model; gyms, bars and conference centers will no doubt reopen once the pandemic has been controlled. But changes brought by the pandemic will be permanent in sectors of the economy where face-to-face interaction adds no value and a physical presence is a legacy of a traditional business model. One of those sectors is the financial world.

A challenge to financial institutions

Financial institutions have been less impacted than other businesses by the pandemic. In the US, the entire financial sector has been declared critical infrastructure by DHS and is thus protected against closure orders by states or counties. And most financial transactions are now conducted online using web browsers or mobile apps, without face-to-face interactions that would put employees and customers at risk of contagion. Nevertheless, coronavirus poses a challenge to financial institutions: how to verify the identity of new customers.

Pomcor Granted Patent on Rich Credentials

Pomcor has been granted US Patent 10,567,377, Multifactor Privacy-Enhanced Remote Identification Using a Rich Credential. Karen Lewison is the lead inventor and I am a coinventor. Pomcor has so far been granted a total of eight patents, two of which we have sold. The remaining six patents that we own are listed in the Patents page of this web site.

This latest patent is special because it provides a solution to a major societal problem: how to identify people over the Internet with strong security. Techniques are available for authenticating repeat visitors to a web site or current users of a web application. But authentication techniques are only applicable once a relationship has been established. They are not applicable when somebody wants to establish a new relationship, e.g. by becoming a new customer of a bank, or signing up with a robo advisor, or applying for a mortgage, or renting an apartment, or switching to a different car insurance.

A Formal Proof of Omission-Tolerant Integrity Protection

This is the fourth and last part of a series on omission-tolerant integrity protection and related topics. See also part 1, part 2, and part 3. A technical report on the topic is available on this site and in the IACR ePrint Archive.

In part 3 we saw how the system parameterization concept introduced by Boneh and Shoup in their Graduate Course in Applied Cryptography makes it possible to provide a formal definition of collision resistance for keyless hash functions. In this last post I will explain how that definition, and the security and system parameters used in the definition, are used in the proof of Theorem 3 of the technical report, which states that the root label of a typed hash tree can serve as an omission-tolerant checksum of an encoding of a set of key-value pairs.

Mapping a Formal Definition of Collision Resistance to Existing Implementations of Hash Functions

This is part 3 of a series on omission-tolerant integrity protection and related topics. See also part 1 and part 2. A technical report on the topic is available on this site and in the IACR ePrint Archive.

When a cryptographic hash is used as a checksum of a bit string, security depends on the collision resistance of the hash function. When the root label of a typed hash tree is used as an omission-tolerant checksum of a set of key-value pairs, security also depends on the collision resistance of a hash function, in this case of the hash function that is used to construct the typed hash tree. (It may also depend on the preimage resistance of the hash function depending on how the set is encoded, as we shall see in another post of this series.) The formal proof of security in the technical report is therefore based on a formal definition of collision resistance. But, surprising as this may seem, there is no standard, widely used, formal definition of collision resistance for the keyless, or unkeyed hash functions such as SHA256 that are used in practice.

Collision resistance is hard to define

The concept of a hash function collision is simple and clear enough: it is a pair of distinct elements of the domain of the hash function that have the same image. The concept of collision resistance, on the other hand, is difficult to define. The reason for this, as explained for example by Katz and Lindell in their Introduction to Modern Cryptography (Chapman & Hall/CRC, 2nd edition, 2014, bottom of page 155), is that collisions exist, and for any collision there exists a trivial algorithm that hardcodes the collision and outputs it in constant time.

Using an Omission-Tolerant Checksum for Selective Disclosure of Attributes Asserted by a Public Key Certificate

This is part 2 of a series on omission-tolerant integrity protection and related topics. A technical report on the topic is available on this site and in the IACR ePrint Archive.

The first post of this series introduced a new technical report that describes the concept of an omission-tolerant checksum in a broader context than the rich credentials paper and includes a formal proof of security in an asymptotic security setting. In this post I give an example showing how an omission-tolerant checksum implemented using a typed hash tree can provide selective disclosure of attributes asserted by a public key certificate.

Typed hash trees vs. Merkle trees

A typed hash tree is a hash tree where every node has a type in addition to a label, and the label of an internal node is a cryptographic hash of an encoding of the types and labels of its children. A distinguished type is assigned to all the internal nodes, and may also be assigned to some of the leaf nodes.

Hash trees were first proposed by Merkle. In a Merkle tree, each internal node is labeled by a hash of the concatenation of the labels of its children, and each leaf node is labeled by a hash of a data block. In both a typed hash tree and a Merkle tree, a subtree can be pruned without modifying the root label of the tree. (By “pruning a subtree” we mean removing the descendants of an internal node.) But in the case of a Merkle tree this is a “bug” to be mitigated if the tree is to provide integrity protection, while in the case of a typed hash tree it is a “feature” that allows the root label of the tree to be used as an omission-tolerant checksum.

An Omission-Tolerant Cryptographic Checksum

This is part 1 of a series on omission-tolerant integrity protection and related topics. A technical report on the topic is available on this site and in the IACR ePrint Archive.

Broadly speaking, an omission-tolerant cryptographic checksum is a checksum on data that does not change when items are removed from the data but makes it infeasible for an adversary to modify the data in other ways without invalidating the checksum.

We discovered the concept of omission-tolerant integrity protection while working on rich credentials. A rich credential includes subject attributes and verification data stored in a typed hash tree. We noted in an interim report that the root label of the tree could be viewed as an “omission-tolerant cryptographic checksum”. Prof. Phil Windley, who read the report, told us that he had not seen the concept before, and asked if we had invented it. We then added a section on typed hash trees and omission-tolerant integrity protection to the final report.

We've now written a new technical report that discusses omission-tolerant checksums and omission-tolerant integrity protection in a broader context than rich credentials. The main contributions of the new paper are a formal definition of omission-tolerant integrity protection, a method of computing an omission-tolerant checksum on a bit-string encoding of a set of key-value pairs, and a formal proof of security in an asymptotic security setting that uses the system parameterization concept introduced by Boneh and Shoup in their online book.

I have not said much in this blog about omission-tolerant integrity protection, and there is a lot to say: how an omission-tolerant checksum can be used to implement selective disclosure of subject attributes in public key certificates; how public key certificates with selective disclosure could easily provide security and privacy for client authentication in TLS; what's special about Boneh and Shoup's system parameterization concept and how we use it in our definitions and proofs; how can a typed hash tree provide omission-tolerant integrity protection whereas a Merkle tree cannot; and a number of narrower but no less interesting topics. This is the first of a series of posts on these topics.