Revocable Biometrics Discussion at the Internet Identity Workshop

One thing I like about the Internet Identity Workshop (IIW) is its unconference format, which allows for impromptu sessions. A discussion during one session can raise an issue that deserves its own session, and an impromptu session can be called the same day or the following day to discuss it. A good example of this happened at the last IIW (IIW XXII), which was held on April 26-28, 2016 at the Computer History Museum in Mountain View, California.

During the second day of the workshop, a participant in a session drew attention to one of the dangers of using biometrics for authentication, viz. the fact that biometrics are not revocable. This is true in the sense that you cannot change at will the biometric features of the human body, and it is a strong reason for using biometrics sparingly; but I pointed out that there is something called “revocable biometrics”. Participants in the session were surprised and asked me to call a session to explain the concept. Karen Lewison and I made a few slides and we called a session the next day. In this post I will go over the slides and summarize the interesting discussions that took place during the session.

In traditional biometrics, a raw biometric sample such as a bitmap image of a fingerprint, an iris or a face, or a voice recording, is collected from the user at enrollment time. This is the enrollment sample. Then a biometric code describing a set of features found in the enrollment sample is extracted from the raw sample. This is the enrollment code. Then the biometric code itself, or a data structure derived from the code and suitable for matching, is stored as a biometric template for later use in authentication. At authentication time, an authentication sample is collected from the user, an authentication code is extracted from the authentication sample, and the authentication code is matched against the biometric template.

Traditional biometrics are dangerous for user privacy. From a biometric template it is possible to derive a raw biometric sample such that the biometric code extracted from the sample will match the template. Therefore an adversary who captures a user's biometric template can impersonate the user, and the user cannot recover from such compromise because the biometric template is not revocable. If a password is compromised, it can be changed, but a fingerprint that has been compromised cannot be changed.

The general concept of revocable biometrics is illustrated in slide 5. As in traditional biometrics, an enrollment code is extracted from an enrollment sample. But instead of deriving a template from the enrollment code, the code is combined with random bits produced by a random bit generator to produce two things: a biometric key that is registered with the verifier, and helper data that is stored for later use. At authentication time, the authentication code is combined with the helper data to regenerate the biometric key, which may be used for authentication in various ways, e.g. as a bearer token or as a symmetric signature key used to sign a challenge.
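To make the second option concrete, here is a minimal sketch of using a regenerated biometric key as a symmetric signature key. The choice of HMAC-SHA256, the function names and the challenge size are my own assumptions for illustration; nothing in the slides prescribes them.

```python
import hashlib
import hmac
import secrets


def sign_challenge(biometric_key: bytes, challenge: bytes) -> bytes:
    """User side: sign the verifier's challenge with the regenerated key."""
    return hmac.new(biometric_key, challenge, hashlib.sha256).digest()


def verify_response(registered_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the MAC with the registered key and compare."""
    expected = hmac.new(registered_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


# Hypothetical values: a key registered at enrollment and a fresh challenge.
registered_key = secrets.token_bytes(32)
challenge = secrets.token_bytes(16)

# A genuine user regenerates the same key, so the response verifies.
response = sign_challenge(registered_key, challenge)
```

Unlike a bearer token, the key itself never travels to the verifier at authentication time; only the response to a fresh challenge does.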

With hopefully high probability, the biometric key regenerated at authentication time is identical to the biometric key generated at enrollment time if the authentication sample is genuine, i.e. if it comes from the authentic user, even though the authentication code is not identical to the enrollment code. On the other hand, with hopefully overwhelming probability, a biometric key produced by combining the helper data with a code extracted from a sample supplied by an impostor is different from the biometric key generated at enrollment time.

The biometric key is revocable if compromised, because fresh random bits can be used to generate a new biometric key together with new helper data. Furthermore, it should be infeasible to derive any useful biometric information from the helper data.

There are several techniques for implementing the general concept of revocable biometrics. The technique that seems to have been most successful makes use of some error correction system, as shown in slide 7. At enrollment time, the biometric key is generated at random, then redundancy is added to it to produce a codeword of the error correction system. (The prefix code in codeword is error correction terminology, unrelated to the use of the word code in biometric code.) The codeword is then x-ored with the enrollment code to produce the helper data. At authentication time, the authentication code is x-ored with the helper data. As shown in the slide, because the x-or operation is associative and commutative, the result is equal to the x-or of the codeword and the bit string obtained by x-oring the authentication code with the enrollment code. If the authentication sample is genuine, the two codes are similar and the bit string consists mostly of 0's, with 1's at the bit positions where the codes differ. The effect of x-oring the bit string with the codeword is to toggle the bits of the codeword at those positions, an effect analogous to bit errors caused by transmission over a noisy channel. The error correction system is carefully chosen and configured so that it can correct the bit errors to recover the original codeword and the biometric key from which the codeword is derived.
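The enrollment and authentication steps just described can be sketched in a few lines. This is a toy, not the scheme from any paper: a 5-fold repetition code with majority-vote decoding stands in for a real error correction system, the codes are short lists of bits rather than real biometric codes, and the names `enroll` and `regenerate` are mine.

```python
import secrets

REPEAT = 5  # assumed toy parameter: corrects up to 2 flipped bits per group


def encode(key_bits):
    """Add redundancy: repeat each key bit REPEAT times to form the codeword."""
    return [b for b in key_bits for _ in range(REPEAT)]


def decode(codeword_bits):
    """Majority vote over each group of REPEAT bits recovers the key bits."""
    return [
        int(sum(codeword_bits[i:i + REPEAT]) > REPEAT // 2)
        for i in range(0, len(codeword_bits), REPEAT)
    ]


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def enroll(enrollment_code):
    """Generate a random biometric key and the helper data that hides it."""
    key = [secrets.randbits(1) for _ in range(len(enrollment_code) // REPEAT)]
    helper = xor(encode(key), enrollment_code)
    return key, helper


def regenerate(authentication_code, helper):
    """X-or with the helper data yields a noisy codeword; decoding cleans it."""
    return decode(xor(authentication_code, helper))


# Stand-in for a 40-bit enrollment code extracted from a sample.
enrollment_code = [secrets.randbits(1) for _ in range(40)]
key, helper = enroll(enrollment_code)

# A genuine authentication code differs from it in a few bit positions.
authentication_code = list(enrollment_code)
for i in (0, 7, 13, 38):
    authentication_code[i] ^= 1

regenerated = regenerate(authentication_code, helper)  # equals key
```

Revoking the key amounts to calling `enroll` again on the same enrollment code: fresh random bits give a new key and new helper data. With an 8-bit toy key an impostor would of course succeed far too often, which is the entropy problem in miniature.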

It seems to be generally acknowledged that the best results with revocable biometrics were obtained by Hao, Anderson and Daugman in 2005 working with iris images. Their work is reported in the paper “Combining biometrics with cryptography effectively”, IEEE Transactions on Computers 55(9), pages 1081-1088, 2006. An earlier version of the paper is available online as Technical Report Number 640 of the University of Cambridge Computer Laboratory. They reported generating a 140-bit biometric key and achieving a 0.47% False Rejection Rate (FRR) with an error correction system configured to achieve a 0% False Acceptance Rate (FAR), in one particular experiment that I won't try to describe here.

Slide 8 points out two serious caveats of revocable biometrics. One general caveat is that there is a tradeoff between the FRR that can be achieved for a 0% FAR, and the entropy of the biometric key. For modalities other than iris, a reasonable FRR may only be achievable with biometric keys that have very low entropy. A caveat specific to revocable biometrics based on error correction technology is the fact that the x-or of the helper data with the codeword, which anyone holding the biometric key can recompute, is the enrollment code. Even if the helper data by itself reveals no useful biometric information, the user's biometric is still vulnerable to an adversary who captures both the helper data and the biometric key.
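The second caveat is easy to see in code. In this sketch, which again assumes a toy repetition code in place of a real error correction system and uses made-up bit values, an adversary holding both the helper data and the biometric key re-encodes the key and x-ors the result with the helper data.

```python
REPEAT = 4  # assumed toy repetition factor


def encode(key_bits):
    """Re-derive the codeword from the key (the encoding is public)."""
    return [b for b in key_bits for _ in range(REPEAT)]


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def recover_enrollment_code(helper, key):
    """Adversary's computation: helper x-or codeword = enrollment code."""
    return xor(helper, encode(key))


# Made-up enrollment-time values.
enrollment_code = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
key = [1, 0, 1]
helper = xor(encode(key), enrollment_code)

# With both pieces in hand, the enrollment code comes straight back.
leaked = recover_enrollment_code(helper, key)
```

So revoking a compromised key protects future authentications, but it does not undo the exposure of the enrollment code if the helper data leaked as well.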

After the first eight slides, there was an interesting discussion among the participants in the session, which included several biometric experts. One issue that was discussed is why revocable biometrics are not better known and are not deployed in the real world, even though there is a large body of academic literature on the subject. One possible reason is technology transfer failure. A second reason may be the fact that revocable biometrics can be used for authentication but not for identification. A third reason could be that good results are perhaps only achievable in a laboratory setting: Hao, Anderson and Daugman reported using infrared light for their experiments, and it was reported during the discussion that Daugman hired an ophthalmologist to photograph the iris images.

Another issue that was discussed is that revocable biometrics do not address the problem of biometric spoofing. The fingerprint sensors available on smartphones are easily spoofed. This is related to the issue of liveness: a biometric authentication factor is the “something you are” element of the triple (“something you know”, “something you have”, “something you are”), only if it can be verified that the biometric sample comes from the body of the individual who is trying to authenticate; and verifying liveness is difficult.

A third issue that was discussed is how to cope with the low entropy of biometric keys, and of biometric authentication factors in general. One solution that was mentioned is to combine biometrics with other factors. I then moved on to slides 10-11, which are related to multifactor authentication. Slide 10 proposes three-factor authentication using a biometric key, a password, and an uncertified key pair. Slide 11 points out that the password used in such three-factor authentication deserves protection against a security breach of a back-end database used by the verifier, even though it cannot by itself be used to authenticate to the verifier. Users often reuse passwords, and therefore a password has intrinsic value for the user, which should be protected.

Two methods of protecting the password are sketched out. One method is to store in the database a joint hash of the password, the biometric key, and the public key component of the uncertified key pair, instead of hashing the password with a salt stored in the database, as is usually done. The public key, which has high entropy, is treated as a shared secret between the verifier and the user's computing device. If it is not stored in the verifier's database, an adversary who breaches the security of the database and captures the joint hash is not able to use it to mount a dictionary attack against the password.
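A rough sketch of the joint hash idea follows. The slides do not specify a hash function or an input encoding, so the use of SHA-256, the length-prefixed concatenation and the function names are assumptions made purely for illustration; a deliberately slow hash could be substituted.

```python
import hashlib
import hmac


def joint_hash(password: str, biometric_key: bytes, public_key: bytes) -> bytes:
    """Hash the three factors together.

    Each field is length-prefixed so that different splits of the same
    bytes cannot collide (an assumed encoding, not from the slides).
    """
    h = hashlib.sha256()
    for field in (password.encode("utf-8"), biometric_key, public_key):
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()


def verify(stored: bytes, password: str, biometric_key: bytes, public_key: bytes) -> bool:
    """Verifier side: the public key is supplied by the user's device at
    authentication time; it is not read from the database."""
    return hmac.compare_digest(stored, joint_hash(password, biometric_key, public_key))


# Hypothetical registration: only the joint hash goes into the database.
stored = joint_hash("correct horse", b"\x01" * 16, b"example-public-key-bytes")
```

An adversary who steals `stored` but not the public key cannot test password guesses against it, because every guess would also require guessing the high-entropy public key.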

Another method is to use the password and the biometric key to regenerate the uncertified key pair from a protocredential, instead of sending them to the verifier as bearer tokens.

We have a patent granted on the protocredential method (US patent 9,185,111) and a patent pending on the joint hash method.