Smart Cards, TEEs and Derived Credentials

This post has also been published on the blog of the GlobalPlatform TEE Conference.

Smart cards and mobile devices can both be used to carry cryptographic credentials. Smart cards are time-tested vehicles, which provide the benefits of low cost and widely deployed infrastructures. Mobile devices, on the other hand, are emerging vehicles that promise new benefits such as built-in network connections, a built-in user interface, and the rich functionality provided by mobile apps.

Derived Credentials

It is tempting to predict that mobile devices will replace smart cards, but this will not happen in the foreseeable future. Mobile devices are best used to carry credentials that are derived from primary credentials stored in a smart card. Each user may choose to carry derived credentials on zero, one or multiple devices in addition to the primary credentials in a smart card, and may obtain derived credentials for new devices as needed. The derived credentials in each mobile device are functionally equivalent to the primary credentials, and are installed into the device by a device registration process that does not need to duplicate the user proofing performed for the issuance of the primary credentials.

The term derived credentials was coined by NIST in connection with credentials carried by US federal employees in Personal Identity Verification (PIV) cards and US military personnel in Common Access Cards (CAC); but the concept is broadly applicable. Derived credentials can be used for a variety of purposes, Continue reading “Smart Cards, TEEs and Derived Credentials”

Forthcoming Presentation at the GlobalPlatform TEE Conference

I’m happy to announce that I’ll be making a presentation at the forthcoming GlobalPlatform 2014 TEE Conference (September 29-30, Santa Clara, CA). Here are the title and abstract:

Virtual Tamper Resistance for a TEE

Derived credentials are cryptographic credentials carried in a mobile device that are derived from credentials carried in a smartcard. The term was coined by the US National Institute of Standards and Technology (NIST) in connection with US Federal employee credentials, but the concept is generally applicable to use cases encompassing high-security enterprise IDs, payment cards, national identity cards, driver licenses, etc.

The Trusted User Interface feature of a TEE can protect the passcode that activates derived credentials from being phished or intercepted by malware, the user being instructed to only enter the passcode when a Security Indicator shows that the touchscreen is controlled by the Secure OS of the TEE. Besides protecting the passcode, it is also necessary to protect the derived credentials themselves from an adversary who physically captures the device. This requires resistance against tampering. Physical tamper resistance can be provided by a Secure Element accessed from the TEE through the TEE Secure Element API, thus combining protection of the passcode against malware with protection of the credentials against physical capture.

Derived credentials can also be protected against physical capture using cloud-based virtual tamper resistance, which is achieved by encrypting them with a key stored in a secure back-end. The device uses a separate credential derived in part from the activation passcode to authenticate to the back-end and retrieve the encryption key. A novel technique makes it possible to do so without exposing the passcode to an offline guessing attack, so that a short numeric passcode is sufficient to provide strong security.

Physical tamper resistance and virtual tamper resistance have overlapping but distinct security postures, and can be combined, if desired, to maximize security.

Identity-Based Protocol Design Patterns for Machine-to-Machine Secure Channels

Cryptography is an essential tool for addressing the privacy and security issues faced by the Web and the Internet of Things. Sadly, however, there is a chronic technology transfer failure that causes important cryptographic techniques to be underutilized.

An example of an underutilized technique is Identity-Based Cryptography. It is used for secure email, although not broadly. But, to my knowledge, it has never been used to implement secure channel protocols, even though it has the potential to provide great practical advantages over traditional public key infrastructure if put to such usage. We pointed this out in our white paper on TLS. Now we have also shown the benefits of identity-based cryptography for machine-to-machine communications, in a new paper that we will present at the Workshop on Security and Privacy in Machine-to-Machine Communications (M2MSec, San Francisco, October 29, 2014). Machine-to-machine communications fall into many different use cases with very different requirements. So, instead of proposing one particular technique, we propose in the paper four different protocol design patterns that could be used to specify a variety of different protocols.

Update (August 4). I should point out that there is a proposal to use Identity-Based authenticated key exchange in conjunction with MIKEY (Multimedia Internet KEYing), a key management scheme for SRTP (Secure Real-Time Transport Protocol), which itself is used to provide security for audio and video conferencing on the Internet. The proposed authenticated key exchange protocol is called MIKEY-IBAKE and is described in RFC 6267. This is an informational RFC rather than a standards-track RFC, so it’s not clear if the proposed authenticated key exchange method will be eventually deployed. Interestingly, MIKEY-IBAKE uses identity-based encryption rather than identity-based key agreement. This is also what we do in the M2MSec paper, but with a difference. MIKEY-IBAKE uses identity-based encryption to carry ephemeral Elliptic Curve Diffie-Hellman parameters, and thus does not reduce the number of roundtrips. We use identity-based encryption to send a secret from the initiator to the responder, and we eliminate roundtrips by simultaneously sending application data protected with encryption and authentication keys derived from the secret. This gives up replay protection and forward secrecy for the first message; but replay protection, as well as forward secrecy in two of the four patterns, are provided from the second message onward.

Invited Talk at the University of Utah

I’ve been so busy that I haven’t had time to write for more than three months, which is a pity because things have been happening and there is much to report. I’m trying to catch up today.

The first thing to report is that Prof. Gopalakrishnan of the University of Utah invited Karen Lewison and myself to give a joint talk at the University, on May 29. We talked about the need to replace TLS, which I’ve discussed earlier on this blog. The slides can be found at the usual location for papers and presentations at the bottom of each page of this web site.

The University of Utah has a renowned School of Computing and it was quite stimulating to meet with faculty and discuss research after the talk. We were happy to discover common research interests, and we have been exploring the possibility of doing joint research work with Profs. Ganesh Gopalakrishnan, Sneha Kasera, and Tammy Denning; we are thrilled that the prospects look promising.

Other things to report include that we had papers accepted at the forthcoming M2MSec workshop and the forthcoming GlobalPlatform TEE conference. I will report on that in the next two posts.

Derived Credentials in a Trusted Execution Environment (TEE)

In the previous post I discussed the storage of derived credentials (Federal credentials carried in a mobile device instead of a PIV/CAC card) in a software token, i.e. in a cryptographic module implemented entirely in software, whose contents are stored in ordinary flash memory. In this post I will discuss the storage of derived credentials in a Trusted Execution Environment (TEE).

Malware Attacks

As discussed in the previous post and in a technical report, it is possible to protect derived credentials stored in ordinary flash storage by encrypting them under a high entropy key-wrapping key kept in a secure back-end, which the mobile device retrieves by authenticating to the back-end with a key pair regenerated from a protocredential and an activation passcode.

This provides effective protection against an adversary who captures the device while the software token is not active, preventing the adversary from extracting or using the credentials. But it does not provide protection against malware running on the device while the legitimate user is using the device. Such malware can carry out the following attacks:

  1. It can use the derived credentials, by issuing instructions to the software token after it has been activated by the legitimate user.
  2. It can read the plaintext derived credentials from the flash storage after the software token has been activated, and transmit them to the adversary responsible for the malware, who can then use them at will on a different machine.
  3. It can capture the activation passcode by phishing or intercepting it. In a phishing attack, malware prompts the user for the passcode while masquerading as legitimate code that needs the passcode, such as token activation code. In an interception attack, malware gets the passcode after it has been obtained from the user by legitimate code.

The first of these attacks may be impossible to prevent once privileged malware is running on the mobile device without the user being aware of it. But the second and third attacks can be prevented using a TEE as we shall see below; and preventing them is important because they are more damaging than the first attack.

The second attack, extracting the credentials and sending them to the adversary, is more damaging than the first because it cannot be stopped by recovering or wiping the stolen device. Use of an authentication or signature private key cannot be stopped until the associated certificate is revoked and relying parties become aware of the revocation. Correspondents should avoid sending messages encrypted under a symmetric key wrapped by a “key management” public key after becoming aware that the key management certificate has been revoked. But there is no time limit for using a key management private key to decrypt earlier messages that the adversary may have previously captured or may capture in the future, e.g. by breaching the security of a MS Exchange server containing older encrypted messages.

The third attack is even more damaging for several reasons. First, it enables the first two attacks, because once it has the passcode, malware can activate the software token and use and extract the plaintext derived credentials. Second, if the adversary captures the device after using malware to obtain the passcode, the adversary can use the device, or install more comprehensive malware that is able to extract the credentials. Third, the passcode may be independently exploitable because it may be used for other purposes.

A TEE has security features that make it possible to prevent the second and third attacks.

Features of a TEE

A TEE is a computing environment provided by a secure OS running on the same processor as a normal OS. One or more trusted applications (TAs) run under the secure OS. A hardware bus architecture ensures that a portion of the flash storage can only be accessed by the secure OS. Both OSes can access the touchscreen, but a security indicator lets the user know when the screen is controlled by the secure OS and the user interface can be trusted. GlobalPlatform is developing TEE specifications, including a Trusted User Interface API specification, which can be downloaded from the GlobalPlatform site. TEEs are provided by ARM Cortex-A processors, where a TEE is also referred to as a TrustZone. A TA running in a TEE can be used to implement a cryptographic module in which derived credentials can be stored and used.

Using a TEE to Protect Derived Credentials

Derived credentials stored and used in a cryptographic module implemented within a TEE can be protected against the second malware attack discussed above by making their private keys unextractable from the cryptographic module. The ability to mark private keys as being unextractable is a typical feature of cryptographic modules. The PKCS #11 cryptographic module API, for example, allows private keys to be made non-extractable by setting the value of their CKA_EXTRACTABLE attribute to CK_FALSE. The forthcoming TEE Functional API, mentioned in the TEE white paper, will no doubt allow private keys stored in a cryptographic module within a TEE to be made non-extractable as well.

Furthermore, derived credentials stored in a cryptographic module within a TEE can be protected against the third malware attack using the Trusted User Interface feature of the TEE. The passcode can be prompted for by the TA that implements the cryptographic module, and the user can be instructed to only enter the passcode when a Security Indicator shows that the touchscreen is controlled by the Secure OS of the TEE. The passcode is then protected against phishing and interception by malware, assuming that all TAs can be trusted and that the secure OS is not infected by malware. The latter assumption is motivated by the fact that the secure OS is simpler than the normal OS and presents a much smaller attack surface.

Virtual Tamper Resistance

Using the same processor and a portion of the same storage for the secure OS as for the normal OS has important benefits. It provides greater performance for the secure OS than would typically achieved by a secondary processor located in a secure element, and it saves the cost of the secure element. On the other hand, it means that a TEE is not expected to provide much, if any, tamper resistance. Indeed, the TEE Secure Element API, available at the GlobalPlatform site, is concerned with using together a TEE and a secure element, with the TEE providing a Trusted User Interface, and the secure element providing tamper resistance.

(BTW, some secure elements do provide serious tamper resistance, but tamper resistance is never absolute. A fascinating description of the elaborate anti-tampering countermeasures in a family of Infineon chips, and how they were defeated by an attacker with no insider knowledge, can be found in an 80-minute video demonstration—broken down into ten eight-minute segments—presented at Black Hat 2010.)

But the lack of tamper resistance in a TEE can be remedied using the same technique that I described in the previous post as a solution to the problem of protecting derived credentials stored in a software token. Encrypting the derived credentials under a high entropy key-wrapping key, kept in a secure back-end and retrieved by authenticating to the back-end with a key pair regenerated from a protocredential and an activation passcode, can be viewed as a form of cloud-based virtual tamper resistance.

Combining such virtual tamper resistance with the TEE Trusted User Interface feature would make it possible to implement a cryptographic module that would protect both the derived credentials and their activation passcode.

Protecting Derived Credentials without Secure Hardware in Mobile Devices

NIST has recently released drafts of two documents with thoughts and guidelines related to the deployment of derived credentials,

and requested comments on the drafts by April 21. We have just sent our comments and we encourage you to send yours.

Derived credentials are credentials that are derived from those in a Personal Identity Verification (PIV) card or Common Access Card (CAC) and carried in a mobile device instead of the card. (A CAC card is a PIV card issued by the Department of Defense.) The Electronic Authentication Guideline, SP 800-63, defines a derived credential more broadly as:

A credential issued based on proof of possession and control of a token associated with a previously issued credential, so as not to duplicate the identity proofing process.

A PIV/CAC card may carry a PIV authentication credential, a digital signature credential, a current key management credential and up to 20 retired key management credentials, each credential consisting of a private key and an associated certificate that contains the corresponding public key. The digital signature private key is used for signing email messages, and the key management keys for decrypting symmetric keys used to encrypt email messages. The retired key management keys are needed to decrypt old messages that have been saved encrypted. The PIV authentication credential is mandatory for all users, while the digital signature credential and the current key management credential are mandatory for users who have government email accounts.

A mobile device may similarly carry an authentication credential, a digital signature credential, and current and retired key management credentials. Although this is not fully spelled out in the NIST documents, the current and retired key management private keys in the mobile device should be able to decrypt the same email messages as those in the card, and therefore should be the same as those in the card, except that we see no need to limit the number of retired key management private keys to 20 in the mobile device. The key management private keys should be downloaded to the mobile device from the escrow server that should already be in use today to recover from the loss of a PIV/CAC card containing those keys. On the other hand the authentication and digital signature key pairs should be generated in the mobile device, and therefore should be different from those in the card.

In a puzzling statement, SP 800-157 insists that only an authentication credential can be considered a “derived PIV credential”:

While the PIV Card may be used as the basis for issuing other types of derived credentials, the issuance of these other credentials is outside the scope of this document. Only derived credentials issued in accordance with this document are considered to be Derived PIV credentials.

Nevertheless, SP 800-157 discusses details related to the storage of digital signature and key management credentials in mobile devices in informative appendix A and normative appendix B.

Software Tokens

The NIST documents provide guidelines regarding the lifecycle of derived credentials, their linkage to the lifecycle of the PIV/CAC card, their certificate policies and cryptographic specifications, and the storage of derived credentials in several kinds of hardware cryptographic modules, which the documents refer to as hardware tokens, including microSD tokens, UICC tokens, USB tokens, and embedded hardware tokens. But the most interesting, and controversial, aspect of the documents concerns the storage of derived credentials in software tokens, i.e. in cryptographic modules implemented entirely in software.

Being able to store derived credentials in software tokens would mean being able to use any mobile device to carry derived credentials. This would have many benefits:

  1. Federal agencies would have the flexibility to use any mobile devices they want.
  2. Federal agencies would be able to use inexpensive devices that would not have to be equipped with special hardware for secure storage of derived credentials. This would save taxpayer money and allow agencies to do more with their IT budgets.
  3. Mobile authentication and secure email solutions used by the Federal Government would be affordable and could be broadly used in the private sector.

The third benefit would have huge implications. Today, the requirement to use PIV/CAC cards means that different IT solutions must be developed for the government and for the private sector. IT solutions specifically developed for the government are expensive, while private sector solutions too often rely on passwords instead of cryptographic credentials. Using the same solutions for the government and the private sector would lower costs and increase security.

Security

But there is a problem. The implementation of software tokens hinted at in the NIST documents is not secure.

NISTIR 7981 describes a software token as follows:

Rather than using specialized hardware to store and use PIV keys, this approach stores the keys in flash memory on the mobile device protected by a PIN or password. Authentication operations are done in software provided by the application accessing the IT system, or the mobile OS.

And SP 800-157 adds the following:

For software implementations (LOA-3) of Derived PIV Credentials, a password-based mechanism shall be used to perform cryptographic operations with the private key corresponding to the Derived PIV Credential. The password shall meet the requirements of an LOA-2 memorized secret token as specified in Table 6, Token Requirements per Assurance Level, in [SP800-63].

Taken together, these two paragraphs seem to suggest that the derived credentials should be stored in ordinary flash memory storage encrypted under a data encryption key derived from a PIN or password satisfying certain requirements. What requirements would ensure sufficient security?

Smart phones are frequently stolen, therefore we must assume that an adversary will be able to capture the mobile device. After capturing the device the adversary can immediately place it in a metallic box or other Faraday cage to prevent a remote wipe. The contents of the flash memory storage may be protected by the OS, but in many Android devices, the OS can be replaced, or rooted, with instructions for doing so provided by Google or the manufacturer. OS protection may be more effective in some iOS devices, but since a software token does not provide any tamper resistance by definition, we must assume that the adversary will be able to extract the encrypted credentials. Having done so, the adversary can mount an offline password guessing attack, testing each password guess by deriving a data encryption key from the password, decrypting the credentials, and checking if the resulting plaintext contains well-formed credentials. To carry out the password guessing attack, the adversary can use a botnet. Botnets with tens of thousands of computers can be easily rented by the day or by the hour. Botnets are usually programmed to launch DDOS attacks, but can be easily reprogrammed to carry out password cracking attacks instead. The adversary has at least a few hours to run the attack before the authentication and digital signature certificates are revoked and the revocation becomes visible to relying parties; and there is no time limit for decrypting the key management keys and using them to decrypt previously obtained encrypted email messages.

To resist such an attack, the PIN or password would need to have at least 64 bits of entropy. According to Table A.1 of the Electronic Authentication Guideline (SP 800-63), a user-chosen password must have more than 40 characters chosen appropriately from a 94-character alphabet to achieve 64 bits of entropy. Entering such a password on the touchscreen keyboard of a smart phone is clearly unfeasible.

SP 800-157 calls instead for a password that meets the requirements of an LOA-2 memorized secret token as specified in Table 6 of SP 800-63, which are as follows:

The memorized secret may be a randomly generated PIN consisting of 6 or more digits, a user generated string consisting of 8 or more characters chosen from an alphabet of 90 or more characters, or a secret with equivalent entropy.

The equivalent entropy is only 20 bits. Why does Table 6 require so little entropy? Because it is not concerned with resisting an offline guessing attack against a password that is used to derive a data encryption key. It is instead concerned with resisting an online guessing attack against a password that is used for authentication, where password guesses can only be tested by attempting to authenticate to a verifier who throttles the rate of failed authentication attempts. In Table 6, the quoted requirement on the memorized secret token is coupled with the following requirement on the verifier:

The Verifier shall implement a throttling mechanism that effectively limits the number of failed authentication attempts an Attacker can make on the Subscriber’s account to 100 or fewer in any 30-day period.

and the necessity of the coupling is emphasized in Section 8.2.3 as follows:

When using a token that produces low entropy token Authenticators, it is necessary to implement controls at the Verifier to protect against online guessing attacks. An explicit requirement for such tokens is given in Table 6: the Verifier shall effectively limit online Attackers to 100 failed attempts on a single account in any 30 day period.

Twenty bits is not sufficient entropy for encrypting derived credentials, and requiring a password with sufficient entropy is not a feasible proposition.

Solutions

But the problem has solutions. It is possible to provide effective protection for derived credentials in a software token.

One solution is to encrypt the derived credentials under a high-entropy key that is stored in a secure back-end and retrieved when the user activates the software token. The problem then becomes how to retrieve the high-entropy key from the back-end. To do so securely, the mobile device must authenticate to the back-end using a device-authentication credential stored in the mobile device, which seems to bring us back to square one. However, there is a difference between the device-authentication credential and the derived credentials stored in the token: the device-authentication credential is only needed for the specific purpose of authenticating the device to the back-end and retrieving the high-entropy key. This makes it possible to use as device-authentication credential a credential regenerated on demand from a PIN or password supplied by the user to activate the token and a protocredential stored in the device, in a way that deprives an attacker who captures the device of any information that would make it possible to test guesses of the PIN or password offline.

The device-authentication credential can consist, for example, of a DSA key pair whose public key is registered with the back-end, coupled with a handle that refers to a device record where the back-end stores a hash of the registered public key. In that case the protocredential consists of the device record handle, the DSA domain parameters, which are (p,q,g) with the notations of the DSS, and a random high-entropy salt. To regenerate the DSA key pair, a key derivation function is used to compute an intermediate key-pair regeneration key (KPRK) from the activation PIN or password and the salt, then the DSA private and public keys are computed as specified in Appendix B.1.1 of the DSS, substituting the KPRK for the random string returned_bits produced by a random number generator.

To authenticate to the back-end and retrieve the high-entropy key, the mobile device establishes a TLS connection to the back-end, over which it sends the device record handle, the DSA public key, and a signature computed with the DSA private key on a challenge derived from the TLS master secret. (Update—April 24, 2014: The material used to derive the challenge must also include the TLS server certificate of the back-end, due to a recently reported UKS vulnerability of TLS. See footnote 2 of the technical report.) The DSA public and private keys are deleted after authentication, and the back-end keeps the public key confidential. An adversary who is able to capture the device and extract the protocredential has no means of testing guesses of the PIN or password other than regenerating the DSA key pair and attempting online authentication to the back-end, which locks the device record after a small number of consecutive failed authentication attempts that specify the handle of the record.

An example of a derived credentials architecture that uses this solution can be found in a technical report.

Other solutions are possible as well. The device-authentication credential itself could serve as a derived credential, as we proposed earlier; SSO can then be achieved by sharing login sessions, as described in Section 7.5 of a another technical report. And I’m sure others solutions can be found.

Other Topics

There are several other topics related to derived credentials that deserve discussion, including the pros and cons of storing credentials in a Trusted Execution Environment (TEE), whether biometrics should be used for token activation, and whether derived credentials should be used for physical access. I will leave those topics for future posts.

Update (April 10, 2014). A post discussing the storage of derived credentials in a TEE is now available.

It’s Time to Redesign Transport Layer Security

One difficulty faced by privacy-enhancing credentials (such as U-Prove tokens, Idemix anonymous credentials, or credentials based on group signatures), is the fact that they are not supported by TLS. We noticed this when we looked at privacy-enhancing credentials in the context of NSTIC, and we proposed an architecture for the NSTIC ecosystem that included an extension of TLS to accommodate them.

Several other things are wrong with TLS. Performance is poor over satellite links due to the additional roundtrips and the transmission of certificate chains during the handshake. Client and attribute certificates, when used, are sent in the clear. And there has been a long list of TLS vulnerabilities, some of which have not been addressed, while others are addressed in TLS versions and extensions that are not broadly deployed.

The November SSL Pulse reported that only 18.2% of surveyed web sites supported TLS 1.1, which dates back to April 2006, only 20.7% supported TLS 1.2, which dates back to August 2008, and only 30.6% had server-side protection against the BEAST attack, which requires either TLS 1.1 or TLS 1.2. This indicates upgrade fatigue, which may be due to the age of the protocol and the large number of versions and extensions that it has accumulated during its long life. Changing the configuration of a TLS implementation to protect against vulnerabilities without shutting out a large portion of the user base is a complex task that IT personnel is no doubt loath to tackle.

So perhaps it is time to restart from scratch, designing a new transport layer security protocol — actually, two of them, one for connections and the other for datagrams — that will incorporate the lessons learned from TLS — and DTLS — while discarding the heavy baggage of old code and backward compatibility requirements.

We have written a new white paper that recapitulates the drawbacks of TLS and discusses ingredients for a possible replacement.

The paper emphasizes the benefits of redesigning transport layer security for the military, because the military in particular should be very much interested in better transport layer security protocols. The military should be interested in better performance over satellite and radio links, for obvious reasons. It should be interested in increased security, because so much is at stake in the security of military networks. And I would argue that it should also be interested in increased privacy, because what is viewed as privacy on the Internet may be viewed as resistance to traffic analysis in military networks.

Surveillance and Internet Identity

Last week I attended IIW 17, the 17th meeting of the Internet Identity Workshop, which is held twice a year in Mountain View, California. As usual it was a great opportunity to exchange ideas and meet people, with its unconference format, its many sessions, its rotating demos, its wide space for discussions, and its two free dinners with free drinks.

For me, however, it was tinged with sadness, because of what has happened since the first IIW I attended, IIW 12, in May 2011. IIW 12 was the first IIW after the launch of NSTIC. IIW 17 was the first IIW after Snowden.

The NSTIC Strategy Document, released in April 2011 with a preface signed by President Obama, repeatedly emphasized the goal of enhancing privacy as a key element of the “vision” and “guiding principles” of NSTIC. The document explicitly stated that the Identity Ecosystem will use privacy-enhancing technology and policies to inhibit the ability of service providers to link an individual’s transactions, thus ensuring that no one service provider can gain a complete picture of an individual’s life in cyberspace. At the time, Facebook Connect was threatening to inject Facebook as a middleman in all or most Internet activities, and I was happy to see that the US Government seemingly wanted to prevent such a massive invasion of privacy; I even convened a session at IIW 12 proposing a technique for achieving the privacy goals of NSTIC in the short term. Little did I know that the government was busy building a massive surveillance apparatus that would give the government a complete picture of an individual’s life in cyberspace, by means including bulk collection of data from service providers.

The Internet, given to the world by the US Department of Defense, was a world-wide forum for free-flowing, spontaneous exchange of ideas. Now the NSA, part of the same Department of Defense, has taken that away. People know that they are being tracked and identified when they post an anonymous comment. People know that their conversations are being recorded. Therefore people must think twice about they say.

I don’t know if Congress will be able to rein in the NSA. It should be clear that spying on US citizens is unconstitutional, but some politicians think that it is the NSA’s job to spy on everybody else in the planet. They don’t seem to consider or care that, if the US Government insists on a God-given right to spy on everybody else, other countries or regions may develop their own national or regional networks, separated from the US Internet by an air gap.

Fortunately, the technical community has reacted strongly against the NSA’s attacks on Internet privacy. And thanks to Snowden’s revelations, many of the attack techniques are known. It may therefore be possible to protect Internet privacy by technical means.

Coming back to the subject of the workshop, Internet Identity, I would argue that the first thing to do to protect Internet privacy is to get rid of the pernicious technology variously known as third-party login, social login or federated login. To be precise, I am referring to authentication techniques where the user authenticates to a third-party identity provider, which then provides identity and/or attribute information to a relying party, using a protocol such as OAuth or OpenID Connect. (These are the techniques in Group 2 of the taxonomy proposed in the paper Privacy Postures of Authentication Technologies.)

The only intrinsic advantage of federated login is that it allows the identity provider to collect vast amounts of information about the user, since the identity provider learns not only the user’s identity and/or attributes, but also what relying parties the user logs in to. The identity provider uses the information to sell ads that target the user accurately. We now know that the information is also shared with the government, which makes it available to thousands of analysts and IT personnel who use it for legal or illegal government or personal purposes.

There are no other intrinsic advantages to federated login.

The government and the identity providers argue that federated login is more secure than direct authentication to the relying party with username and password, but the opposite is true.

Security is supposedly increased because federated login reduces password reuse. But password reuse will not be substantially reduced unless a large majority of world-wide web sites force their users to use federated login with one of a small number of global identity providers such as Google or Facebook, something that will hopefully not come to pass.

Security is also supposedly increased because a large identity provider supposedly does a better job of protecting the user’s password. But I don’t know why a large identity provider would provide better protection against hackers, since large companies are not known to provide great security. And I do know that a password entrusted to a large identity provider may become available to thousands of employees of the government, of government contractors, and of the identity provider.. And the capture of a password used at an identity provider, which provides access to multiple web sites, is more damaging to the user than the capture of a password used at a single web site.

There is an alternative to authenticating to a web site with username and password that provides both security and privacy: namely, authentication with a cryptographic key pair automatically generated on the user’s machine when the user registers with the site. The site stores the hash of the public key component of the key pair in its database, and uses it to locate the user’s account when the user visits the site again and demonstrates knowledge of the private key component.

Another claimed advantage of federated login is that the user can register at a new site with a single click if logged in to the identity provider, any personal data required by the site being provided by the identity provider. This is a real advantage, but not an intrinsic one. The same benefit could be easily obtained by storing the personal data in the browser, and specifying a protocol by which the browser would supply selected personal data items to a web site upon demand by the site and approval by the user. Such a protocol would be much simpler than any of the federated login protocols and would provide more security and more privacy.

Yet another claimed advantage of federated login is that the identity provider could provide the relying party with a user’s identity and/or attributes verified by an identity proofing procedure; however, such verified identity and/or attributes could equally well be provided by a certificate authority using a public key certificate (or by multiple authorities providing a combination of a certificate binding a public key to an identity and one or more certificates binding the identity to various attributes), without the certificate authority having to be informed of what relying parties the certificate is submitted to.

It is sometimes argued (cf. the NSTIC 101 session at last week’s IIW) that using public key cryptography for authentication would be expensive and would require the user to carry a separate dongle or smartcard for every credential. This is not true. There is no need for special hardware to store a cryptographic credential, and if special hardware is desired for some reason, there is no need to use different pieces of hardware for different credentials.

Two sessions at IIW 17 gave me hope that Internet privacy is not a lost cause.

One of them was convened by Tim Bray of Google to report on the comments he received in response to a blog post arguing to developers that they should use federated login rather than login with username and password. The comments, which he referred to as a “bloodbath,” showed that neither developers nor end-users like federated login. I hope that such pushback will eventually force companies like Google to give up on federated login.

The other one was convened by Kazue Sako of NEC to discuss anonymous credentials and their possible uses. The room was overflowing and the level of engagement of the audience was high, showing that technical people are interested in privacy-enhancing authentication technologies even if large companies are not.

Feedback on the Paper on Privacy Postures of Authentication Technologies

Many thanks to every one who provided feedback on the paper on privacy postures of authentication technologies which was announced in the previous blog post. The paper was discussed on the Identity Commons mailing list and we also received feedback at the ID360 conference, where we presented the paper, and at IIW 16, where we showed a poster summarizing the paper. In this post I will recap the feedback that we have received and the revisions that we have made to the paper based on that feedback.

Steven Carmody pointed out that SWITCH, the Swiss InCommon federation, has developed an extension of Shibboleth called uApprove that allows the identity or attribute provider to ask the user for consent before disclosing attributes to the relying party. Ken Klingestein told us that the Scalable Privacy NSTIC pilot is developing a privacy manager that will let the user choose what attributes will be disclosed to the the relying party by the Shibboleth identity provider. We have added references to these Shibboleth extensions to Section 4.2 of the paper.

The original paper explained that, although a U-Prove token does not provide multishow unlinkability, the user may obtain multiple tokens from the issuer, and present different tokens to different relying parties. Christian Paquin said that a U-Prove credential is defined as a batch of such tokens, created simultaneously by an efficient parallel procedure. We have added this definition of a U-Prove credential to Section 4.3.

Christian Paquin also pointed out that a U-Prove token is a mathematical concept that can be embodied in a variety of technologies. He sent me a link to the WS-Trust embodiment, which was used in CardSpace. We have explained this and included the link in Section 4.3.

Tom Jones said that what we call anonymity is called pseudonymity by others. In fact, column 9, labeled “Anonymity”, covers both pseudonymity, as provided, e.g., by an Idemix pseudonym or an uncertified key pair or a combination of a user ID and a password when the user ID is freely chosen by the user, and full anonymity, as provided when a relying party learns only attributes that do not uniquely identify the user. I think it is not unreasonable to view anonymity (the service provider does not learn the user’s “name”) as encompassing pseudonymity (the service provider learns a pseudonym instead of the “real name”).

Nat Sakimura provided a lot of feedback, for which we are grateful. He said that Google and Yahoo implemented OpenID Pairwise Pseudonymous Identifiers (PPID), i.e. different identifiers for the same user provided to different relying parties, before ICAM specified its OpenID profile. We have noted this in Section 4.2 of the revised paper and changed the label of row 8 to “OpenID (without PPID)”.

He also said that OpenID Connect supports an ephemeral identifier, which provides anonymity. I was able to find a discussion of an ephemeral identifier in the archives of the OpenID Connect mailing list, but no mention of it in any of the OpenID Connect specifications; so ephemeral identifiers may be added in the future, but they are not there yet.

Nat also argued that OpenID Connect provides multishow unlinkability by different parties and by the same party. I disagree, however. The Subject Identifier in the ID Token makes OpenID Connect authentication events linkable. Furthermore, OpenID is built on top of OAuth, whose purpose is to provide the relying party with access to resources owned by the user by means of an access token. In a typical use case the relying party gets access to the user’s account at a social network such as Facebook, Twitter or Google+. It is unlikely that two relying parties who share information cannot determine that they are both accessing the same account, or that a relying party cannot determine that it has accessed the same account in two different occasions.

Nat said that OpenID Connect can be used for two-party authentication using a “Self-Issued OpenID Provider”. We have added a checkmark to row 11, column 1 of the table to indicate this, and an explanation to Section 4.2.

He also said that OpenID Connect provides group 4 functionality by allowing the relying party to obtain attributes from “distributed attribute providers”. We have mentioned this in Section 4.4 of the revised version of the paper.

Finally, Nat said:

Just by reading the paper, I was not very clear what is the requirement for Issue-show unlinkability. By issuance, I imagine it means the credential issuance. I suppose then it means that the credential verifier (in ISO 29115 | ITU-T X.1254 sense) cannot tell which credential was used though it can attest that the user has a valid credential. Is that correct? If so, much of the technology in group 2 should have n/a in the column because they are independent of the actual authentication itself. They could very well use anonymous authentication or partially anonymous authentication (ISO 29191).

The technologies in group 2 are recursive authentication technologies. The relying party directs the browser to the identity or attribute provider, which recursively authenticates the user and provides a bearer credential to the relying party based on the result of the inner authentication. In all generality there may be multiple inner authentications, as the identity or attribute provider may require multiple credentials. So the authentication process may consist of a tree of nested authentications, with internal nodes of the tree involving group 2 technologies, and leaf nodes other technologies. However, rows 5-11 (group 2) are only concerned with the usual case where the user authenticates to the identity or attribute provider as a returning user with a user ID and a password or some other form of two-party authentication; we have now made that clear in Section 4.2 of the revised paper. In that case there is no issue-show unlinkability.

We have also made a couple of other improvements to the paper, motivated in part by the feedback:

  • We have replaced the word possession with the word ownership in the definition of closed-loop authentication (Section 2), so that it now reads: authentication is closed-loop when the credential authority that issues or registers a credential is later responsible for verifying ownership of the credential at authentication time. The motivation for this change is that, in group 2, the credential is the information that the identity or attribute provider has about the user, and is thus kept by the identity or attribute provider rather than by the user.
  • We have added a distinction between two forms of multishow unlinkability, a strong form that holds even if the credential authority colludes and shares information with the relying parties, and a weak form that holds only if there is no such collusion. The technologies in group 2 that provide multishow unlinkability provide the weak form, whereas Idemix anonymous credentials provide the strong form.

Comparing the Privacy Features of Eighteen Authentication Technologies

This blog post motivates and elaborates on the paper Privacy Postures of Authentication Technologies, which we presented at the recent ID360 conference.

There is a great variety of user authentication technologies, and some of them are very different from each other. Consider, for example, one-time passwords, OAuth, Idemix, and ICAM’s Backend Attribute Exchange: any two of them have little in common.

Different authentication technologies have been developed by different communities, which have created their own vocabularies to describe them. Furthermore, some of the technologies are extremely complex: U-Prove and Idemix are based on mathematical theories that may be impenetrable to non-specialists; and OpenID Connect, which is an extension of OAuth, adds seven specifications to a large number of OAuth specifications. As a result, it is difficult to compare authentication technologies to each other.

This is unfortunate because decision makers in corporations and governments need to decide what technologies or combinations of technologies should replace passwords, which have been rendered even more inadequate by the shift from traditional personal computers to smart phones and tablets. Decision makers need to evaluate and compare the security, usability, deployability, interoperability and, last but not least, privacy, provided by the very large number of very different authentication technologies that are competing in the marketplace of technology innovations.

But all these technologies are trying to do the same thing: authenticate the user. So it should be possible to develop a common conceptual framework that makes it possible to describe them in functional terms without getting lost in the details, to compare their features, and to evaluate their adequacy to different use cases.

The paper that we presented at the recent ID360 conference can be viewed as a step in that direction. It focuses on privacy, an aspect of authentication technology which I think is in need of particular attention. It surveys eighteen technologies, including: four flavors of passwords and one-time passwords; the old Microsoft Passport (of historical interest); the browser SSO profile of SAML; Shibboleth; OpenID; the ICAM profile of OpenID; OAuth; OpenID Connect; uncertified key pairs; public key certificates; structured certificates; Idemix pseudonyms; Idemix anonymous credentials; U-Prove tokens; and ICAM’s Backend Attribute Exchange.

The paper classifies the technologies along four different dimensions or facets, and builds a matrix indicating which of the technologies provide seven privacy features: unobservability by an identity or attribute provider; free choice of identity or attribute provider; anonymity; selective disclosure; issue-show unlinkability; multishow unlinkability by different parties; and multishow unlinkability by the same party. I will not try to recap the details here; instead I will elaborate on observations made in the paper regarding privacy enhancements that have been used to improve the privacy postures of some closed-loop authentication technologies.

Privacy Enhancements for Closed-Loop Authentication

One of the classification facets that the paper considers for authentication technologies is the distinction between closed-loop and open-loop authentication, which I discussed in an earlier post. Closed-loop authentication means that the credential authority that issues or registers a credential is later responsible for verifying possession of the credential at authentication time. Closed-loop authentication may involve two parties, or may use a third-party as a credential authority, which is usually referred to as an identity provider. Examples of third-party closed-loop authentication technologies include the browser SSO profile of SAML, Shibboleth, OpenID, OAuth, and OpenID Connect.

I’ve pointed out before that third-party closed-loop authentication lacks unobservability by the identity provider. Most third-party closed-loop authentication technologies also lack anonymity and multishow unlinkability. However, some of them implement privacy enhancements that provide anonymity and a form of multishow unlinkability. There are two such enhancements, suitable for two different use cases.

The first enhancement consists of omitting the user identifier that the identity provider usually conveys to the relying party. The credential authority is then an attribute provider rather than an identity provider: it conveys attributes that do not necessarily identify the user. This enhancement provides anonymity, and multishow unlinkability assuming no collusion between the attribute provider and the relying parties. It is useful when the purpose of authentication is to verify that the user is entitled to access a service without necessarily having an account with the service provider. This functionality is provided by Shibboleth, which can be used, e.g., to allow a student enrolled in one educational institution to access the library services of another institution without having an account at that other institution.

The core OpenID 2.0 specification specifies how an identity provider conveys an identifier to a relying party. Extensions of the protocol such as the Simple Registration Extension specify methods by which the identity provider can convey user attributes in addition to the user identifier; and the core specification hints that the identifier could be omitted when extensions are used. It would be interesting to know whether any OpenID server or client implementations allow the identifier to be omitted. Any comments?

The second enhancement consists of requiring the identity provider to convey different identifiers for the same user to different relying parties. The identity provider can meet the requirement without allocating large amounts of storage by computing a user identifier specific to a relying party as a cryptographic hash of a generic user identifier and an identifier of the relying party such as a URL. This privacy enhancement is required by the ICAM profile of OpenID. It achieves user anonymity and multishow unlinkability by different parties assuming no collusion between the identity provider and the relying parties; but not multishow unlinkability by the same party. It is useful for returning user authentication.