A Survey of Existing and Proposed TLS Visibility Solutions

This is the fifth and last post of a series on providing visibility of TLS 1.3 traffic in the intranet. An index to the series and related materials can be found in the TLS Traffic Visibility page.

Update. This post has been updated in response to a clarification received from Nubeva. See the section on SKI below and the next blog post.

It is well known that TLS 1.3 has created a visibility problem for encrypted intranet traffic by removing the static RSA key exchange method. Except in PSK-only mode, TLS 1.3 traffic has forward secrecy protection and cannot be decrypted by a passive middlebox provisioned with a static private key. This is known as the PFS visibility problem, where PFS stands for “perfect” forward secrecy.

But there is no awareness yet of a second problem created by TLS 1.3 that makes it harder to solve the PFS visibility problem than is generally understood. I call it the multiple protection state problem.

TLS 1.2 has PFS cipher suites, and therefore it has its own PFS visibility problem. If a client insists on using a PFS cipher suite, a passive middlebox provisioned with a static private key won't be able to decrypt the traffic. Some existing TLS visibility solutions provide the middlebox with the symmetric keys used to protect the traffic, rather than with the private key used to perform the key exchange. Such solutions are being successfully deployed for decrypting TLS 1.2 traffic. But the multiple protection state problem means that those solutions are not applicable to TLS 1.3.

I realized this as I was working on a survey of TLS visibility solutions. The problem is described in the next section and the survey can be found in the following section.

The Multiple Protection State Problem of TLS 1.3

In the previous post I defined the concept of a protection state, which specifies how client-to-server or server-to-client TLS traffic is protected and how it can be deprotected. A protection state specifies a protection cipher, which may be an AEAD algorithm or a combination of an encryption algorithm and a message authentication method, and the keying material to be used with the protection cipher, which may consist of an encryption key and a MAC key, or an AEAD algorithm key and an IV used for the construction of the AEAD nonces. Details can be found in the Protection States section of the previous post.

In TLS 1.2 there is a single protection state for each direction of traffic. But in TLS 1.3 different protection states are used for different kinds of traffic. Early application data is protected with an AEAD algorithm specified by a cipher suite associated with the first PSK proposed by the client, and keying material derived from an early traffic secret. Some of the handshake messages are encrypted, and they are protected with a possibly different AEAD algorithm, specified by a cipher suite selected by the server, and keying material derived from the handshake traffic secrets. Non-early application data is protected initially with keying material derived from initial values of the application traffic secrets, but the client and the server may update their application traffic secrets at any time during the TLS session and derive new keying material from the updated secrets.

A passive middlebox cannot decrypt TLS 1.3 traffic with a single protection state for each direction of traffic. It needs different keying materials for different kinds of traffic. For non-early application data, it needs an indefinite sequence of sets of keying materials generated long after the handshake. It needs to know when to transition from one set of keying materials to another. And it may have to transition from an AEAD algorithm used for early data to a different AEAD algorithm used for handshake and non-early data; different algorithms are only used when the server rejects the early data, but rejected early data is of interest to intrusion detection systems and must be decrypted because it may help detect or identify an attack.

Therefore visibility solutions that decrypt PFS sessions by sending keying material described as a single “session key” or “symmetric key” to a decryptor will not work for TLS 1.3 even if they work for TLS 1.2.
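
To make the problem concrete, here is a minimal Python sketch of how RFC 8446 derives an AEAD key and IV from a traffic secret (section 7.3) and how an application traffic secret is updated (section 7.2). The function names and the placeholder secrets are illustrative assumptions, and the sketch assumes a SHA-256 cipher suite with a 16-byte key and a 12-byte IV. In a real session the secrets come out of the key schedule.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """HKDF-Expand as defined in RFC 5869."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: str, context: bytes, length: int) -> bytes:
    """HKDF-Expand-Label as defined in RFC 8446, section 7.1."""
    full_label = b"tls13 " + label.encode("ascii")
    info = (
        length.to_bytes(2, "big")
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, info, length)


def traffic_keys(traffic_secret: bytes, key_len: int = 16, iv_len: int = 12):
    """AEAD key and IV of one protection state (RFC 8446, section 7.3)."""
    key = hkdf_expand_label(traffic_secret, "key", b"", key_len)
    iv = hkdf_expand_label(traffic_secret, "iv", b"", iv_len)
    return key, iv


def next_application_traffic_secret(secret: bytes) -> bytes:
    """Updated application traffic secret (RFC 8446, section 7.2)."""
    return hkdf_expand_label(secret, "traffic upd", b"", len(secret))


# Placeholder secrets (assumption): stand-ins for the outputs of the key schedule.
early_secret = hashlib.sha256(b"placeholder early traffic secret").digest()
handshake_secret = hashlib.sha256(b"placeholder handshake traffic secret").digest()
app_secret = hashlib.sha256(b"placeholder application traffic secret").digest()

# One direction of one session already needs several protection states.
states = [
    ("early data", traffic_keys(early_secret)),
    ("handshake", traffic_keys(handshake_secret)),
    ("application, generation 0", traffic_keys(app_secret)),
]
for generation in (1, 2):  # each key update produces yet another state
    app_secret = next_application_traffic_secret(app_secret)
    states.append((f"application, generation {generation}", traffic_keys(app_secret)))

for kind, (key, iv) in states:
    print(f"{kind}: key={key.hex()} iv={iv.hex()}")
```

Every line printed corresponds to a different key and IV, and the number of application generations is not known when the handshake completes, which is why a single exported key cannot suffice.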

Survey of Solutions for Providing Visibility of TLS 1.2 and/or TLS 1.3 Traffic in the Intranet

The following table lists TLS visibility solutions that are being used today, have been announced, or have been proposed in various forums. The last three rows of the table refer to the three variants of the Pomcor visibility solution proposed in earlier posts of this series. A brief description of each solution is provided below.

TLS visibility solutions
	PFS	TLS 1.2	TLS 1.3
MITM Proxy	X	X	X
Static (EC)DH Key			X
Key Rotation			X
Key Retention			X
Session Key Forwarding	X	X
Symmetric Key Intercept	X	X	X
RHRD			X
SD Variant	X		X
ST Variant	X		X
2V Variant	X	X	X

Decryption by a Man-In-The-Middle Proxy

An effective and conceptually simple visibility solution consists of using an active visibility middlebox as a man-in-the-middle proxy to terminate an incoming connection from the client and inspect the plaintext before forwarding it to the destination server over a second TLS connection. This solution is also called “break and inspect”, “TLS interception”, or “TLS splitting”. To decrypt North-South traffic the role of visibility middlebox may be played by a firewall or internet gateway. To decrypt East-West traffic, the middlebox may be, for example, a Kubernetes sidecar, as suggested at a workshop on intranet traffic visibility hosted by the Center for Cybersecurity Policy and Law.

In a presentation at RSA 2020 Jesse Rothstein of ExtraHop pointed out a few drawbacks of this solution: it does not support client certificates, it requires a local CA, and some proxies are known to have vulnerable implementations of client-side TLS.

Use of a Static DH or ECDH Private Key

In the key exchange modes of TLS 1.3 other than PSK-only, the server uses an ephemeral DH or ECDH private key. However, nothing prevents the server from cheating and using a static key instead. The static key can be provisioned to the server and a visibility middlebox, and the middlebox can use it in conjunction with the client's public key to compute the TLS shared secret, derive the traffic keys, and decrypt the traffic. Of course, there is then no PFS.
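
The computation performed by the middlebox can be sketched as follows. This is a toy finite-field Diffie-Hellman example in Python: the modulus, the generator and the function names are assumptions chosen for brevity, and the group is far too small for real use. A real deployment would use one of the (EC)DH groups negotiated in TLS 1.3.

```python
import secrets

# Toy group parameters (assumption): a 127-bit Mersenne prime, NOT secure.
P = 2**127 - 1
G = 3


def keypair():
    """Return a (private, public) DH key pair in the toy group."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def shared_secret(private: int, peer_public: int) -> int:
    """Classic DH: raise the peer's public key to our private key."""
    return pow(peer_public, private, P)


# The static key pair is generated once and provisioned to both
# the server and the visibility middlebox.
static_private, static_public = keypair()

for session in range(2):
    # The client uses a fresh ephemeral key pair in every session.
    client_private, client_public = keypair()

    # The client combines its private key with the server's public key share.
    client_view = shared_secret(client_private, static_public)

    # The middlebox sees client_public on the wire and already holds the
    # static private key, so it performs the same computation as the server.
    middlebox_view = shared_secret(static_private, client_public)

    print(f"session {session}: middlebox matches client: {middlebox_view == client_view}")
```

Because the same static private key unlocks every session, anyone who later obtains it can decrypt recorded traffic, which is the loss of PFS noted above.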

This solution has been formalized by the European Telecommunications Standards Institute (ETSI) in the Middlebox Security Protocol, which requires the server to confess in its certificate that it is cheating.

Frequent Key Rotation

This solution, like the ETSI solution, uses a static DH or ECDH private key that is provisioned to the server and a visibility middlebox. It differs from the ETSI solution in that the key is rotated on a frequent schedule. A kind of imperfect forward secrecy is provided by the fact that it becomes impossible to decrypt the traffic once the key has been rotated out and deleted.

This solution was discussed at the above-mentioned workshop held by the Center for Cybersecurity Policy and Law, and later presented by Paul Barrett of Enterprise Netscout Systems at a NIST workshop on TLS 1.3 traffic visibility.

Key Retention

This solution was discussed in the workshop held by the Center for Cybersecurity Policy and Law and is described as follows in the workshop report: “Key retention systems allow the key material generated in forward secrecy schemes such as Ephemeral Diffie-Hellman (DHE) to be stored. The key retention system can retain the session key material for a short time, e.g., to support real-time decryption, or for a longer period to allow post-capture decryption at a later date.”

Like the two previous solutions, this solution provides a form of imperfect forward secrecy. However, the multiple protection state problem discussed above means that it does not work for TLS 1.3.

Session Key Forwarding

This is another solution that works with TLS 1.2 but cannot work with TLS 1.3 due to the multiple protection state problem. It was described by Jesse Rothstein of ExtraHop in the same RSA 2020 presentation where he discussed the MITM Proxy solution. It uses a “lightweight agent” integrated with the TLS server that obtains “the session key” and forwards it to a passive visibility middlebox, referred to as a monitoring device or appliance in the presentation.

But what is meant by “session key”? In TLS 1.2 the term could refer to the key block, which contains the encryption key and the MAC key for each direction of traffic if the session's cipher suite specifies an encryption algorithm, or the AEAD key and an IV used for the construction of the AEAD nonces if the session's cipher suite specifies an AEAD algorithm. But as discussed above in connection with the multiple protection state problem, in TLS 1.3 there are different traffic keys for different kinds of traffic, and the client and the server may generate new traffic keys for non-early application data long after the handshake. There is thus no concept in TLS 1.3 that the term “session key” of the Session Key Forwarding solution could refer to. Furthermore, forwarding traffic keys to a decryptor is not enough to decrypt TLS 1.3 traffic. The decryptor has to know what AEAD algorithm to use for each kind of traffic, and when to transition from one algorithm to another and from one set of traffic keys to another.
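
For comparison, the following Python sketch shows how the TLS 1.2 key block is computed and partitioned according to RFC 5246 (sections 5 and 6.3). It assumes a cipher suite whose PRF uses SHA-256, and the master secret and random values are placeholders. The point is that one key block, computed once, holds everything a decryptor needs for both directions of a TLS 1.2 session.

```python
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """P_hash data expansion function from RFC 5246, section 5."""
    out, a = b"", seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:length]


def key_block(master_secret: bytes, server_random: bytes, client_random: bytes,
              mac_len: int, key_len: int, iv_len: int) -> dict:
    """Compute the key block and split it in the order given in RFC 5246, section 6.3."""
    total = 2 * (mac_len + key_len + iv_len)
    seed = b"key expansion" + server_random + client_random
    block = p_hash(master_secret, seed, total)
    layout = [
        ("client_write_MAC_key", mac_len),
        ("server_write_MAC_key", mac_len),
        ("client_write_key", key_len),
        ("server_write_key", key_len),
        ("client_write_IV", iv_len),
        ("server_write_IV", iv_len),
    ]
    parts, offset = {}, 0
    for name, size in layout:
        parts[name] = block[offset:offset + size]
        offset += size
    return parts


# Placeholder inputs (assumption): real values come from the handshake.
master_secret = bytes(48)
server_random = bytes([1]) * 32
client_random = bytes([2]) * 32

# AEAD suite such as AES-128-GCM: no MAC keys, 16-byte keys, 4-byte implicit IVs.
aead_block = key_block(master_secret, server_random, client_random, 0, 16, 4)

# CBC suite with HMAC-SHA256: 32-byte MAC keys, 16-byte keys, no implicit IVs.
cbc_block = key_block(master_secret, server_random, client_random, 32, 16, 0)

for name, value in aead_block.items():
    print(name, value.hex() or "(empty)")
```

TLS 1.3 has no counterpart to this single block, so there is nothing equivalent that an agent could forward once per session.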

The Session Key Forwarding solution has been implemented by ExtraHop and is being deployed at Fiserv, a large global provider of financial services technology. Rothstein's RSA talk was a joint presentation with Joshua Northrup of Fiserv, who reported that the ongoing deployment is already decrypting six thousand PFS sessions per second. However, the Fiserv deployment is only handling TLS 1.2 sessions at this time. This can be seen from the fact that, according to Rothstein, the project started two years before the presentation, i.e. six months before the publication of RFC 8446, and from the fact that a list of supported cipher suites dated January 15, 2021 only includes TLS 1.2 cipher suites. And it will not be able to handle TLS 1.3 sessions in the future without a radical change in technology, because of the multiple protection state problem of TLS 1.3.

Symmetric Key Intercept (SKI)

Nubeva has sent a response to this post explaining that what they extract and send to the decryptor are traffic secrets rather than symmetric keys. This is discussed in the next post.

This is a third solution that works with TLS 1.2 but cannot work with TLS 1.3 due to the multiple protection state problem. It is described in a presentation made by Steve Perkins of Nubeva at the above-mentioned NIST workshop. In this solution a “symmetric key” is extracted from the memory of the TLS client and/or the TLS server and sent to a decryptor. Here is how the extraction is described in slide 7 of the presentation: “HOW IT WORKS: Scan’s System To Discover TLS Processes; Loads Signatures with Discovery Algorithms; Triggers on “Client Hello”; Knows where Symmetric Key will be Set; Copies and Exports It;”.

But just as it is unclear in the Session Key Forwarding solution what the forwarded “session key” could be in TLS 1.3, it is unclear in the Symmetric Key Intercept solution what the intercepted “symmetric key” could be, since in TLS 1.3 there are different traffic keys for different kinds of traffic. Furthermore, in TLS 1.3 the client and the server may generate new traffic keys for non-early application data at any time after sending their Finished messages, long after the Client Hello that is supposed to trigger the key extraction process. And to decrypt TLS 1.3 traffic the decryptor would have to know what AEAD algorithm to use for each kind of traffic, and when to transition from one algorithm to another and from one set of traffic keys to another. So the extraction process described in the presentation cannot work for TLS 1.3.

RHRD (“TLS Visibility Extension”)

In April 2018, before the publication of RFC 8446, R. Housley and R. Droms published an internet draft, informally called RHRD after the initials of the authors, proposing a TLS Visibility Extension. The draft did not progress to RFC status, but I discuss it here because it has interesting similarities to, and differences from, the ST Variant of the Pomcor visibility solution introduced in the third post of this series and briefly described below.

In the RHRD solution an (EC)DH key pair called SSWrapDH1 is generated by an enterprise key manager, which provisions the public key to the server and the private key to any number of authorized decryptors. The client includes an empty TLS Visibility Extension in the ClientHello message, signaling that it is willing to allow authorized decryption. The server generates an ephemeral SSWrapDH2 key pair, computes a shared secret from the private key component of SSWrapDH2 and the public key component of SSWrapDH1, derives an encryption key Ke from the shared secret, encrypts the session's early secret and handshake secret (which RHRD refers to as “the session secrets”) under Ke, and includes the public key component of SSWrapDH2 and the encrypted session secrets in the ServerHello message for the benefit of the decryptors. The decryptors are then able to compute the shared secret, derive the key Ke, decrypt the session secrets, derive the traffic keys from the session secrets, and decrypt the traffic.

The main differences between the RHRD solution and the ST Variant are as follows:

Although the SSWrapDH2 key pair is ephemeral, the SSWrapDH1 key pair is static, so RHRD does not preserve PFS. By contrast the ST Variant, like the other variants of the Pomcor visibility solution, does preserve PFS.
RHRD encrypts what it calls the “session secrets” and transmits them to the decryptors, whereas ST encrypts what RFC 8446 calls “the traffic secrets” and transmits them to the visibility middlebox. As shown in the key schedule, the traffic secrets are derived from RHRD's session secrets and transcript hashes of handshake messages, which the middlebox does not have to compute in the ST Variant.
In RHRD the public key component of the SSWrapDH2 key pair and the encrypted “session secrets” are transmitted within the ServerHello message that the server sends to the client, even though they are intended for the decryptors rather than the client. By contrast, in ST and the other variants of the Pomcor visibility solution, all communications between the server and the middlebox take place over a TCP connection separate from the one used for the TLS connection, and possibly on a different wire.
In RHRD the client is aware of the TLS Visibility Extension and has to opt in, whereas the Pomcor visibility solution is a private matter between the server and the middlebox.

SD Variant of the Pomcor TLS Visibility Solution

The SD Variant was proposed in the first post of this series, for the (EC)DHE key exchange mode of TLS 1.3. It was described in more detail in the second post, and extended to all three key exchange modes of TLS 1.3 in the third post. The extension to the PSK-only and PSK + (EC)DHE modes assumes that the middlebox is provisioned with the PSK or has derived the PSK from the resumption master secret of an earlier session.

The SD Variant is based on the establishment of a Visibility Shared Secret (VSS) between the TLS server and a passive middlebox by means of a DH or ECDH key exchange over a long term TCP connection, using ephemeral key pairs. The VSS can be precomputed as described in the fourth post. The bits of the VSS are used as pseudo-random bits by the server and the middlebox to independently generate the ephemeral key pair that the server uses for the TLS 1.3 key exchange. (The middlebox only needs to generate the private key component of the key pair.) This allows the middlebox to compute the TLS shared secret on its own, derive the traffic secrets from the TLS shared secret, and derive the traffic keys from the traffic secrets.

Since the middlebox knows the PSK (if used) and the TLS shared secret, it has all the secret information needed to derive the AEAD key and IV to be used for decrypting each kind of traffic, including the initial values and the subsequently updated values of the AEAD key and IV used for decrypting the non-early data.
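
The core idea of the SD Variant can be illustrated with the following Python sketch. It is not the construction specified in the earlier posts: the way the private key is derived from the VSS (a single HMAC call with a made-up label), the toy Diffie-Hellman group, and all names are assumptions made for illustration only. A random byte string stands in for a VSS that would really be agreed by a DH or ECDH exchange between the server and the middlebox.

```python
import hmac
import secrets

# Toy group parameters (assumption): a 127-bit Mersenne prime, NOT secure.
P = 2**127 - 1
G = 3


def private_key_from_vss(vss: bytes) -> int:
    """Deterministically map the VSS to a private key in the toy group.

    Hypothetical derivation: the label and the use of HMAC-SHA256 are
    placeholders, not the method defined in the series.
    """
    stream = hmac.new(vss, b"hypothetical label: TLS ephemeral key", "sha256").digest()
    return int.from_bytes(stream, "big") % (P - 3) + 2


# Stand-in for the Visibility Shared Secret known to server and middlebox.
vss = secrets.token_bytes(32)

# Server: derives its TLS "ephemeral" key pair from the VSS.
server_private = private_key_from_vss(vss)
server_public = pow(G, server_private, P)

# Middlebox: independently derives the same private key; it does not
# need the public key component.
middlebox_private = private_key_from_vss(vss)

# Client: ordinary ephemeral key pair, unaware of the middlebox.
client_private = secrets.randbelow(P - 3) + 2
client_public = pow(G, client_private, P)

# All three parties arrive at the same TLS shared secret.
tls_secret_server = pow(client_public, server_private, P)
tls_secret_client = pow(server_public, client_private, P)
tls_secret_middlebox = pow(client_public, middlebox_private, P)

print(tls_secret_server == tls_secret_client == tls_secret_middlebox)
```

Since a fresh VSS yields a fresh key pair, the TLS key exchange remains ephemeral from the client's point of view, and the middlebox never needs a long-term decryption key.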

ST Variant of the Pomcor TLS Visibility Solution

The ST Variant was proposed in the third post of this series. It is applicable to all three key exchange modes of TLS 1.3. Like the SD Variant, it is based on the establishment of a Visibility Shared Secret (VSS), which can be precomputed, between the TLS server and a passive middlebox. It differs from the SD Variant in that the bits of the VSS are used to derive an AEAD key and AEAD nonces rather than the TLS ephemeral key pair. The AEAD key and nonces are used to encrypt the traffic secrets, which the server transmits to the middlebox over the same long term TCP connection used to agree on the VSS, and which the middlebox uses to generate the traffic keys. If there is early data, the server may or may not accept it. However, even if the server intends to reject the early data, it sends the early traffic secret to the middlebox, so that the middlebox can derive the early traffic keys.

The middlebox does not know the PSK (if used), nor the TLS shared secret. However, since it knows the traffic secrets, it is able to derive the AEAD key and IV needed to decrypt each kind of traffic. The initial values of the AEAD key and IV for the non-early data are derived from the initial values of the application traffic secrets, and updated values are derived from updated secrets. The middlebox retains the application traffic secrets after deriving the initial values, so that it can subsequently update the secrets and derive updated values.

The ST Variant has similarities with the RHRD solution. Differences between the ST Variant and the RHRD solution are described in detail above.

2V Variant of the Pomcor TLS Visibility Solution

The 2V Variant was proposed in the fourth post of this series. Like the SD and ST Variants, it is based on the establishment of a Visibility Shared Secret (VSS), which can be precomputed, between the TLS server and a passive middlebox. As in the ST Variant, the bits of the VSS are used to derive an AEAD key and AEAD nonces. But the AEAD key and nonces are used to encrypt protection states that the server sends to the middlebox, instead of traffic secrets. Whereas traffic secrets are specific to TLS 1.3, protection states are a generic concept applicable to any version of TLS. This means that a 2V middlebox can support both current versions of TLS, 1.2 and 1.3, and should be able to support future versions of TLS that do not change the record layer.

In the 2V Variant the middlebox does not need to parse the handshake. It receives protection states from the server and passes them to two deprotection processes, one for each direction of traffic. Each deprotection process decrypts TLS records as specified by a protection state until decryption fails, then it transitions to the next protection state received from the server. A 2V middlebox is thus not affected by the multiple protection state problem even though it operates at the level of the record layer and is handshake-agnostic.
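
The transition logic of a deprotection process can be sketched in Python as follows. Everything here is an illustrative assumption: the class and function names are invented, and the record protection is a toy construction (a SHA-256 keystream plus an HMAC tag) standing in for a real TLS cipher, because the Python standard library has no AEAD. A real protection state would also name the protection cipher to be used. The sketch only shows the "decrypt until failure, then move to the next state" behavior.

```python
import hashlib
import hmac
from collections import deque
from dataclasses import dataclass
from typing import Optional

TAG_LEN = 32  # length of the HMAC-SHA256 tag in the toy cipher


@dataclass
class ProtectionState:
    key: bytes
    iv: bytes


def _keystream(state: ProtectionState, seq: int, length: int) -> bytes:
    """Toy keystream derived from key, IV, sequence number and a counter."""
    out, counter = b"", 0
    while len(out) < length:
        block_input = state.key + state.iv + seq.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block_input).digest()
        counter += 1
    return out[:length]


def protect(state: ProtectionState, seq: int, plaintext: bytes) -> bytes:
    """Toy record protection: XOR with the keystream, then append a tag."""
    ct = bytes(a ^ b for a, b in zip(plaintext, _keystream(state, seq, len(plaintext))))
    tag = hmac.new(state.key, seq.to_bytes(8, "big") + ct, "sha256").digest()
    return ct + tag


def deprotect(state: ProtectionState, seq: int, record: bytes) -> Optional[bytes]:
    """Return the plaintext, or None if the tag does not verify."""
    ct, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    expected = hmac.new(state.key, seq.to_bytes(8, "big") + ct, "sha256").digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(a ^ b for a, b in zip(ct, _keystream(state, seq, len(ct))))


class DeprotectionProcess:
    """Handles one direction of traffic for one TLS connection."""

    def __init__(self) -> None:
        self.pending = deque()  # protection states received from the server
        self.current = None
        self.seq = 0

    def add_state(self, state: ProtectionState) -> None:
        self.pending.append(state)

    def handle_record(self, record: bytes) -> Optional[bytes]:
        # First try the protection state currently in use.
        if self.current is not None:
            plaintext = deprotect(self.current, self.seq, record)
            if plaintext is not None:
                self.seq += 1
                return plaintext
        # Deprotection failed (or no state yet): move to the next state,
        # restart the sequence number, and retry the same record once.
        if self.pending:
            self.current = self.pending.popleft()
            self.seq = 0
            plaintext = deprotect(self.current, self.seq, record)
            if plaintext is not None:
                self.seq += 1
                return plaintext
        return None


# Usage: the sender switches from one state to another mid-stream.
handshake_state = ProtectionState(key=b"H" * 16, iv=b"h" * 12)
application_state = ProtectionState(key=b"A" * 16, iv=b"a" * 12)
records = [
    protect(handshake_state, 0, b"encrypted handshake message"),
    protect(application_state, 0, b"application data 1"),
    protect(application_state, 1, b"application data 2"),
]

process = DeprotectionProcess()
process.add_state(handshake_state)
process.add_state(application_state)
for record in records:
    print(process.handle_record(record))
```

The process never inspects message types to decide when to switch. A failed integrity check on the current state is the only signal it needs. The sketch does not handle corrupted records or states that are never used, which a real implementation would have to consider.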