Recently the World Wide Web Consortium (W3C) launched a Tracking Protection Working Group, following several proposals for Do-Not-Track mechanisms, and more specifically in response to a W3C-member submission by Microsoft. A useful list of links to proposals and discussions related to Do-Not-Track can be found on the working group’s home page.
The Microsoft submission was concerned with tracking by third-party content embedded in a Web page, whether through cookies or through other means of providing information to the third party. It proposed a Do-Not-Track setting in the browser, to be sent to Web sites in an HTTP header and made available to JavaScript code as a DOM property. It also proposed a mechanism allowing the user to specify a white list of third-party content that the browser would allow in a Web page and/or a black list of third-party content that the browser would block. The browser would filter the requests made by a Web page for downloading third-party content, allowing some and rejecting others.
(The specific filtering mechanism proposed by Microsoft would allow third-party content that is neither in the white list nor in the black list. This would be ineffective, since the third party could periodically change the domain name it uses to avoid being blacklisted. I trust that the W3C working group will come up with a more effective filtering mechanism.)
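The weakness is easy to see in a minimal sketch of such a filter. This is only my reading of the proposal, not code from the submission: the function name, the `default_allow` parameter, the list contents and all domain names are hypothetical.

```python
from urllib.parse import urlsplit

def allow_third_party(url, white_list, black_list, default_allow=True):
    """Decide whether the browser should fetch third-party content at `url`."""
    host = urlsplit(url).hostname or ""
    if host in white_list:
        return True
    if host in black_list:
        return False
    # Content on neither list: the outcome depends on the default policy.
    return default_allow

# Hypothetical lists configured by the user.
white_list = {"cdn.example.org"}
black_list = {"tracker.example.com"}

# The blacklisted domain is blocked.
blocked = allow_third_party("https://tracker.example.com/pixel.gif",
                            white_list, black_list)

# The same tracker under a fresh domain name is on neither list,
# so a default-allow filter lets it through.
evaded = allow_third_party("https://tracker2.example.net/pixel.gif",
                           white_list, black_list)

# With a default-block policy the fresh domain is rejected as well.
strict = allow_third_party("https://tracker2.example.net/pixel.gif",
                           white_list, black_list, default_allow=False)

print(blocked, evaded, strict)
```

With a default-allow policy the black list has to chase every new domain name, whereas a default-block policy only requires the white list to be kept up to date.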
A Do-Not-Track setting and a filtering mechanism are good ideas, but they only deal with the traditional way of tracking a user. Today there is another way of tracking a user, which can be used whenever the user logs in to a Web site with authentication provided by a third party, such as Facebook, Google or Yahoo.
Third-party login uses a double-redirection protocol. When the user wants to log in to a Web site, the user’s browser is redirected to a third party, which plays the role of “identity provider.” The identity provider authenticates the user and redirects the browser back to the Web site, which plays the role of “relying party.” The identity provider is told who each relying party is, and can therefore track the user without any need for cookies. The identity provider can link the user’s logins to relying parties to the information in the user’s account at the identity provider, which in the case of Facebook includes the user’s real name and much other real identity information.
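The following toy simulation illustrates how the double redirection itself hands the identity provider a record of where the user logs in. It does not implement any real protocol: the endpoint URL, the `return_to` and `user` parameter names, the class and the site names are all assumptions made for the sake of the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

IDP_LOGIN = "https://idp.example/login"  # hypothetical identity provider endpoint

def relying_party_redirect(return_to):
    """First redirection: the relying party sends the browser to the identity provider."""
    return IDP_LOGIN + "?" + urlencode({"return_to": return_to})

class IdentityProvider:
    def __init__(self):
        self.login_log = []  # (account, relying party host) pairs

    def handle_login(self, request_url, account):
        """Second redirection: after authentication, send the browser back."""
        return_to = parse_qs(urlsplit(request_url).query)["return_to"][0]
        # The return address reveals the relying party; no cookie is involved.
        self.login_log.append((account, urlsplit(return_to).hostname))
        return return_to + "?" + urlencode({"user": account})

idp = IdentityProvider()
for site in ("https://news.example/cb", "https://clinic.example/cb"):
    idp.handle_login(relying_party_redirect(site), account="alice")

print(idp.login_log)
```

After the two logins, the identity provider's log associates the account `alice` with both `news.example` and `clinic.example`, even though neither site embedded any third-party content.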
Privacy-enhancing technologies, which I discussed in a recent series of blog posts (starting with the one on U-Prove), may eventually make it possible to log in with a third-party credential without the identity provider being able to track the user; but in the meantime, some means of protection against tracking via third-party login must be found. The W3C Tracking Protection working group could provide such protection by broadening the scope of the Do-Not-Track setting so that it would apply to both the traditional method of tracking via embedded content and the new method of tracking via third-party login. An identity provider that receives a Do-Not-Track header while participating in a double-redirection protocol would be required to forget the transaction after authenticating the user.
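On the identity provider's side, honoring the setting could look roughly like the sketch below. I am assuming that the header is named `DNT` and carries the value `1` when the user refuses tracking; the class, method and site names are hypothetical, and a real identity provider would of course have to purge server logs and other records as well.

```python
class IdentityProvider:
    def __init__(self):
        self.transactions = []  # retained (account, relying party) records

    def authenticate(self, headers, account, relying_party):
        """Authenticate the user; keep a record only if tracking is not refused."""
        # HTTP header names are case-insensitive, so normalize them first.
        normalized = {name.lower(): value.strip() for name, value in headers.items()}
        do_not_track = normalized.get("dnt") == "1"  # assumed header name and value
        if not do_not_track:
            self.transactions.append((account, relying_party))
        # Authentication itself is unaffected by the setting.
        return True

idp = IdentityProvider()
idp.authenticate({"DNT": "1"}, "alice", "clinic.example")  # forgotten
idp.authenticate({}, "alice", "news.example")              # retained
print(idp.transactions)
```

Only the login made without the header leaves a trace, while both logins succeed.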
The scope of the filtering mechanism could also be broadened so that it would apply to redirection requests in addition to third-party content embedding. This could mitigate a security weakness that affects third-party login protocols such as OpenID and OAuth. Such protocols are highly vulnerable to a phishing attack that captures the user’s password for an identity provider: the attacker sets up a malicious relying party that redirects the browser to a site masquerading as the identity provider. A filtering mechanism that would block redirection by default could prevent the attack based on the fact that the site masquerading as the identity provider would not be whitelisted (while the legitimate identity provider would be).
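A redirection filter of this kind might be sketched as follows, assuming that the white list holds exact host names of identity providers the user trusts. The function name and all host names are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical white list of identity providers chosen by the user.
TRUSTED_IDPS = {"accounts.idp.example"}

def allow_redirect(target_url, white_list=TRUSTED_IDPS):
    """Block redirection by default; allow it only to a whitelisted host."""
    host = urlsplit(target_url).hostname
    # Exact host match, so a look-alike name that merely contains
    # the trusted name is still rejected.
    return host in white_list

legitimate = allow_redirect("https://accounts.idp.example/login")
masquerade = allow_redirect("https://accounts.idp.example.evil.test/login")
print(legitimate, masquerade)
```

The malicious relying party can redirect only to hosts the user has already whitelisted, so the browser never reaches the site that masquerades as the identity provider.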