By Simon Rice, Group Manager for Technology
Getting the right functionality on a website often depends on coordinating a range of sources to pull in content. That can mean connections to many third-party sources such as social media, weather, adverts and news feeds. Developers are also able to link to code libraries such as jQuery, or to font definitions, hosted on third-party websites.
Problems can arise when the first-party website leaks personal data to third-party sites by mistake. A simple example of how this can happen is through the HTTP referer header. Each time a user sends a request for a webpage, the browser will normally send the URL of the webpage they were previously viewing. The same applies to content embedded in a page: the request for each embedded resource normally carries the URL of the page that embeds it.
For example, if a user visits:
Then the HTTP request sent by the browser will inform googleapis.com which webpage the user was viewing.
HTTP referer headers can be logged by websites as part of their efforts to understand how users navigate around their website, or to know where they have come from (search engine, social media, blog post etc).
This can pose a risk to privacy, as names, emails, unique identifiers and location data can be leaked unnecessarily to third parties through mistakes or poor coding practices in websites and mobile apps.
For example, data leakage can occur in the password reset process when a reset code is emailed to the user’s registered email address (eg http://www.example.com/password-reset?code=f0495318348888bd34f7e00ff8688b53c138d4f6c837faf58b4c9e217569ecf8). The user then needs to click the link and choose a new password.
If the password reset page includes third-party content then this code can be leaked via the HTTP referer. This means that the third-party site can receive a copy of the user’s password reset code. Even if the third-party site didn’t realise that the information was received, there is still a window of opportunity in which the password reset code could be hijacked and misused. This would be made more serious if the reset code takes a long time to expire or never expires at all.
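To illustrate how little effort it takes to recover such a code once it has been logged, here is a minimal sketch in Python. It is an illustration only: the function name extract_query_params and the logged value are hypothetical, and the URL is simply the example given above.

```python
from urllib.parse import urlsplit, parse_qs


def extract_query_params(referer: str) -> dict:
    """Return the query-string parameters carried in a referer URL."""
    return parse_qs(urlsplit(referer).query)


# Hypothetical referer value as it might appear in a third party's access log.
logged_referer = (
    "http://www.example.com/password-reset"
    "?code=f0495318348888bd34f7e00ff8688b53c138d4f6c837faf58b4c9e217569ecf8"
)

params = extract_query_params(logged_referer)
# parse_qs maps each parameter name to a list of values; take the first one.
leaked_code = params.get("code", [None])[0]
```

Anything placed in the query string of a page URL is equally easy to read for whoever receives the referer.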
What needs to be done?
Eagle-eyed readers will already have noticed that the password reset link in the example above does not use HTTPS. Providing a non-encrypted protocol for a password reset link such as this would not be good practice as it means the data can be intercepted by anyone monitoring the communication.
Using HTTPS would have an added benefit because the browser wouldn’t send the HTTP referer header to any non-HTTPS sites. But by itself it is not enough to completely stop the leakage of information. Whilst it is possible for users to configure their browsers not to send HTTP referers (eg by using browser add-ons or extensions), the best course of action would be for developers to avoid the use of such information in the URL, or to mitigate the risks when this is not possible.
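The rule described above can be sketched as a small function. This models only the single rule "omit the referer when moving from an HTTPS page to a plain HTTP resource"; real browsers apply fuller referrer policies that may behave differently, and the function name and URLs are hypothetical.

```python
from urllib.parse import urlsplit


def referer_sent_under_downgrade_rule(page_url: str, resource_url: str) -> bool:
    """Return True if the referer would be sent under the downgrade rule.

    Assumption: the only rule modelled is that the referer is withheld
    when an HTTPS page requests a plain HTTP resource.
    """
    page_scheme = urlsplit(page_url).scheme.lower()
    resource_scheme = urlsplit(resource_url).scheme.lower()
    is_downgrade = page_scheme == "https" and resource_scheme == "http"
    return not is_downgrade


reset_page = "https://www.example.com/password-reset?code=abc123"

# Withheld: HTTPS page loading a plain HTTP resource.
to_http = referer_sent_under_downgrade_rule(reset_page, "http://cdn.example.net/lib.js")
# Still sent: HTTPS page loading an HTTPS resource from a third party.
to_https = referer_sent_under_downgrade_rule(reset_page, "https://cdn.example.net/lib.js")
```

The second case shows why HTTPS alone is not enough: a third party that also serves its content over HTTPS would still receive the full URL under this rule.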
One option is to include the recently added referrer meta tag (yes, with 2 r’s this time). Pages which include <meta name="referrer" content="never"> will indicate to the browser never to send an HTTP referer header. (The value has since been standardised as no-referrer, with never kept as a legacy keyword.)
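One way to confirm that a page really carries this tag is to parse the HTML you serve and look for it. The following is a minimal sketch using Python's standard html.parser module; the class and function names are hypothetical, and it reports only what the markup says, not how a particular browser will act on it.

```python
from html.parser import HTMLParser


class ReferrerMetaFinder(HTMLParser):
    """Record the content of the last <meta name="referrer"> tag seen."""

    def __init__(self):
        super().__init__()
        self.policy = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for bare attributes, so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "referrer":
            self.policy = attr_map.get("content", "").strip().lower()


def referrer_meta_policy(html: str):
    """Return the declared referrer policy, or None if the page has none."""
    finder = ReferrerMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.policy


page = '<html><head><meta name="referrer" content="never"></head><body></body></html>'
declared = referrer_meta_policy(page)
```

A check like this could flag sensitive pages, such as the password reset page, that have been deployed without the tag.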
Another option would be to remove any third-party content from secure elements of your website, such as the password reset page.
There are other more complex options for stripping out referer headers, but you should always ensure that whatever method you use is rigorously tested before launch, and periodically afterwards, for performance, vulnerabilities and other issues. We recommend that a series of tests to highlight leaks of personal information is added to those that you already conduct.
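As one possible starting point for such a test, the sketch below lists the third-party hosts that a page would contact automatically when it loads. It is a simplified illustration under stated assumptions: only script, img, iframe and link tags are checked, any host that differs from the page's own host (including a subdomain) is treated as third party, and all names and URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Tags whose resources are typically fetched automatically, and the
# attribute holding the URL. Assumption: this short list is illustrative,
# not exhaustive.
AUTO_LOADED = {"script": "src", "img": "src", "iframe": "src", "link": "href"}


class ResourceCollector(HTMLParser):
    """Collect the hosts of resources referenced by a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        wanted = AUTO_LOADED.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative and protocol-relative URLs against the page.
                host = urlsplit(urljoin(self.page_url, value)).hostname
                if host:
                    self.hosts.add(host)


def third_party_hosts(page_url, html):
    """Return a sorted list of hosts other than the page's own host."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    collector.close()
    first_party = urlsplit(page_url).hostname
    return sorted(host for host in collector.hosts if host != first_party)


sample_html = """
<html><head>
<script src="https://cdn.example.net/lib.js"></script>
<link rel="stylesheet" href="//fonts.example.org/font.css">
</head><body><img src="/logo.png"></body></html>
"""
hosts = third_party_hosts("https://www.example.com/password-reset?code=abc123", sample_html)
```

A test for a sensitive page could then assert that this list is empty, or that it contains only hosts you have assessed and expect.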
Simon Rice is the Group Manager for the Technology team which provides technical expertise to all ICO departments in order to support the broad range of activities undertaken by the ICO.