Release: Consumer Price Indices, Trial consumer price indices using web scraped data

Released: 08 June 2015

Contact

Ainslie Woods

Prices, ONS

cpi@ons.gsi.gov.uk

Telephone: +44 (0)1633 456900

Categories: Economy, Prices, Output and Productivity, Price Indices and Inflation, Consumer Price Indices, Consumer Prices Index, Retail Prices Index

Frequency of release: Monthly

Language: English

Geographical coverage: UK

Geographical breakdown: Country

In this release

Trial consumer price indices using web scraped data (611.7 Kb ZIP)

Overview

In early 2014 the Office for National Statistics (ONS) Big Data team started developing automated tools to collect prices from retailers' websites. The tools read the underlying HTML code and use its structure to identify price information, a technique known as web scraping. Since April 2014 the team has scraped daily prices from three large supermarkets. They have collected over 1.5 million price quotes for 35 grocery products in 11 months, providing a wide breadth of accessible price information. The ONS Prices division is currently exploring alternative data sources such as this to better understand their potential for improving consumer price indices.
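As an illustration of how a tool can use HTML structure to pick out prices, the following is a minimal sketch using only the Python standard library. It is not the ONS tooling: the sample markup, the `name` and `price` class names, and the `PriceParser` and `scrape_prices` helpers are all hypothetical, and a real retailer's pages will be structured differently.

```python
from html.parser import HTMLParser

# Hypothetical product-listing fragment; real retailer markup will differ.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">White bread 800g</span><span class="price">£1.00</span></li>
  <li class="product"><span class="name">Semi-skimmed milk 2 pints</span><span class="price">£0.89</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collects (product name, price) pairs from spans with assumed class names."""

    def __init__(self):
        super().__init__()
        self._field = None   # which field the next text node belongs to, if any
        self._current = {}   # partially collected quote
        self.quotes = []     # completed (name, price) tuples

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field is None:
            return  # text outside the spans we care about
        self._current[self._field] = data.strip()
        self._field = None
        if "name" in self._current and "price" in self._current:
            # Assumes prices are written as a pound sign followed by a decimal number.
            price = float(self._current["price"].lstrip("£"))
            self.quotes.append((self._current["name"], price))
            self._current = {}


def scrape_prices(html):
    """Return a list of (name, price) tuples found in the given HTML text."""
    parser = PriceParser()
    parser.feed(html)
    return parser.quotes


quotes = scrape_prices(SAMPLE_HTML)
```

Run daily against each retailer's pages, a collector of this kind would build up a dated series of price quotes per product.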

This publication presents ONS's first research into constructing price indices from web scraped data. We consider two ways of combining high frequency price data: daily bilateral chained indices, and unit price fixed base indices produced at different frequencies. We have also used the web scraped data to produce an index that follows the Consumer Prices Index (CPI) methodology as closely as possible, together with a special aggregate version of the CPI, so that direct comparisons can be made between the two.

None of these indices is intended to represent a true picture of inflation. They are, instead, intended to offer an insight into the collection method, the issues that can arise when constructing indices from high frequency data, and the potential benefits that large data sources of this nature can offer. Much work remains to be done to develop a robust methodology, as well as robust cleaning processes for web scraped data. We present this research to invite comments on the data and methodology, and to gauge interest in high frequency price data.
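The two ways of combining daily prices can be sketched roughly as follows. This is an illustration under stated assumptions, not the methodology used in the release: it assumes that each bilateral comparison is a geometric mean of matched price relatives, and that a product's unit price for a period is the arithmetic mean of its daily quotes in that period. All function names are hypothetical.

```python
from math import exp, log
from statistics import mean


def geo_mean_relative(base, current):
    """Geometric mean of price relatives for products quoted in both periods.

    Assumption: this elementary formula is chosen for illustration only.
    """
    common = sorted(base.keys() & current.keys())
    if not common:
        raise ValueError("no matched products between the two periods")
    return exp(mean(log(current[k] / base[k]) for k in common))


def chained_daily_index(daily_prices):
    """Chain day-on-day bilateral comparisons, starting from 100 on the first day."""
    index = [100.0]
    for prev, cur in zip(daily_prices, daily_prices[1:]):
        index.append(index[-1] * geo_mean_relative(prev, cur))
    return index


def unit_prices(daily_prices, period_length):
    """Average each product's daily quotes over consecutive blocks of days.

    Assumption: with no quantity data, the unit price is a simple mean of quotes.
    """
    periods = []
    for start in range(0, len(daily_prices), period_length):
        chunk = daily_prices[start:start + period_length]
        products = set().union(*chunk)
        periods.append(
            {k: mean(day[k] for day in chunk if k in day) for k in products}
        )
    return periods


def fixed_base_unit_price_index(daily_prices, period_length):
    """Compare each period's unit prices directly with the first period (= 100)."""
    periods = unit_prices(daily_prices, period_length)
    return [100.0 * geo_mean_relative(periods[0], p) for p in periods]


# Made-up prices for two products over four days.
days = [
    {"a": 1.0, "b": 2.0},
    {"a": 1.1, "b": 2.0},
    {"a": 1.0, "b": 2.0},
    {"a": 1.0, "b": 2.2},
]
chained = chained_daily_index(days)
two_day = fixed_base_unit_price_index(days, period_length=2)
```

Changing `period_length` (for example 7 for weekly or about 30 for monthly) gives fixed base indices at different frequencies from the same daily quotes, whereas the chained index always links consecutive days.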

Content from the Office for National Statistics.
© Crown Copyright applies unless otherwise stated.