The processing cycle and HES data quality

HES data comes from the routine exchanges of information between providers and commissioners of healthcare for NHS patients in England. Healthcare providers collect administrative and clinical information locally to support the care of the patient. The data is submitted to the Secondary Uses Service (SUS), which, as well as making it available to the commissioners, also copies the information to a database.

At pre-arranged dates during the year, SUS takes an extract from its database and sends it to HES. We then validate and clean the extract before deriving new items and making the information available in the data warehouse. Data quality reports and checks are completed at various stages of the cleaning and processing cycle.
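As a rough illustration of the cycle described above, the following Python sketch chains validate, clean and derive stages over an extract and records a simple record count after each stage. All function names, field names and the individual checks are hypothetical placeholders chosen for illustration; they are not the HES implementation.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def validate(records: List[Record]) -> List[Record]:
    # Hypothetical validation: keep only records that carry a provider code.
    return [r for r in records if r.get("provider_code")]


def clean(records: List[Record]) -> List[Record]:
    # Hypothetical cleaning: trim stray whitespace from string fields.
    return [
        {k: v.strip() if isinstance(v, str) else v for k, v in r.items()}
        for r in records
    ]


def derive(records: List[Record]) -> List[Record]:
    # Hypothetical derivation: add a new item computed from a submitted one.
    return [{**r, "has_postcode": bool(r.get("postcode"))} for r in records]


def run_cycle(extract: List[Record], stages: List[Stage]) -> tuple:
    """Run each stage in order and note the record count after every stage."""
    report = [("extract", len(extract))]
    records = extract
    for stage in stages:
        records = stage(records)
        report.append((stage.__name__, len(records)))
    return records, report


# Invented sample extract; the values carry no real meaning.
extract = [
    {"provider_code": " AAA ", "postcode": "ZZ1 1AA"},
    {"provider_code": "", "postcode": ""},
]
warehouse, quality_report = run_cycle(extract, [validate, clean, derive])
```

The per-stage counts in `quality_report` stand in for the data quality checks mentioned above; a real check would look at far more than record counts.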

The HES processing cycle and HES data quality [PDF, 342kb]

Data quality checks performed on SUS and HES data [PDF, 417kb]

HES Data Quality Notes

The HES Data Quality Notes highlight specific known issues that should be considered when analysing the data. They are designed for HES system users and those requesting extracts. This single repository replaces the individual DQ notes that were previously published every month.

HES Data Quality Notes [XLS, 3Mb]

Latest publication period: Month 3 - June 2016

Latest publication date: 17 August 2016

This file accompanies monthly provisional and annual HES data for the Accident and Emergency (A&E), Outpatient (OP) and Admitted Patient Care (APC) datasets.

Access the monthly HES data publication reports.

Automatic data cleaning and derivation rules

Read how we clean the data to improve the value and quality of HES data. These rules are used to:

  • clean common and obvious data quality errors
  • derive additional data items to populate the HES data set

The document and cleaning rule numbering should be used in conjunction with the HES User Data Dictionary. 

Admitted Patient Care cleaning rules [PDF, 659kb]

Outpatient cleaning rules [PDF, 381kb]

A&E cleaning rules [PDF, 337kb]

Provider Mapping Methodology

Information on how we handle records with an invalid provider code within the HES datasets.

HES Provider Mapping Methodology [PDF, 238kb]

Duplicate Methodology

Information on how we identify and handle duplicate records within the HES dataset.

HES Duplicate Identification and Removal Methodology [PDF, 267kb]

HES patient ID

The HES Patient ID (HES ID) provides a way of tracking patients through the HES database without identifying them. It is central to many HES outputs, including spell construction, emergency readmissions and linkage to other data sets, such as mortality.

Read about the HES ID and its methodology [PDF, 387kb]

Examples of how we use automatic data cleaning and derivation rules:

To clean common and obvious data quality errors

Rule #0150 looks for evidence that a Birth Episode (CDS type 120) has been incorrectly submitted to SUS as a General Episode (CDS type 130). If such evidence is found, the Episode Type of the record is changed to reflect this.

Without this cleaning step, the number of birth records (Episode Type 3) in HES would tend to be lower than the actual number of births taking place.
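The shape of a rule like this can be sketched in Python. This is only an illustration: the CDS type codes and the birth Episode Type come from the text above, but the field names (`cds_type`, `epitype`, `age_on_admission_days`, `has_birth_details`) and the evidence test itself are assumptions made up for the example. The actual criteria are set out in the cleaning rules documents.

```python
from typing import Callable, Dict

Record = Dict[str, object]

CDS_BIRTH = 120     # Birth Episode (from the text above)
CDS_GENERAL = 130   # General Episode (from the text above)
EPITYPE_BIRTH = 3   # Episode Type for birth records (from the text above)


def looks_like_birth(record: Record) -> bool:
    # ASSUMPTION: placeholder evidence test using invented field names.
    # The real rule's criteria are defined in the cleaning rules document.
    return (
        record.get("age_on_admission_days") == 0
        and bool(record.get("has_birth_details"))
    )


def apply_birth_clean(
    record: Record, evidence: Callable[[Record], bool] = looks_like_birth
) -> Record:
    """Return a copy of the record, re-typed as a birth if the evidence holds."""
    if record.get("cds_type") == CDS_GENERAL and evidence(record):
        return {**record, "epitype": EPITYPE_BIRTH}
    return dict(record)


# Invented sample records; epitype 1 is an illustrative starting value.
submitted = [
    {"cds_type": CDS_GENERAL, "epitype": 1,
     "age_on_admission_days": 0, "has_birth_details": True},
    {"cds_type": CDS_GENERAL, "epitype": 1,
     "age_on_admission_days": 12000, "has_birth_details": False},
    {"cds_type": CDS_BIRTH, "epitype": EPITYPE_BIRTH,
     "age_on_admission_days": 0, "has_birth_details": True},
]
cleaned = [apply_birth_clean(r) for r in submitted]

births_before = sum(r["epitype"] == EPITYPE_BIRTH for r in submitted)
births_after = sum(r["epitype"] == EPITYPE_BIRTH for r in cleaned)
```

Comparing `births_before` with `births_after` shows the effect described above: without the clean, the mis-submitted record would not be counted as a birth.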

To derive additional data items to populate the HES data set

Rule #1200 uses the postcode from each submitted CDS record to derive additional geographical data items relating to the episode of care. HES uses reference data from the ONS Postcode Directory to derive data items such as the Parliamentary Constituency or Strategic Health Authority of the patient's residence. This allows record-level data to be aggregated easily for spatial analysis.
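A derivation of this kind is essentially a lookup followed by an aggregation. The Python sketch below assumes a tiny in-memory lookup table in place of the ONS Postcode Directory; the postcodes, area names, field names and normalisation step are all invented for illustration and do not reflect the real reference data or the rule's exact logic.

```python
from collections import Counter
from typing import Dict, Optional

Record = Dict[str, object]

# ASSUMPTION: invented stand-in for a postcode directory lookup.
POSTCODE_LOOKUP: Dict[str, Dict[str, str]] = {
    "ZZ11AA": {"constituency": "Area A"},
    "ZZ22BB": {"constituency": "Area B"},
}


def normalise_postcode(postcode: object) -> str:
    # Remove all whitespace and upper-case so "zz1 1aa" matches "ZZ11AA".
    return "".join(str(postcode).split()).upper()


def derive_geography(
    record: Record, lookup: Dict[str, Dict[str, str]] = POSTCODE_LOOKUP
) -> Record:
    """Return a copy of the record with a derived constituency (None if unmatched)."""
    key = normalise_postcode(record.get("postcode", ""))
    area = lookup.get(key, {})
    constituency: Optional[str] = area.get("constituency")
    return {**record, "constituency": constituency}


# Invented sample episodes.
episodes = [
    {"episode_id": 1, "postcode": "zz1 1aa"},
    {"episode_id": 2, "postcode": "ZZ1 1AA"},
    {"episode_id": 3, "postcode": "ZZ2 2BB"},
    {"episode_id": 4, "postcode": "unknown"},
]
derived = [derive_geography(e) for e in episodes]

# Aggregate record-level data by the derived geography.
counts = Counter(e["constituency"] for e in derived)
```

Here `counts` groups episodes by the derived area, with unmatched postcodes collected under `None` so that they remain visible rather than being silently dropped.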
