This snapshot, taken on 04/06/2007, shows web content selected for preservation by The National Archives. External links, forms and search boxes may not work in archived websites.
Strategy Unit logo

Strategy Survival Guide

Prime Minister's Strategy Unit

Version 2.1

Strategy SkillsBuilding an Evidence Base

Analysing data - Modelling

Modelling is a very useful analytical tool that aims to establish formal mathematical relationships between variables. Models can take a variety of forms, and it is important to select the right model for the circumstances:

  • In some situations, the variables and issues of interest can be narrowly and tightly defined, in which case the model should in turn be narrow in its coverage, but detailed within its boundaries.
  • In other circumstances, variables and issues of interest may go much wider (e.g. impact on the whole economy), in which case the model will inevitably be less detailed, but with much wider coverage.

Another choice to be made will be with regard to the degree of quantification. Is it necessary to determine the amount of an impact, and can the data tell us this information? Or is a qualitative indication of impact (i.e. direction of effect) sufficient?

Once the right type and level of model has been selected, the key is then to understand the model's structure:

  • If the modelling work is going to be carried out in-house, an appropriate functional form will need to be decided and the necessary data collected.
  • Models will often be "bought in" from outside, rather than developed in-house. But this should never be an excuse for simply treating them as a "black box", without understanding what makes them tick. It is vital to understand why/how models produce the results they do, always ask: Which variables in the model are driving the results obtained?

In either case, it will be important to get a good feel for the key determinants of the model's results, so that they can be used appropriately and intelligently. For example, is the model based on relationships estimated on historic data? Or does it use survey data? To what extent does it incorporate behavioural change?

Modelling Tips

  • Modelling is not just data mining, it needs to be based on theoretical foundations.
  • Sensitivity analysis (i.e. assessing the impact of varying assumptions or variables) is useful in understanding what drives a model's results.
  • Need clarity about what is endogenous and what is exogenous to the model.
  • A rich data set is needed to construct a robust model.
  • Modelling can be very time and resource intensive - hence the likelihood of choosing to buy-in existing models.
Excel Modelling

Modelling in Excel is like any other piece of analysis - you require a clear understanding of the questions at hand, a vision of the output, a good plan to get there, time to work through the plan to completion and the ability to package the analysis for review. Failure to do so will almost certainly result in the need for rework, lost time and frustration.

There are a number of steps, which if followed, will assist in creating a successful Excel model.#

Step One: Define the problem explicitly

Ask yourself:

  • What are the questions I need to answer?
  • Conceptually, how do I answer each of the questions?
  • What will the final output look like?

Step Two: Understand the audience

Determine who the target audience will be and keep in mind their background when preparing data to share. Pre-empting your audience and their needs, and designing Excel and other output to suit those needs will save 'low-value' added time repackaging output.

Step Three: Design, don't type

Having now envisaged the output and understood your audience, think about how to design your Excel analysis to best meet those aspirations:

  • Spend the time up front to design the spreadsheet
  • If necessary, write a brief work plan
  • Ask yourself: how accurate does the analysis have to be? How long do I have to generate the model?
  • Design the spreadsheet workbook
  • Always have an assumptions sheet, this will help with sensitivity analysis
  • Make other sheets flow logically from the assumptions sheet
  • Sketch out a classification of variables:
    • Static variable: variable that is unlikely to change.
    • Dynamic variable: variable that you do not know accurately and you are likely to want to test the sensitivity to a range of the variable
    • Calculated fields: variables that are derived as a direct result of static and dynamic variables

First, it will be useful to classify variables according to type which will then help in writing the model, for example:

Variable

Type

Household density
Store reach
Number of households per store
Household penetration
Annual spend per customer
Annual revenue per store
Gross margin
Annual fixed costs
Annual profits
Initial investment
NPV period
Discount rate
NPV

Dynamic
Static
Calculated
Dynamic
Static
Calculated
Static
Static
Calculated
Static
Static
Static
Calculated

Secondly, laying-out a workbook design will save you time in writing the model in Excel. In general, Excel workbooks should follow this generic design:

generic excel workbook design

Step Four: Document your spreadsheets

It is very important to document the model as it is built. Such documentation will include information about sources of data or estimates, comments on non-obvious calculations and anything else pertinent. Any one looking at the model tomorrow or in six months time will find good documentation on the spreadsheets extremely useful.

The basic rule here is that your spreadsheets should be self-documenting as much as possible. One way to test whether you have sufficient documentation as you go along is to ask the question "If my team leader had to take over my analysis tomorrow, could they understand what I've done?".

There are two major alternatives for documenting spreadsheets, using either:

  • An additional Excel column
  • Comments attached to the cells

Alternative

Advantages

Disadvantages

Excel column

Easy to see, so when an assumption changes more likely to change the note as well

Documents the spreadsheet when printed - can hide the column/set out of print area not required

Can feel intrusive - gets in the way

Easy to forget to update columns when hidden

Comment

Discreet - less intrusive when using the spreadsheet

Easy to forget to update


In some cases, it may also be necessary to write additional documentation in MS Word or MS PowerPoint.

Step Five: Use variables, never hardcode

  • By defining the variables up front and create the assumptions sheet you will have fewer problems when changing variable values
  • As you type a number into any formula, ask yourself:
  • "Is this number likely to change...ever?"
  • "Will I know what the number refers to in a year's time?"
  • Rarely is it beneficial to hardcode variables into formulae.
  • Do not paste values as this significantly diminishes your audit trail.

Step Six: Check answers - do they make sense?

Having built your model and produced some answers, don't show them yet to anyone. Instead, stop and sanity check them yourself. Ask yourself these questions:

  • Is the answer what you would expect?
  • Is this what your audience would expect?
  • If not, what drives the different answer - can you explain the differences to yourself and your audience?
  • Are the units correct?

Step Seven: Let Excel make your life easier

Microsoft Excel has a number of features that make analysis easier to navigate and complete:

  • Sorting
  • Functions
    • Aggregation (SUM, AVERAGE, MIN, MAX)
    • Conditional (IF, AND, OR, Nested IF)
    • Lookup Values (VLOOKUP, HLOOKUP)
    • SUMPRODUCT
    • Table command
    • Financial functions
  • Manipulating data strings
  • Using formulas (LEN, LEFT, RIGHT, MID, SEARCH, TRIM, CONCATENATE)
  • Converting text to columns
  • Formatting cells
  • Pivot tables
  • Conditional formatting.

Step Eight: Understand the sensitivities

Having built the model and development output, understand the sensitivities of the output to key input variables. To do this, undertake three key steps:

  • Determine the range of valid values for each variable
  • Test impact by changing each variable on its own
  • Test impact by changing combinations of variables.

A good first pass of the two tests is to change the values of the variables in the assumptions sheet to their maximums and minimums.

Step Nine: Presenting the Results

Much of the impact of analysis can be lost if it is presented badly. In particular, complex modelling with many variables and sub-analyses can easily become confusing and lose credibility unless presented logically and sequentially. Think very carefully about the story the analysis tells. Transparency is crucial, as much discussion will be held over assumptions in the model.

A typical presentation to explain an Excel model would cover:

  • Overall objective of the model
  • What the model can and can't do
  • A schematic overview of how the model works
  • The key data sources
  • How logic of how the variables are combined to produce the outputs
  • The key inputs, the value of each and the rationale for this value
  • The key assumptions, the value of each and the rationale for this value.
Econometric Modelling

This is the application of mathematical and statistical techniques to economic and social problems. Econometric studies proceed by formulating a mathematical model, then, using the best data available, statistical methods are used to obtain estimates of the parameters in the model. Methods of statistical interference are then used to decide whether the hypothesis underlying the model can be rejected or not. Econometrics is thus concerned with testing the validity of economic and social theories and providing the means of making quantitative predictions.

Regression analysis is a major tool of econometrics. It permits different hypotheses to be tested about the forms of the relationship and the variables that should be included in it.

Regression analysis is used to determine relationships between variables that analysts believe intuitively to be related. Once a relationship is established, it can be used to forecast the outcome. In business, regression analysis is often used to examine the relationship between:

  • Sales, price, promotion and market factors
  • Production costs to production volumes

It is equally useful to policy makers as illustrated in the 'in practice' example. See BMJ.com for more detailed guidance on how to use these techniques.

Strengths
  • There are a variety of different tools with which to conduct data analysis. The key is to keep focused on the specific question/task, and not allow the focus of the analysis to stray.
  • Modelling can provide a clear structure for the analysis, which can help create buy-in to the process from other government departments.
  • Modelling allows examination of a range of factors, all operating at once. It investigates the strength of these factors and their interaction, and generates robust quantitative evidence.
Weaknesses
  • A strong end-user focus is needed to avoid becoming too bogged down in technical issues.
  • Analysis may suffer from a lack of available data.
  • Modelling work often has to deal with numerous uncertainties surrounding data and assumptions. A pitfall to be avoided is to try and hide these weaknesses within coding in the model to try and make the results appear more robust. It is important to be transparent about all the data and assumptions, and to be aware of the degree of accuracy required by the results in order to reach a robust conclusion.
References

The Green Book, HMT. Appraisal and Evaluation in Central Government

Derek Rowntree, Statistics Without Tears

Sprent P, Statistics in Action

Statistics for Economists, R.E.Beals

Statistics for Economics, R E Davies and J N Foad

Cambridge Econometrics - Modelling for Government

Multi-Criteria Analysis: A Manual (DTLR)

Analysing data - Modelling

In Practice: SU Ethnic Minorities and the Labour Market Project

Understanding the drivers of performance and progression in the workplace was crucial to the SU's 'Ethnic Minorities and the Labour Market' project. Equally important, however, was gaining an understanding of the various factors impeding success in the workplace, from educational under-attainment and socio-economic disadvantage to residence in inner urban areas and limited experience in the labour market.

The team used regression analysis to compare the relative strength of each of these, and other, factors in accounting for the disadvantage experienced by Britain's ethnic minority groups. A number of regression models were used, each of which took account of a different combination of these conditioning factors. An examination of several of the models led the team to conclude that:

  • ethnic minorities remain disadvantaged in terms of employment and occupational attainment even after key variables are taken into account. Some groups are clearly even more disadvantaged than gross differences suggest, given their educational qualifications or other characteristics;
  • ethnic minority men have been persistently disadvantaged in terms of earnings. British-born ethnic minority women appear to be no longer disadvantaged in terms of earnings, though their foreign-born peers continue to be disadvantaged;
  • Indian men are consistently the least disadvantaged among ethnic minority groups; and
  • Pakistani, Bangladeshi and Black men and women are consistently among the most disadvantaged.

The most important conclusion that emerges from these analyses is that, even after accounting for key variables, all ethnic minority groups are disadvantaged relative to Whites in comparable circumstances. The figures below illustrate this fact, showing the earnings and unemployment risk of ethnic minority men relative to their White peers, before ('Actual') and after ('Like-for-like') taking into account factors such as age, education, recency of migration, economic environment and family structure. Together, these variables can explain just £10 of the £116 wage gap between Blacks and Whites.

Figure 1: Weekly Male Earnings Relative to White Counterpart

Figure 1: Weekly Male Earnings Relative to White Counterpart

Source: R. Berthoud 'Ethnic Employment Penalties in Britain', Journal of Ethnic and Migration Studies, 26:389-416, 2000.

Note: Figure combines the effects of unemployment and of pay

Figure 2: Male Unemployment Risk Relative to White Counterpart

Figure 2: Male Unemployment Risk Relative to White Counterpart

Source: F. Carmichael and R. Woods, 'Ethnic penalties in unemployment and occupational attainment: evidence for Britain'. International Review of Applied Economics 14: 71-98, 2000.

It follows from such analysis that a range of other explanatory factors must be at work. These may include: degree of assimilation; cultural/religious factors; business opportunities in the areas where ethnic minorities live; Government infrastructure in local regions; quality and location of housing; access to childcare; quality of, and willingness to use, transportation to access employment opportunities; levels of, or access to, social capital; and employer discrimination. However, due to the absence of quantitative measures for such factors, in statistical terms we are left with an incomplete picture of their relative weight.


^ top