|
home | strategy development | strategy
skills | site
index | access
keys
Analysing data - Modelling
>
in practice
Modelling is a very useful analytical tool that aims to
establish formal mathematical relationships between variables. Models can
take a variety of forms, and it is important to select the right model for
the circumstances:
- In some situations, the variables and issues of interest can be
narrowly and tightly defined, in which case the model should in turn
be narrow in its coverage, but detailed within its boundaries.
- In other circumstances, variables and issues of interest may go much
wider (e.g. impact on the whole economy), in which case the model will
inevitably be less detailed, but with much wider coverage.
Another choice to be made will be with regard to the
degree of quantification. Is it necessary to determine the amount of an
impact, and can the data tell us this information? Or is a qualitative
indication of impact (i.e. direction of effect) sufficient?
Once the right type and level of model has been
selected, the key is then to understand the model's structure:
- If the modelling work is going to be carried out in-house, an
appropriate functional form will need to be decided and the necessary
data collected.
- Models will often be "bought in" from outside, rather than
developed in-house. But this should never be an excuse for simply
treating them as a "black box", without understanding what
makes them tick. It is vital to understand why/how models produce the
results they do, always ask: Which variables in the model are driving
the results obtained?
In either case, it will be important to get a good feel
for the key determinants of the model's results, so that they can be
used appropriately and intelligently. For example, is the model based on
relationships estimated on historic data? Or does it use survey data? To
what extent does it incorporate behavioural change?
Modelling Tips
- Modelling is not just data mining, it needs to be based on
theoretical foundations.
- Sensitivity analysis (i.e. assessing the impact of varying
assumptions or variables) is useful in understanding what drives a
model's results.
- Need clarity about what is endogenous and what is exogenous to the
model.
- A rich data set is needed to construct a robust model.
- Modelling can be very time and resource intensive - hence the
likelihood of choosing to buy-in existing models.
Excel Modelling
Modelling in Excel is like any other piece of analysis
- you require a clear understanding of the questions at hand, a vision of
the output, a good plan to get there, time to work through the plan to
completion and the ability to package the analysis for review. Failure to
do so will almost certainly result in the need for rework, lost time and
frustration.
There are a number of steps, which if followed, will
assist in creating a successful Excel model.#
Step One: Define the problem explicitly
Ask yourself:
- What are the questions I need to answer?
- Conceptually, how do I answer each of the questions?
- What will the final output look like?
Step Two: Understand the audience
Determine who the target audience will be and keep in
mind their background when preparing data to share. Pre-empting your
audience and their needs, and designing Excel and other output to suit
those needs will save 'low-value' added time repackaging output.
Step Three: Design, don't type
Having now envisaged the output and understood your
audience, think about how to design your Excel analysis to best meet those
aspirations:
- Spend the time up front to design the spreadsheet
- If necessary, write a brief work plan
- Ask yourself: how accurate does the analysis have to be? How long do
I have to generate the model?
- Design the spreadsheet workbook
- Always have an assumptions sheet, this will help with
sensitivity analysis
- Make other sheets flow logically from the assumptions sheet
- Sketch out a classification of variables:
- Static variable: variable that is unlikely to change.
- Dynamic variable: variable that you do not know accurately and you
are likely to want to test the sensitivity to a range of the variable
- Calculated fields: variables that are derived as a direct result of
static and dynamic variables
First, it will be useful to classify variables
according to type which will then help in writing the model, for example:
|
Variable |
Type |
|
Household density
Store reach
Number of households per store
Household penetration
Annual spend per customer
Annual revenue per store
Gross margin
Annual fixed costs
Annual profits
Initial investment
NPV period
Discount rate
NPV
|
Dynamic
Static
Calculated
Dynamic
Static
Calculated
Static
Static
Calculated
Static
Static
Static
Calculated |
Secondly, laying-out a workbook design will save you
time in writing the model in Excel. In general, Excel workbooks should
follow this generic design:

Step Four: Document your spreadsheets
It is very important to document the model as it is
built. Such documentation will include information about sources of data
or estimates, comments on non-obvious calculations and anything else
pertinent. Any one looking at the model tomorrow or in six months time
will find good documentation on the spreadsheets extremely useful.
The basic rule here is that your spreadsheets should be
self-documenting as much as possible. One way to test whether you have
sufficient documentation as you go along is to ask the question "If
my team leader had to take over my analysis tomorrow, could they
understand what I've done?".
There are two major alternatives for documenting
spreadsheets, using either:
- An additional Excel column
- Comments attached to the cells
|
Alternative |
Advantages |
Disadvantages |
|
Excel column |
Easy to see, so when an assumption changes more likely to change
the note as well
Documents the spreadsheet when printed - can hide the column/set
out of print area not required |
Can feel intrusive - gets in the way
Easy to forget to update columns when hidden |
|
Comment |
Discreet - less intrusive when using the spreadsheet |
Easy to forget to update |
In some cases, it may also be necessary to write additional documentation
in MS Word or MS PowerPoint.
Step Five: Use variables, never hardcode
- By defining the variables up front and create the assumptions sheet
you will have fewer problems when changing variable values
- As you type a number into any formula, ask yourself:
- "Is this number likely to change...ever?"
- "Will I know what the number refers to in a year's time?"
- Rarely is it beneficial to hardcode variables into formulae.
- Do not paste values as this significantly diminishes your audit trail.
Step Six: Check answers - do they make sense?
Having built your model and produced some answers, don't
show them yet to anyone. Instead, stop and sanity check them yourself. Ask
yourself these questions:
- Is the answer what you would expect?
- Is this what your audience would expect?
- If not, what drives the different answer - can you explain the
differences to yourself and your audience?
- Are the units correct?
Step Seven: Let Excel make your life easier
Microsoft Excel has a number of features that make
analysis easier to navigate and complete:
- Sorting
- Functions
- Aggregation (SUM, AVERAGE, MIN, MAX)
- Conditional (IF, AND, OR, Nested IF)
- Lookup Values (VLOOKUP, HLOOKUP)
- SUMPRODUCT
- Table command
- Financial functions
- Manipulating data strings
- Using formulas (LEN, LEFT, RIGHT, MID, SEARCH, TRIM, CONCATENATE)
- Converting text to columns
- Formatting cells
- Pivot tables
- Conditional formatting.
Step Eight: Understand the sensitivities
Having built the model and development output,
understand the sensitivities of the output to key input variables. To do
this, undertake three key steps:
- Determine the range of valid values for each variable
- Test impact by changing each variable on its own
- Test impact by changing combinations of variables.
A good first pass of the two tests is to change the
values of the variables in the assumptions sheet to their maximums and
minimums.
Step Nine: Presenting the Results
Much of the impact of analysis can be lost if it is
presented badly. In particular, complex modelling with many variables and
sub-analyses can easily become confusing and lose credibility unless
presented logically and sequentially. Think very carefully about the story
the analysis tells. Transparency is crucial, as much discussion will be
held over assumptions in the model.
A typical presentation to explain an Excel model would
cover:
- Overall objective of the model
- What the model can and can't do
- A schematic overview of how the model works
- The key data sources
- How logic of how the variables are combined to produce the outputs
- The key inputs, the value of each and the rationale for this value
- The key assumptions, the value of each and the rationale for this
value.
Econometric Modelling
This is the application of mathematical and statistical
techniques to economic and social problems. Econometric studies proceed by
formulating a mathematical model, then, using the best data available,
statistical methods are used to obtain estimates of the parameters in the
model. Methods of statistical interference are then used to decide whether
the hypothesis underlying the model can be rejected or not. Econometrics
is thus concerned with testing the validity of economic and social
theories and providing the means of making quantitative predictions.
Regression analysis is a major tool of econometrics. It
permits different hypotheses to be tested about the forms of the
relationship and the variables that should be included in it.
Regression analysis is used to determine relationships
between variables that analysts believe intuitively to be related. Once a
relationship is established, it can be used to forecast the outcome. In
business, regression analysis is often used to examine the relationship
between:
- Sales, price, promotion and market factors
- Production costs to production volumes
It is equally useful to policy makers as illustrated in
the 'in practice' example. See BMJ.com
for more detailed guidance on how
to use these techniques.
Strengths
- There are a variety of different tools with which to conduct data
analysis. The key is to keep focused on the specific question/task, and
not allow the focus of the analysis to stray.
- Modelling can provide a clear structure for the analysis, which can
help create buy-in to the process from other government departments.
- Modelling allows examination of a range of factors, all operating at
once. It investigates the strength of these factors and their
interaction, and generates robust quantitative evidence.
Weaknesses
- A strong end-user focus is needed to avoid becoming too bogged down in
technical issues.
- Analysis may suffer from a lack of available data.
- Modelling work often has to deal with numerous uncertainties
surrounding data and assumptions. A pitfall to be avoided is to try and
hide these weaknesses within coding in the model to try and make the
results appear more robust. It is important to be transparent about all
the data and assumptions, and to be aware of the degree of accuracy
required by the results in order to reach a robust conclusion.
References
The
Green Book, HMT. Appraisal and Evaluation in Central Government
Derek Rowntree, Statistics Without Tears
Sprent P, Statistics in Action
Statistics for Economists, R.E.Beals
Statistics for Economics, R E Davies and J N Foad
Cambridge Econometrics - Modelling for Government
Multi-Criteria Analysis: A Manual (DTLR)
Analysing data - Modelling
In Practice: SU Ethnic Minorities and the Labour
Market Project
Understanding the drivers of performance and
progression in the workplace was crucial to the SU's 'Ethnic Minorities
and the Labour Market' project. Equally important, however, was gaining
an understanding of the various factors impeding success in the workplace,
from educational under-attainment and socio-economic disadvantage to
residence in inner urban areas and limited experience in the labour
market.
The team used regression analysis to compare the
relative strength of each of these, and other, factors in accounting for
the disadvantage experienced by Britain's ethnic minority groups. A
number of regression models were used, each of which took account of a
different combination of these conditioning factors. An examination of
several of the models led the team to conclude that:
- ethnic minorities remain disadvantaged in terms of employment and
occupational attainment even after key variables are taken into
account. Some groups are clearly even more disadvantaged than gross
differences suggest, given their educational qualifications or other
characteristics;
- ethnic minority men have been persistently disadvantaged in terms of
earnings. British-born ethnic minority women appear to be no longer
disadvantaged in terms of earnings, though their foreign-born peers
continue to be disadvantaged;
- Indian men are consistently the least disadvantaged among ethnic
minority groups; and
- Pakistani, Bangladeshi and Black men and women are consistently
among the most disadvantaged.
The most important conclusion that emerges from these
analyses is that, even after accounting for key variables, all ethnic
minority groups are disadvantaged relative to Whites in comparable
circumstances. The figures below illustrate this fact, showing the
earnings and unemployment risk of ethnic minority men relative to their
White peers, before ('Actual') and after ('Like-for-like') taking into
account factors such as age, education, recency of migration, economic
environment and family structure. Together, these variables can explain
just £10 of the £116 wage gap between Blacks and Whites.
Figure 1: Weekly Male Earnings Relative to White
Counterpart

Source: R. Berthoud 'Ethnic Employment Penalties in
Britain', Journal of Ethnic and Migration Studies, 26:389-416,
2000.
Note: Figure combines the effects of unemployment and
of pay
Figure 2: Male Unemployment Risk Relative to White
Counterpart

Source: F. Carmichael and R. Woods, 'Ethnic
penalties in unemployment and occupational attainment: evidence for
Britain'. International Review of Applied Economics 14: 71-98, 2000.
It follows from such analysis that a range of other
explanatory factors must be at work. These may include: degree of
assimilation; cultural/religious factors; business opportunities in
the areas where ethnic minorities live; Government infrastructure in
local regions; quality and location of housing; access to childcare;
quality of, and willingness to use, transportation to access
employment opportunities; levels of, or access to, social capital; and
employer discrimination. However, due to the absence of quantitative
measures for such factors, in statistical terms we are left with an
incomplete picture of their relative weight.
|
^ top
|