This snapshot, taken on 08/06/2009, shows web content acquired for preservation by The National Archives. External links, forms and search may not work in archived websites and contact details are likely to be out of date.
 
 

Pedalling some raw data…

by Paul Clarke

Planning a cycle route? Would it be useful to know where accidents involving bicycles have happened in the past? Following the principle of “make the raw data available, and let others use it”, here’s a new data set.

Click HERE to link to the data.

The information is provided by the Department for Transport, and is the first of what we hope will be many raw data sets provided through this Innovate site. At the moment it’s just one file, but it will go straight into our data wiki, which should be ready very soon.

The data gives the locations (for the years 2005-2007) of accidents involving pedal cycles, causing personal injury, which were reported to the police. Are there ‘hot-spots’? Any trends over time? Could this support a “plan a safer journey” service? What about helping to draw attention to the need for road improvements? Over to you to explore some of the answers…

Further information on road accident statistics (including scope, definitions and limitations) can be found in the Road Casualties Great Britain report (2007 is the latest year available).

10 March 2009 4:59 PM

37 Responses to “Pedalling some raw data…”

  1. Karl says:

    This is fantastic news, with the increase in cars on the road inevitably accidents involving cars and cyclists will rise. The last thing we need is fewer people cycling - it can be daunting even for experienced road users like myself to get on a bike. We seriously need to encourage cyclists, as I believe they are the future of city travel.
    Thanks
    Karl

  2. P Fox says:

    The email address on the cycle accident spreadsheet is wrong.

    Can you tell me what the full dataset is from which the pedal cycle spreadsheet is extracted? What other information is available in this dataset?

    BTW : As it stands the spreadsheet is pointless and only encourages yet more regression towards the mean “improvements”.

  3. Paul Battley says:

    This is interesting, useful data: as a cyclist, I’m really interested in knowing where to avoid. If no one beats me to it, I’ll be plotting these points on a map to identify blackspots.

    One thing is really disappointing, though, and that’s the choice of a proprietary format (Microsoft Excel) for delivering the data. Please use open formats!

    I’ve repackaged the data as a set of CSV files as an example of a better way to do it: you can find it at http://po-ru.com/files/pedal-cycle-accident-locations.zip

  4. Tom Taylor says:

    Thanks a lot for this data, it’s very interesting. I’ve republished it in KML format, available on my blog.

  5. Richard Pope says:

    Pretty please setup data.gov.uk for this kind of thing.

  6. Andy Mabbett says:

    @Tom Taylor: Thanks for the KML, the most useful format (I echo concerns about using proprietary formats). Viewed on a map, the data is fascinating - there’s a nasty cluster around Wolverhampton, for instance, and, pleasingly, none in Birmingham.

    This data will be more useful if regular - if not ‘real time’ RSS - updates are available. What plans are there, in that direction?

    It would be a good idea to include explicit (preferably “Creative Commons”) licence terms.

  7. @P Fox:

    “BTW : As it stands the spreadsheet is pointless and only encourages yet more regression towards the mean “improvements”.”

    Explain?

  8. Anthony Cartmell says:

    @Karl: “This is fantastic news, with the increase in cars on the road inevitably accidents involving cars and cyclists will rise.”

    Actually, at least in West Sussex, cycling is on the increase, and car traffic growth is zero for the first time in decades (due to the recession, no doubt). Studies from all over the world show that more people cycling leads to safer cycling conditions :)

    The main problem with this data is the lack of detail. We don’t know which locations mark a cyclist being injured, or which mark a cyclist causing injury (perhaps to themselves!). There is no indication of severity either - a few are probably deaths, but we don’t know which ones.

    Another problem is that the dangers indicated are much more likely to be due to actions of the human beings involved, and much less likely to be due to a feature of the location in question. In other words the crash is more likely to be due to human error than anything particularly dangerous at that location. Focussing on locations may well lead to erroneous conclusions about cycle safety being drawn.

    And then there’s the different rates of reporting: are all accidents reported consistently across the country? Perhaps Wolverhampton police tend to record more cycle accidents than the Birmingham police do?

    But nice to see public access to this sort of data, it can only generate useful debate. Pity it’s probably all OS-derived, and hence requires the payment of £5,000 per annum if you want to display it on the web. Ooops!

  9. Anthony Cartmell says:

    @Andy Mabbett - I see markers being more dense wherever there’s high population density: Birmingham has as many as Wolverhampton, at least in 2007.

    We probably need to take into account traffic density when looking at marker density.

    It’s also interesting to look at roads that have and don’t have accidents. Some with accidents are quiet residential ones, some without accidents are busy A-roads. Without knowing corresponding cyclist and motorist traffic numbers it’s sadly impossible to use this data to label a road as “dangerous for cycling”.

    Might be useful for trend analysis, though, especially in cycling demonstration towns.

  10. Sam Prince says:

    Picking up on Anthony Cartmell’s point about reporting rates: assuming maps.google.com is displaying the data properly it seems that nobody had a cycling accident in Bolton in 2005, 2006 or 2007. Meanwhile neighbouring towns all had rashes of them. Conclusions:
    Nobody cycles in Bolton OR
    Cyclists never get hurt in Bolton OR
    Cyclists get hurt but don’t report it in Bolton OR
    Bolton police don’t record data as accurately as other police forces

    Tom Taylor: thanks for the KML files!

  11. I’ve built a map showing the locations for 2007 as an overlay for Google Maps.

    http://labs.timesonline.co.uk/blog/2009/03/11/uk-cycling-accidents/

    @tom good work with the KML, could have saved me working out how to convert to lat long. ;)

    @andy I agree it should be clearly licensed (ideally CC) so it is clear how it can be reused.

  12. Paul Battley says:

    Judging by the comments so far, everyone wants lat/long rather than OS grid references. Maybe that’s a good reason to add lat/long to all geocoded data, and save people the overhead of conversion.
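To give a feel for the conversion overhead mentioned above, here is a minimal sketch of turning a National Grid easting/northing into latitude/longitude. It assumes the standard Ordnance Survey Transverse Mercator constants for the Airy 1830 ellipsoid, and the function name `grid_to_osgb36_latlon` is made up for this example. Note that the result is on the OSGB36 datum, not WGS84: a further datum shift (not shown here) is needed before the points line up exactly with GPS-based maps, and skipping it leaves an offset on the order of 100 m.

```python
import math

# Airy 1830 ellipsoid and National Grid projection constants (assumed
# standard Ordnance Survey values).
A = 6377563.396                 # semi-major axis, metres
B = 6356256.909                 # semi-minor axis, metres
F0 = 0.9996012717               # scale factor on the central meridian
LAT0 = math.radians(49.0)       # latitude of true origin
LON0 = math.radians(-2.0)       # longitude of true origin
E0, N0 = 400000.0, -100000.0    # grid coordinates of true origin, metres
E2 = 1 - (B * B) / (A * A)      # eccentricity squared
N_ = (A - B) / (A + B)


def _meridional_arc(lat):
    """Developed arc of the meridian from LAT0 to lat, in metres."""
    d, s = lat - LAT0, lat + LAT0
    n, n2, n3 = N_, N_ ** 2, N_ ** 3
    return B * F0 * (
        (1 + n + 1.25 * n2 + 1.25 * n3) * d
        - (3 * n + 3 * n2 + 2.625 * n3) * math.sin(d) * math.cos(s)
        + (1.875 * n2 + 1.875 * n3) * math.sin(2 * d) * math.cos(2 * s)
        - (35 / 24) * n3 * math.sin(3 * d) * math.cos(3 * s)
    )


def grid_to_osgb36_latlon(easting, northing):
    """Convert grid metres to (lat, lon) in degrees on the OSGB36 datum."""
    # Iterate to find the latitude whose meridional arc matches the northing.
    lat, m = LAT0, 0.0
    for _ in range(50):
        lat += (northing - N0 - m) / (A * F0)
        m = _meridional_arc(lat)
        if abs(northing - N0 - m) < 1e-5:
            break

    sin2 = math.sin(lat) ** 2
    nu = A * F0 / math.sqrt(1 - E2 * sin2)
    rho = A * F0 * (1 - E2) / (1 - E2 * sin2) ** 1.5
    eta2 = nu / rho - 1
    t = math.tan(lat)
    t2, t4, t6 = t ** 2, t ** 4, t ** 6
    sec = 1 / math.cos(lat)

    # Series coefficients for the inverse projection.
    vii = t / (2 * rho * nu)
    viii = t / (24 * rho * nu ** 3) * (5 + 3 * t2 + eta2 - 9 * t2 * eta2)
    ix = t / (720 * rho * nu ** 5) * (61 + 90 * t2 + 45 * t4)
    x = sec / nu
    xi = sec / (6 * nu ** 3) * (nu / rho + 2 * t2)
    xii = sec / (120 * nu ** 5) * (5 + 28 * t2 + 24 * t4)
    xiia = sec / (5040 * nu ** 7) * (61 + 662 * t2 + 1320 * t4 + 720 * t6)

    de = easting - E0
    lat_out = lat - vii * de ** 2 + viii * de ** 4 - ix * de ** 6
    lon_out = LON0 + x * de - xi * de ** 3 + xii * de ** 5 - xiia * de ** 7
    return math.degrees(lat_out), math.degrees(lon_out)
```

Calling `grid_to_osgb36_latlon(easting, northing)` on each row of the spreadsheet would give coordinates that can be written straight into a CSV or KML file, subject to the datum caveat above.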

  13. [...] As a result, directgov have swung into action ( not before time some may say ) and released statistics on bicycles [...]

  14. Tom Taylor says:

    @Sam - From my understanding of maps.google.com, it only shows a maximum of 100 points from KML files. That might be why you’re missing points.

    If you want to explore the data in depth, either try Julian’s tool from the Times lab, or load it up in Google Earth.

  15. John Girvin says:

    Interesting data, and thanks to all who have repackaged it.

    Is anyone aware of a similar data set for Northern Ireland?

  16. [...] government Innovate Blog has published some raw data from the Department for Transport, giving the location of accidents involving pedal cycles which caused personal injury and were reported to the police, for the years 2005, 2006 and [...]

  17. james says:

    Also, the site shows a convincing lack of clarity about where the data is and what format it’s presented in… (or at least I did not find it)

  18. Paul Battley says:

    James, it’s linked from the word ‘here’s’ in the article. It took me a while to find it, too: that shade of purple makes it hard to distinguish links from the surrounding text. Underlining them would help.

  19. Andy Mabbett says:

    @Anthony Cartmell: I’m seeing none in Birmingham, in 2007 (which is what I looked at last time); and only one in 2006, but several in 2005

    @Paul Battley: with you on the “hidden” link styles.

  20. David Earl says:

    What is the copyright position of this data? There aren’t many sources of grid references other than OS maps, so chances are this data is a work derived from OS data and as such is crown copyright (irrespective of whether the DfT is happy to release the data for free use itself). Therefore OS will not allow the data to be used, especially with other map data such as Google maps or OpenStreetMap. OS explicitly wrote to licensees recently to prohibit them overlaying co-ordinate data derived from OS map bases on Google maps: http://www.freeourdata.org.uk/docs/use-of-google-maps-for-display-and-promotion.pdf

  21. Kes says:

    Great to see this kind of data being made available. As others have said, could the reuse/licence conditions be made clearer?

    If anyone is interested the source road accident datasets are available from http://www.data-archive.ac.uk/findingdata/snDescription.asp?sn=6022&key=Stats19 - unfortunately the conditions of use seem to prohibit the mashups seen here (the archive is geared towards academic research, but does permit personal registrations).

    Whilst this still wouldn’t include traffic counts it would be possible to compare the ratios of accidents between vehicle types (+ pedestrians).

  22. Andy says:

    Anthony Cartmell’s point about location vs. human error is well put, but here’s a bit of anecdotal evidence (you can see the map location in Perth for this one):
    I was wiped out by a car pulling into the cycle lane to get round a line of stationary traffic (waiting to turn right), at the point the road is just widening from one to two lanes. It was therefore clearly driver error, but it was driver error that was entirely dependent on the road layout and traffic conditions at that point.
    If we are actually trying to improve things, then we need to look at ways of reducing human error and situational factors.

  23. Tony Singleton says:

    This is great for identifying blackspots but very sad to see so many pins in it.

  24. Bill Watson says:

    I’ve put all 50,000 points on a google map which you can view here:

    http://www.billwatson.co.uk/cycle_map/

    You have to zoom right in to see any data as it does not like too many markers on one page. Be warned that zooming in on London can be very slow.

    Also this page may not stay here for ever.

  25. [...] Pedalling some raw data… « innovate.direct.gov.uk (tags: visualization uk database government statistics) [...]

  26. Richard McMillan says:

    This is a good start; however, a few points:

    - The location data seems to be wrapped up in OS copyright (any chance of std lat/lng)
    - the data is old - though could be useful for trending
    - There is not much data to go on - time of day, week of year or some other contextual data could make it much richer

  27. [...] Gov Innovations subsequently previously released accident data involving cyclists for the UK in the years 2005-2007 on the 10th March as a spreadsheet for re-use and within 24 hours a blogger had published a KML [...]

  28. Phil Shore says:

    Bill,

    What do the different colours for the pins mean ?

    I would like to see a similar map for motorcycle accidents. Is that data included on your map ?

    Phil.

  29. Bill Watson says:

    Phil,

    The map is only of the data provided at the start of this thread.

    The colours show the three different years the data covers. Key at the bottom.

    Regards

    Bill.

  30. Now you can’t actually copy the data out of that file as that would violate copyright unfortunately. But this sort of information is why I created BikeBingle (http://bikebingle.appspot.com/). It’s a user generated map of cycling incidents.

    Unfortunately data is pretty sparse (non existent) across the UK, but feel free to change this ;-) .

  31. Sam says:

    Useful website thank you. Oh - and to that twat that rammed into me on Park Lane today - you know who you are.. I hope you’ve had a really bad day. Maybe you should think about taking the bus in future.

    Regards..

  32. Tom Hawkins says:

    Nice one for giving this data out and for the others that have run with it.

    I broke my back in a bike accident in December, at a spot that I now see is a bit of a blackspot, with three other accidents in the same location ‘05-’07. It would be good if the data could now be filtered to only include accidents which are within x metres (50?) of another accident to highlight blackspots.

    I also second the call of Richard McMillan (above) for more contextual information. Also, more detail on the source of this information would be good so that its accuracy can be assessed.

    I’m slightly blown away by the possibility that there is some poor sod somewhere that has to work out the map coordinates of all reported RTAs but I am glad somebody has seen sense to freely provide this information to the public (why was it gathered if not to do that?). If the coordinate data has ‘involving bicycle’ or ‘personal injury to bicyclist’ data associated with it then there must be other data that could be given out.
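The "within x metres of another accident" filter suggested above could be sketched roughly as follows. This assumes the points are (easting, northing) pairs in metres, as in the spreadsheet; the function name `near_another` and the grid-bucketing approach are illustrative choices, not anything taken from the published data.

```python
from collections import defaultdict
from math import hypot


def near_another(points, radius=50.0):
    """Return the points lying within `radius` metres of at least one other point.

    `points` is a list of (easting, northing) tuples in metres. Points are
    bucketed into square cells of side `radius`, so each point only needs to
    be compared with points in its own and the eight surrounding cells.
    """
    def cell_of(e, n):
        return int(e // radius), int(n // radius)

    cells = defaultdict(list)
    for idx, (e, n) in enumerate(points):
        cells[cell_of(e, n)].append(idx)

    def has_neighbour(idx):
        e, n = points[idx]
        cx, cy = cell_of(e, n)
        return any(
            j != idx and hypot(points[j][0] - e, points[j][1] - n) <= radius
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for j in cells.get((cx + dx, cy + dy), ())
        )

    return [p for idx, p in enumerate(points) if has_neighbour(idx)]
```

Two accidents recorded at exactly the same coordinates count as neighbours of each other here. As other commenters point out, a cluster found this way may reflect how much cycling happens at a location rather than how dangerous it is.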

  33. Tony Singleton says:

    I thought you might like to see this, which also uses the KML files and has a good article wrapped round it: http://road.cc/content/news/2918-govt-data-leads-creation-map-uk-cycling-accidents

  34. [...] Pedalling some raw data… « innovate.direct.gov.uk - Data from the Dept for Transport is released about where cycling accidents have been taking place. Some interesting discussions in the comments. It is of course a good move, but why is the data in xls and not an open format? [...]

  35. PaulG says:

    Just because the data came with Eastings and Northings does not always mean it was derived from GIS systems which rely on OS maps, it just means that is how it was exported.

    The original report may have been made as text ‘accident at junction of Timbers Lane and Four Ways’ or could have come from a satnav device.

    It may have been captured (seed pointed) as Lat/Lng - but was exported somewhere along the line as Eastings Northings.

    Anyhow, it seems to me to be a good idea to supply both E/N and Lat/Lng for each point.

    I echo the points about making the whole thing available as csv data, and agree that kml ought to be standard too.

    I also think a lot of users would benefit from having it available as premade sql statements too.

    Nobody has mentioned character encoding standards either.

    All of this kind of export format stuff can be semi-automated, and this small bike accident example, if nothing else, should get the community thinking how to share out and manage this type of work.
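As a rough illustration of how the export formats listed above might be semi-automated, here is a minimal sketch that renders the same rows as CSV, KML and SQL. The column layout (year, easting, northing, lat, lon), the table name `cycle_accidents` and the function names are all assumptions made for this example; they are not taken from the published spreadsheet.

```python
import csv
import io
from xml.sax.saxutils import escape

# Assumed row layout for this sketch: (year, easting, northing, lat, lon).
COLUMNS = ["year", "easting", "northing", "lat", "lon"]


def to_csv(rows):
    """Render rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buf.getvalue()


def to_kml(rows):
    """Render rows as a KML document, one Placemark per accident."""
    placemarks = "".join(
        "<Placemark><name>{}</name><Point><coordinates>{},{}"
        "</coordinates></Point></Placemark>\n".format(escape(str(year)), lon, lat)
        for year, _easting, _northing, lat, lon in rows
    )
    # KML expects longitude first, then latitude.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n'
        + placemarks
        + "</Document></kml>\n"
    )


def to_sql(rows, table="cycle_accidents"):
    """Render rows as INSERT statements (hypothetical table name).

    Only safe as written because every value is numeric; text columns
    would need proper quoting or parameterised queries.
    """
    template = (
        "INSERT INTO {} (year, easting, northing, lat, lon) "
        "VALUES ({}, {}, {}, {}, {});\n"
    )
    return "".join(template.format(table, *row) for row in rows)
```

On the character encoding point, the KML header above declares UTF-8, so the text should be written out with the same encoding, for example `open(path, "w", encoding="utf-8")`.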

  36. [...] Cycle accidents By underyourownsteam A source of statistics [...]
