Studying a map of geo-located bicycle crashes might leave you with the impression that cycling is terribly dangerous. A bit of basic statistics helps at this stage. Whether something is regarded as dangerous ultimately depends on the underlying statistical population. This is a common concept in medicine, for example. In a drug’s package insert, the risk of suffering adverse effects is always related to a population. This helps the consumer (or the medical doctor) to draw informed conclusions. The risk, or incidence rate, expresses the probability that a drug causes harm. Something similar is still missing for cycling, at least at the local scale level.
Alberto Castro and colleagues recently published an extensive study on exposure-adjusted road fatality rates of pedestrians and cyclists. Compiling data from most OECD countries, this is the first systematic study of its kind and a huge step forward. However, the calculated fatality rates are based on highly aggregated statistics. In the authors’ own words, ‘exposure data was found to be generally poorer than for fatality data, as travel distances of active modes are not systematically collected in all countries’ (Castro et al. 2018, p. 8). Now, if the availability of exposure data is poor even at the national level, how should we interpret local crash data or so-called crash blackspots, a challenge planners and authorities face in cities on a daily basis?
In fact, the problem is that in most cases we do not know where, when, how many and which types of cyclists are on the road. Consequently, we lack the required exposure data. Moreover, we can only roughly estimate the demand for infrastructure and the respective capacity, and the effect of interventions remains opaque. Regarding the interpretation of crash data, the map below illustrates the problem well. The message of the mapped bicycle crashes seems obvious: on the road along the river, crashes are recorded every few meters (crash data were collected over a 10-year period). It looks as if this road is quite dangerous for cyclists. In contrast, no crashes at all are recorded on the parallel road. Is this the safe alternative? A quick look at the attached street views answers the question instantly. The road along the river is a highly frequented cycle way, whereas the parallel road is dedicated exclusively to motorized traffic (plus a narrow sidewalk). Because hardly any cyclists ride there, no crashes occur. The probability of being involved in a crash is, at least in part, a function of traffic volume.
In order to overcome the limitation of missing exposure data, we have been working on an agent-based bicycle flow model with a very high spatial and temporal resolution in the collaborative research project FamoS. This week, I’m going to present results from this research at the International Cycling Safety Congress in Barcelona.
The building blocks of our agent-based simulation model are single trips. We expect flow patterns to emerge from individual mobility behavior, which is determined by multiple parameters. With this approach, it is possible to account for the heterogeneity of cyclists. Unlike motorized mobility, where the machine levels out different capabilities (elderly people are able to drive at the same speed as youngsters), the variety of behavioral variables and riding styles among cyclists is huge. A 4-year-old girl on a bicycle has little in common with a bike courier, just to give an example.
In our model, we can control for different socio-demographic variables, such as age, education or employment status. Additionally, we differentiate between trip purposes, trip lengths and the accessibility of destinations. After initialization, we simulate the agents’ activities, schedules, destinations, mode choice and route choice. Of course, such a model requires a lot of data. In the case of a model we developed for the Salzburg central region, we used:
- Topologically correct road graph with a rich set of attributes
- Census data with a spatial resolution of 100 and 250 meters from Statistik Austria
- Central facilities and POIs from OGD portals
- Raw data from mobility surveys
- Time use statistics from representative surveys
The bicycle flow model was developed from scratch by Dana. She did an excellent job translating the conceptual model into code. The simulation model runs on the GAMA platform and will soon be freely available at OpenABM. Thanks to a very efficient code structure, we are able to initialize and run the simulation model for a 24-hour day, with 150,000 agents, a temporal increment of one second and a spatial resolution of one meter, within only five hours. This fast runtime makes the model well suited for sensitivity analysis and for simulating various interventions, such as additional connections or changes in traveler behavior.
The model was calibrated with data from six stationary counters. For model validation, we used tracking data from Bike Citizens. Overall, we achieved very accurate results. The temporal distribution of bike rides closely matches the double-peak signature in the reference data. The simulated spatial pattern of bicycle flows has a slight bias towards the left side of the Salzach river, which we expect to be associated with known biases in the input data. In the validation, we also compared the characteristics of simulated and recorded trips. Whereas the mean travel time is much higher in the Bike Citizens data (mainly due to a few outliers generated by long-distance leisure cyclists), the average distance and the speed distribution match very well.
To the best of my knowledge, this is the first bicycle flow model at the local scale level with regional coverage. We use the results for multiple purposes. For example, we could simulate the expected effect of planned bicycle corridors in the city of Salzburg. In the context of safety, the model is well suited to generating exposure data for risk analysis. Referring to the example above, the simulated bicycle flows can be used to calculate incidence rates and subsequently assess the safety of areas and road segments (we published a paper in Safety on this topic). In the particular case presented above, it becomes evident that the road along the river is not particularly dangerous: the number of recorded crashes is what can be expected given the number of cyclists. However, it is beyond discussion that every bicycle crash is one too many. Providing adequate infrastructure is crucially important for attracting cyclists and ensuring safe rides. Here again, the simulation model helps to estimate the demand and derive the required capacities for dedicated infrastructure.
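To make the idea of an incidence rate concrete, here is a minimal sketch in Python. It is not the code of our model or of the paper: the segment names, crash counts and trip numbers are invented for illustration, and the function names are hypothetical. It relates recorded crashes to simulated trips and compares the recorded number with what would be expected if a segment were exactly as risky as the network average.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    crashes: int    # recorded crashes over the observation period
    trips: float    # simulated bicycle trips over the same period (exposure)


def incidence_rate(crashes, trips, per=1_000_000):
    """Crashes per `per` trips; None if there is no exposure to relate to."""
    if trips <= 0:
        return None
    return crashes * per / trips


def expected_crashes(trips, reference_rate, per=1_000_000):
    """Crashes expected on a segment if it were as risky as the reference rate."""
    return reference_rate * trips / per


# Hypothetical numbers, for illustration only (not results from the model).
segments = [
    Segment("river cycle way", crashes=40, trips=20_000_000),
    Segment("parallel road", crashes=0, trips=50_000),
    Segment("peripheral road", crashes=6, trips=500_000),
]

# Network-wide rate used as the reference for "expected" crashes.
network_rate = incidence_rate(
    sum(s.crashes for s in segments), sum(s.trips for s in segments)
)

for s in segments:
    rate = incidence_rate(s.crashes, s.trips)
    expected = expected_crashes(s.trips, network_rate)
    print(f"{s.name}: {rate:.1f} crashes per million trips, "
          f"{s.crashes} recorded vs {expected:.1f} expected")
```

With these made-up numbers, the busy cycle way has by far the most crashes but a low rate, while the quiet peripheral road has few crashes but more than its exposure would suggest.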
With the agent-based simulation model, we have made a step forward in providing sound evidence to decision makers and bicycle advocates. Nevertheless, it is still a model, and thus it does not mirror reality but provides a generalized representation of it. In order to further refine the model, we are currently improving the input data basis. In the research project Bicycle Observatory, we combine quantitative and qualitative data from different sources to get an integrated perspective on bicycle mobility. This will help us to include even more parameters in the model and hence provide a more accurate representation in the spatial, temporal and behavioral dimensions.
Earlier this year we published a very detailed spatial (and temporal) analysis of bicycle crash data from Salzburg (Austria) in Transport Geography. In this paper we demonstrated the additional benefit of an explicit spatial perspective on crash reports. However, one of the major objections was that meaningful conclusions from such an analysis can only be drawn when an exposure variable is introduced. This objection stems from the well-established methodology of risk calculation in bicycle safety analysis (the quality of commonly used exposure variables is a whole different story, as I’ve shown in an earlier post).
Because of the lack of sound exposure variables on the local scale – this is the scale I’m especially interested in – most bicycle risk analyses are done on a highly aggregated level. Last year we were, at least partly, successful in overcoming this shortcoming. With an agent-based simulation model (Wallentin & Loidl 2015) we estimated the traffic flow for every road segment in an urban road network. This model now allows us to take the final step: bicycle risk estimation on the local scale.
Theoretically, we are able to calculate incidence rates (commonly used synonymously with “risk”) for each and every road segment. However, thank God, bicycle crashes are relatively rare, and officially reported ones are even rarer. Consequently, the statistical robustness of the calculated incidence rates is weak, leading to analysis results that are potentially biased by random effects. Thus, we defined and investigated different spatial reference units, which served as spatial aggregation levels.
Whenever point incidents are spatially analyzed, two well-known and still challenging phenomena need to be dealt with: spatial heterogeneity and the modifiable areal unit problem (MAUP).
Although the geographic literature on these two issues is extensive, they are hardly ever considered in (bicycle) crash analyses. We therefore regard our paper not only as a presentation of our analysis results, but also as an example of how to adequately deal with geo-located data.
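The MAUP is easiest to see with a toy example. The following Python sketch is not the method from the paper; it uses four invented locations and simple square cells (the cell sizes and all numbers are assumptions for illustration) to show how the same crashes and the same exposure yield different rates depending on the size of the reference unit.

```python
from collections import defaultdict


def aggregate(points, cell_size):
    """Sum crashes and trips per square cell with the given edge length."""
    cells = defaultdict(lambda: [0, 0.0])
    for x, y, crashes, trips in points:
        key = (int(x // cell_size), int(y // cell_size))
        cells[key][0] += crashes
        cells[key][1] += trips
    return dict(cells)


def rates(cells, per=1000):
    """Crashes per `per` trips for every cell that has any exposure."""
    return {key: c * per / t for key, (c, t) in cells.items() if t > 0}


# Hypothetical locations: (x in m, y in m, crashes, simulated trips).
points = [
    (50, 50, 4, 8000),
    (150, 50, 1, 200),
    (50, 150, 0, 1000),
    (150, 150, 0, 800),
]

fine = rates(aggregate(points, cell_size=100))    # four 100 m cells
coarse = rates(aggregate(points, cell_size=200))  # one 200 m cell

print("100 m cells:", fine)
print("200 m cell: ", coarse)
```

In the fine grid, one cell stands out with a high rate that rests on a single crash and very little exposure; in the coarse grid, this signal disappears in the average. Neither picture is "the" correct one, which is why it makes sense to report results for several aggregation levels together with a robustness measure.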
Here is the abstract of the paper (full text), which was published in a special issue of the OA journal “Safety”:
Currently, mainly aggregated statistics are used for bicycle crash risk calculations. Thus, the understanding of spatial patterns at local scale levels remains vague. Using an agent-based flow model and a bicycle crash database covering 10 continuous years of observation allows us to calculate and map the crash risk on various spatial scales for the city of Salzburg (Austria). In doing so, we directly account for the spatial heterogeneity of crash occurrences. Additionally, we provide a measure for the statistical robustness on the level of single reference units and consider modifiable areal unit problem (MAUP) effects in our analysis. This study is the first of its kind. The results facilitate a better understanding of spatial patterns of bicycle crash rates on the local scale. This is especially important for cities that strive to improve the safety situation for bicyclists in order to address prevailing safety concerns that keep people from using the bicycle as a utilitarian mode of (urban) transport.
With this analysis we have successfully demonstrated that mapping bicycle risk patterns on the local scale reveals relevant information for policy makers and authorities that aggregated approaches would not have been able to uncover. To our current knowledge, this is the first study that calculates crash rates on the local scale. However, with the increasing amount of available data and improved (spatial) models, we are quite sure that many more analyses like this one will follow, for the good of bicyclists and as building blocks for evidence-based safety strategies.
As the number of geographers dealing with bicycle safety and crash analysis is rather small, I’d be more than happy to hear from you. If you have any questions, ideas for further studies, data or just a comment, feel free to leave a note below, connect on Twitter or get in touch with me via the contact form.
The official name of our department at the University of Salzburg is a bit cumbersome: Interfaculty Department of Geoinformatics. There is an administrative reason for this (for details, have a look at our website). Far more important, however, is the philosophy behind the prefix interfaculty. It means that GIS is regarded as a cross-sectional toolset and mindset.
If you’re interested in one of the many outcomes of such interdisciplinary work, you might join one of our workshop sessions (“Spatial perspective on transportation modelling”) at the GI-Forum conference in Salzburg next week. It’s organized by my colleague Gudrun Wallentin and me. Gudrun is an ecologist by training and an expert in spatial simulation. An ecologist working together with a geographer on transportation issues – can there be any relevant outcome? Well, I’d like to give you an example from my current PhD research …
On several occasions I’ve already pointed to the benefits of a geospatial analysis of bicycle accidents. Knowing where (and of course when) accidents happen is crucial information for targeted countermeasures. As long as the analysis focuses exclusively on accident frequency, you are fine with geocoded accident reports (apart from data quality issues and severe underreporting). But when it comes to risk calculation, it becomes tricky. Here are two examples of how risk calculations are commonly done:
1) Accidents per inhabitant per census district.
This might be a valid approach if large areas were compared with each other, but at the city level it produces useless results. Have a look at the map of Salzburg below. On the left side, the number of accidents per inhabitant is calculated for each census district. High risks are indicated at the periphery although the absolute number of accidents there is comparatively low. This is of course because relatively few people live in these areas (the aerial image of the city gives you a good overview), while they are frequently traversed by commuters and leisure bicyclists.
2) Accidents per distance travelled per census district.
Yiannakoulias et al. (2012), for example, use this approach. They estimate the total distance travelled from commuting data extracted from the Canadian census. While the presented results look reasonable, they don’t allow for downscaling to the street level. Apart from this, data availability is not always that good. Consequently, the total distance travelled, regardless of the scale of the reference units, is subject to numerous assumptions and estimations.
There are some more approaches which pop up from time to time (recently a reviewer suggested that I relate bicycle accidents to LULC data …), but in the end we always face the problem that we don’t have a clue how many cyclists are actually on the road. For motorized traffic, sophisticated traffic flow models exist. They are based on huge amounts of data from an extensive network of counting stations and on-board sensors.
With an equivalent for bicycle traffic, sound risk calculations for each road segment and for different points in time would be possible. Currently, two major drawbacks (at least in most cities) make it impossible to simply transfer models for motorized individual traffic (MIT) to bicycle traffic: there is no obligation to register bicycles (thus we don’t know the statistical population) and there are very few counting stations. The latter issue is partly addressed by VGI data, such as data from the fitness app Strava (see Griffin & Jiao (2015)). But these data are representative neither of the whole traffic (the focus lies on leisure trips) nor of the whole population (the app is used by a non-representative fraction of bicyclists).
Discussing these issues with Gudrun (over several interfaculty cups of coffee) brought us to the idea of testing the applicability of agent-based models (ABM) for simulating bicycle traffic flows in an urban network. Using ABM in the transportation modelling domain is a real “minority program”. There is a very inspiring overview paper by Bazzan & Klügl (2014), but apart from this, very little literature exists. To my current knowledge, ABM has never been used for the simulation of bicycle traffic flows.
At the above-mentioned workshop session at the GI-Forum conference, Gudrun and I are going to present the results of our first try (pre-print of the paper). To be honest, I didn’t expect such nice results. While there are several issues which need to be improved, the results definitely motivate us to work further on this topic. And, of course, to use the simulated bicycle traffic flows for risk calculations.
As a result of the ABM, we have simulated bicycle flows for every road segment and for every point in time. This allows for a risk calculation (or to be more precise: a risk estimation) on the most detailed scale level. For the risk estimation, the reported bicycle accidents for the years 2002-2011 from the city of Salzburg (Austria) are used. The bicycle traffic flow (number of trips per segment) is the averaged sum for one year. The following analyses use a regular hexagonal grid as reference unit. Alternatively, the single road segments could have been used.
In the left map, the total number of accidents within the 10 years of observation is related to the reference units. Accident hot-spots along the main bicycle corridor along the Salzach river become obvious. Relating the accident occurrences to the total network length (center) offers little additional explanation. Spatial clusters of bicycle accident occurrences emerge along the most frequented roads. Both maps indicate hot-spots in the city center, which could lead to the misinterpretation that these are dangerous places for bicyclists.
Only the right map reveals dangerous places, that is, segments with a high risk. Compared to the other two maps, the image flips: the risk along the Salzach river is much lower than in the periphery. Risk hot-spots emerge where the quality of the bicycle infrastructure is comparatively low and the MIT volume is high.
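The difference between the three maps boils down to three different measures per reference unit. The following Python sketch computes them for two invented reference units; all numbers and names are assumptions for illustration and not values from our analysis. The exposure is the averaged annual number of trips multiplied by the 10 years of observation.

```python
YEARS = 10  # observation period of the accident data (2002-2011)

# Hypothetical reference units, for illustration only.
units = {
    "centre": {"crashes": 60, "network_km": 3.0, "annual_trips": 2_000_000},
    "periphery": {"crashes": 5, "network_km": 2.0, "annual_trips": 20_000},
}


def measures(unit, years=YEARS, per=1_000_000):
    """Return accident count, accidents per network km and accidents per `per` trips."""
    exposure = unit["annual_trips"] * years  # trips over the whole period
    return {
        "count": unit["crashes"],
        "per_km": unit["crashes"] / unit["network_km"],
        "risk": unit["crashes"] * per / exposure if exposure > 0 else None,
    }


for name, unit in units.items():
    print(name, measures(unit))
```

With these made-up numbers, the centre leads in both the accident count and the accidents per kilometre, but the periphery has the higher risk once the accidents are related to the number of trips.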
From this simple example, several conclusions about accident risk for bicyclists can be drawn. But the point I want to make here is how useful ABM is in this context. It helps to gain a rough idea of the spatial and temporal distribution of bicycle flows, and it tackles the constant problem of data (and information) shortage. Compared to aggregated statistics, ABM allows for analysis on a much more detailed level. Once risk estimates are available, further analyses and reasoning become possible. For example, the correlation between infrastructure and risk can be investigated. Or the question of to what degree the number of bicyclists on the road increases (or decreases?) the overall risk can be answered. As you see, there is lots of work ahead!
If you have any ideas on how the combination of ABM and GIS can be further used in the transportation domain, if you have suggestions for improvements or if you are an expert in any related domain who wants to discuss over some more interfaculty cups of coffee – please feel free to use the comment or contact function! And if you are at the GI-Forum conference anyway, join us on Wednesday, 8th July, at 1pm in room 413 (first floor). I’m looking forward to inspiring conversations!
First, an interfaculty department is great – it brings together an ecologist and a geographer to work on transportation modelling.
Second, ABM helps to simulate bicycle traffic flows, which can serve as input for risk estimations.
Currently, a lively discussion is going on about the threat that heavy goods vehicles (HGVs) pose to bicycle safety. A major reason for this is a series of fatal accidents in London.
In the wake of this discussion, the European Cyclists’ Federation (ECF) published a blog post by the organization’s Safety Policy Officer, Ceri Woolsgrove. He investigated the ratio between distance travelled and the number of fatal accidents within London for HGVs and buses respectively. The results, although based on “lukewarm data”, interestingly show that HGVs are at least three times more often involved in fatal bicycle accidents than buses. The following points have become (once again) important to me:
- Data availability is critical! Political and technical decisions – such as a general ban of HGVs in city centers, safer road design or technical standards and safety regulations for vehicles – are being made on a rather weak data basis. This looks pretty bad from a scientific point of view, but apart from this, it’s about human lives!
- Safety concerns are valid. A major reason why people do not use their bicycle for utilitarian trips is a widespread fear of being involved in an accident. If cities want to increase the bicycle’s share of the modal split – and there are numerous good reasons for it! – a safe environment is a key factor.
- Multidisciplinary collaboration is required. I’m totally convinced that enough technology and methodology has been developed in various fields which could and should be utilized for a better understanding of bicycle safety and for better decisions in this context. In a recent book chapter I’ve provided a collection of spatial applications which could help to improve bicycle safety (pre-print can be found here). I’m sure designers, statisticians, programmers, psychologists and many more who are not at the “natural” core of mobility research have a lot to say and should actively contribute to integrated safety strategies.
Apart from these general considerations and conclusions, I became curious about the situation in the town I live in. In a previous blog post I’ve already presented some details from an accident database from Salzburg (Austria), which comprises 3,048 geo-located police reports. Here is how the situation actually was in Salzburg during the last decade (data from 2002-2011):
Although this statistical analysis suffers from several limitations (e.g. no statistical population, sample size for subgroups etc.), interesting conclusions can be drawn from this overview.
First, a severe under-reporting becomes obvious: primarily accidents with material damage and/or physical injury are reported. Consequently the injury category “uninjured”, for example, does not contain single bike crashes (SBC).
Second, by far the most accidents occur on residential roads where bicycle infrastructure (separated bicycle ways or at least on-road bicycle lanes) is largely missing.
Third, the number of fatal bicycle accidents is fortunately comparatively low. Hence, the sample size is too small to conclude that HGVs are significantly more dangerous than other vehicles. Nevertheless, the relative share of large HGVs is around 37%.
Fourth, we must not forget the bicyclist as a risk factor. This has nothing to do with victim blaming! But single bike crashes account for the second highest number of accidents in which the bicyclist is injured.
There might be plenty more implications, and without any doubt there is a lot more to do for better, that is, among other things, safer road environments. But analyses such as the one presented by Ceri Woolsgrove are an important starting point for evidence-based decisions and targeted actions.
Any comments, questions, additional insights or specific experiences? I’m looking forward to your feedback!
Several international, national and even local initiatives aim to reduce the number of road accidents. This ambition is prominently supported by the United Nations, which proclaimed the “UN decade of action for road safety 2011-2020”. In its annual report on road safety, dedicated to supporting the UN decade of action, the World Health Organization (WHO) claims:
"Policies to encourage walking and cycling need additional criteria to ensure the safety of these road users. [...] Promoting city cycling to reduce congestion cannot be encouraged if cyclists repeatedly find that their lanes cut across oncoming traffic." (p.30)
In order to follow this recommendation, all bicycle promotion strategies need to ensure high-performance and, above all, safe infrastructure as well as user-tailored information about safe bicycle connections. For both infrastructure and information, a sound data basis is needed. Geographic information systems (GIS) can serve as powerful platforms for consolidating and compiling digital data about the road network and the whole road space. This data basis can then be used for advanced modelling and analysis purposes.
The starting point for any initiative dedicated to improving bicycle safety is to assess the road network’s quality in terms of potential risks for bicyclists. Based on this status-quo analysis, the existing infrastructure can be improved where it’s most needed and bicyclists can be informed about safe(r) routes.
There are at least three approaches to quality assessment (expert evaluation, analysis of accident locations, user feedback), but each of them has several drawbacks – I’ll come back to this in another post. In contrast to these approaches, we have developed an assessment model that makes use of geospatial modelling power. Conceptually, this model is quite simple, as can be seen in the figure on the left. The basic idea is to identify those “indicators” which contribute to the potential safety risk for bicyclists, such as the presence and design of bicycle infrastructure or the motorized traffic load. These parameters are then weighted and compiled in a GIS model.
For the identification of the indicators, empirical studies are reviewed, experts and users are interviewed and accident reports are systematically analyzed. These sources also serve as a proxy for the impact of every single indicator on the overall risk, expressed as a weight in the model. Depending on the environment (urban, rural), data availability or users’ preferences, these weights can be easily adjusted. Finally, all indicators with their respective weights are compiled in the indicator-based assessment model, which can be applied to any road network. It calculates a dimensionless index value which expresses the suitability of every single road segment for bicyclists. Low values indicate a low safety risk and vice versa. Due to the linear design of the model, indicators can be added or removed without affecting the model’s performance. Generally, the more (non-redundant) indicators are used, the better the explanatory power of the model.
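As a rough illustration of what such a linear, weighted model can look like, here is a minimal Python sketch. It is not the implementation of our assessment model: the indicator names, the scores in the range 0 to 1, the weights and the normalization by the sum of weights are all assumptions made for this example.

```python
def risk_index(indicators, weights):
    """Weighted mean of indicator scores in [0, 1]; low values mean low risk."""
    missing = set(weights) - set(indicators)
    if missing:
        raise ValueError(f"missing indicator scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[name] * indicators[name] for name in weights) / total_weight


# Hypothetical weights; adjusting them adapts the model to another context.
weights = {"bicycle_infrastructure": 0.5, "motorized_traffic_load": 0.5}

# Hypothetical segments with scores where 0 is favourable and 1 is unfavourable.
segment_a = {"bicycle_infrastructure": 0.0, "motorized_traffic_load": 0.25}
segment_b = {"bicycle_infrastructure": 1.0, "motorized_traffic_load": 0.75}

print(risk_index(segment_a, weights))  # separated cycle way, little traffic
print(risk_index(segment_b, weights))  # no infrastructure, heavy traffic
```

Because the index is a plain weighted sum, adding or removing an indicator only means adding or removing one entry in the weights, and every result can be traced back to the individual scores and weights.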
The indicator-based assessment model can be applied in any GIS for the calculation of the index value on a road segment level. The computed result is then evaluated by experts and users. If necessary, the model can be iteratively adapted either on the level of the indicators or the weights.
Compared to alternative assessment routines, the advantages of the GIS-based modelling approach can be briefly summarized as follows:
- Transparency: the results of the assessment procedure can be traced back to the building blocks of the model; all parameters and weights are accessible – there’s no black box or subjective component (as e.g. in expert evaluations).
- Comparability: the model is the same for the whole road network; thus the results, even on a segment level, can be easily compared.
- Adaptability: due to the linear model design and the implementation of weights, the model can be adapted to any environment, data availability or user preferences. It is transferable and geographically scalable.
- Reproducibility: Once the model is compiled it can be integrated in automatic assessment workflows. This allows for short update intervals and employment in simulation routines.
If you wonder what this modelling approach can be used for “in the real world of bicyclists”, have a look at a really nice web application: www.radlkarte.eu. This routing platform is actually based on the described model. It’s quite innovative for at least two reasons. First, it is designed exclusively for bicyclists and has never been a car navigation system … The calculation of safe routes (for legal reasons they are called “empfohlene Route” = recommended route) is a big deal, especially for kids, elderly people or families. Second, it successfully shows the applicability of a rather sophisticated GIS workflow: different data sets with different data models are combined (OpenStreetMap and authoritative data from the city administration), the routing works across a national border (Austria and Germany) and the architecture of the system allows for further adaptations (e.g. it is planned to implement personalized routing information).
For the risk assessment of a whole road network, specifically for bicyclists, GIS facilitates pretty innovative solutions!