Last week I outlined the basic idea behind the indicator-based assessment model for road networks. I hope it became clear how this spatial modelling approach opens up new and more efficient ways for the global assessment of road networks, not only in the context of bicycle safety.
A valid objection in this context could be that the effort for setting up such a model is disproportionately high. This might be true at first glance. But once the model is established, there’s no more efficient way to assess the quality of road networks, regardless of their size. Besides, a closer look at commonly employed assessment methods reveals their respective practical and conceptual shortcomings. Let me show this by briefly discussing three assessment approaches that are used in connection with bicycle safety.
1) Analysis of accident locations
To my current knowledge, this approach is the one most often used to determine the safety risk for bicyclists at the level of road segments. The concept is quite simple: many bicycle accidents within a certain distance interval indicate dangerous road segments. This sounds convincing, but it is actually based on a conceptual fallacy. The number of accidents alone doesn’t tell me anything except that accidents happened. One cannot conclude that road segments with many accidents are unsafe or, vice versa, that segments with few accidents are safe. The following example tries to make this clear:
On a cycle way with a high load of bicycles, a certain number of accidents is reported within a given timeframe. A primary road with a high motorized traffic load and without any bicycle facility runs parallel to this cycle way. This road is avoided by most bicyclists (= low bicycle load). On this road only a few bicycle accidents are reported (= low number of accidents). A quality assessment which is exclusively based on reported accident locations would falsely rank the primary road higher than the cycle way.
Any assessment result which is exclusively based on reported accident locations is biased for at least two reasons: First, without a sound exposure variable (number of bicyclists per segment, distance travelled etc.) basically nothing can be said about the risk. Second, only a very small fraction of bicycle accidents is reported. This reporting bias directly affects the quality of any analysis based on official accident data.
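To make the exposure argument tangible, here is a minimal Python sketch of the cycle way example. All numbers (accident counts, bicyclists per day, segment length) are invented for illustration, and the function name is my own; the point is only that relating accidents to distance travelled can reverse the ranking that raw counts suggest.

```python
def accidents_per_million_km(accidents, cyclists_per_day, length_km, days=365):
    """Accident rate relative to exposure (bicycle kilometres travelled)."""
    exposure_km = cyclists_per_day * length_km * days
    return accidents / exposure_km * 1_000_000


# Hypothetical figures for two parallel 1 km segments over one year.
cycle_way_accidents = 12      # many accidents, but also many bicyclists
primary_road_accidents = 3    # few accidents, because most bicyclists avoid it

cycle_way_rate = accidents_per_million_km(
    cycle_way_accidents, cyclists_per_day=4000, length_km=1.0
)
primary_road_rate = accidents_per_million_km(
    primary_road_accidents, cyclists_per_day=100, length_km=1.0
)

# Raw counts make the primary road look safer ...
print(cycle_way_accidents > primary_road_accidents)
# ... while the exposure-based rate points the other way.
print(primary_road_rate > cycle_way_rate)
```

Note that this sketch does nothing about the second problem, under-reporting: the accident counts themselves would still be incomplete.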
The analysis of collected accident reports is valuable (if not necessary) for many reasons. But building any assessment approach on the location (or density) of accidents cannot lead to valid results.
2) Expert assessment
If you have ever talked to a cycling advocate, a road engineer or a police officer, you might have learned that their experience and practical knowledge are of enormous value when it comes to the quality assessment of road networks. Sometimes such experts are responsible for a status-quo analysis of road networks in terms of bicycle safety. If the size of the road network under consideration exceeds a few roads, this approach turns out to be a real Sisyphean task! Every time a road is physically modified (which happens non-stop in a city!) the expert needs to re-assess the respective road. Apart from this tremendous effort, the expert assessment has two additional weak points:
- Most of the time experts use ordinal rankings or, even “worse”, qualitative, verbal descriptions for their assessment. Such an approach inevitably leads to fuzzy classifications and consequently only vague results. These results are hardly usable in further spatial analysis.
- An assessment system which relies exclusively on experts’ judgements is hard to standardize. This means that the results are hardly comparable across different time intervals and geographical regions.
As already mentioned in my last post, experts play a central role in any road network assessment approach. Incorporating their extensive knowledge in adaptable models which can be run globally as often as necessary is by far more efficient than assessing every road individually.
3) User feedback
Using collective user feedback, which can be explicit (e.g. a feedback app) or implicit (e.g. Twitter messages), for quality assessment seems to be quite a promising approach. Nevertheless, there are still several concerns which make this approach unfeasible for a global road network assessment:
- Feedback is generally not equally distributed over space. For central areas in cities the number of feedback messages might be sufficient, but for a global application the sample size is too small.
- Voluntary user feedback is biased in several ways. The sample is not representative of the whole population (“tech-savvy young male”) and thus can hardly be used for general assessment purposes.
- In order to turn verbal, qualitative feedback into reliable conclusions, sophisticated (semantic) algorithms are required.
User feedback is very helpful for the validation and calibration of assessment models. Furthermore, it can help to improve infrastructure very efficiently (open administration). But a global quality assessment that relies entirely on user feedback is not reliable and thus hardly applicable.
I hope the benefits of the indicator-based assessment model which I explained in my last post are clear(er) now. None of the methods mentioned above is wrong as such. In fact, they are useful for many purposes, for example as initial model parameters or for validation and calibration of the model. But for a global assessment of road networks with a focus on bicycle safety they are simply not viable.
Have I missed anything? Or do you have experiences with one of the outlined methods? Please feel free to leave a comment, I’d be happy to read your opinion!
Several international, national and even local initiatives aim to reduce the number of road accidents. This ambition is prominently supported by the United Nations which proclaimed the “UN decade of action for road safety 2011-2020”. In its annual report on road safety – dedicated to support the UN decade of action – the World Health Organization (WHO) claims:
"Policies to encourage walking and cycling need additional criteria to ensure the safety of these road users. [...] Promoting city cycling to reduce congestion cannot be encouraged if cyclists repeatedly find that their lanes cut across oncoming traffic." (p.30)
In order to follow this recommendation, all bicycle promotion strategies need to ensure high-performance – but above all – safe infrastructure and user-tailored information about safe bicycle connections. For both, infrastructure and information, a sound data basis is needed. Geographic information systems (GIS) can serve as powerful platforms for consolidating and compiling digital data about the road network and the whole road space. This data basis can then be used for advanced modelling and analysis purposes.
The starting point for any initiative dedicated to improving bicycle safety is to assess the road network’s quality in terms of potential risks for bicyclists. Based on this status-quo analysis the existing infrastructure can be improved where it’s most needed and bicyclists can be informed about safe(r) routes.
There are at least three approaches for quality assessment (expert evaluation, analysis of accident locations, user feedback), but each of them has several drawbacks. I’ll come back to this in another post. In contrast to these approaches, we have developed an assessment model where we make use of geospatial modelling power. Conceptually this model is quite simple, as can be seen in the figure on the left. The basic idea is to identify those “indicators” which contribute to the potential safety risk for bicyclists, such as presence and design of bicycle infrastructure or motorized traffic load. These parameters are then weighted and compiled in a GIS model.
For the identification of the indicators, empirical studies are reviewed, experts and users are interviewed and accident reports are systematically analyzed. These sources also serve as a proxy for the impact of every single indicator on the overall risk, expressed as a weight in the model. Depending on the environment (urban, rural), data availability or users’ preferences, these weights can be easily adjusted. Finally, all indicators with their respective weights are compiled in the indicator-based assessment model, which can be applied to any road network. It calculates a dimensionless index value which expresses the suitability of every single road segment for bicyclists. Low values indicate a low safety risk and vice versa. Due to the linear design of the model, indicators can be added or removed without affecting the model’s performance. Generally speaking, the more (non-redundant) indicators are used, the better the explanatory power of the model.
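To give an idea of what such a linear, weighted model could look like, here is a minimal Python sketch. It is not the actual implementation behind our model: the normalization of indicators to a 0 to 1 scale, the division by the sum of weights, the indicator `speed_limit` and all weights and values are assumptions made purely for illustration.

```python
def risk_index(indicators, weights):
    """Weighted linear combination of normalized indicator values.

    indicators: dict of indicator name -> value in [0, 1] (1 = high risk)
    weights:    dict of indicator name -> relative weight (> 0)
    Dividing by the sum of weights keeps the index in [0, 1], so
    indicators can be added or removed without rescaling the result.
    """
    total_weight = sum(weights.values())
    return sum(weights[name] * indicators[name] for name in weights) / total_weight


# Hypothetical weights; in practice they would come from studies,
# expert interviews and accident reports.
weights = {
    "bicycle_infrastructure": 0.4,   # 0 = separated cycle way, 1 = none
    "motorized_traffic_load": 0.4,   # 0 = no cars, 1 = very high load
    "speed_limit": 0.2,              # hypothetical additional indicator
}

# Two invented road segments with already normalized indicator values.
cycle_way = {"bicycle_infrastructure": 0.0, "motorized_traffic_load": 0.1, "speed_limit": 0.2}
primary_road = {"bicycle_infrastructure": 1.0, "motorized_traffic_load": 0.9, "speed_limit": 0.8}

cycle_way_index = risk_index(cycle_way, weights)
primary_road_index = risk_index(primary_road, weights)

# Low values indicate a low safety risk.
print(cycle_way_index < primary_road_index)
```

Adapting the model to a rural environment or to a data set without speed limits would then only mean changing or dropping entries in `weights`; the function itself stays untouched.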
The indicator-based assessment model can be applied in any GIS for the calculation of the index value on a road segment level. The computed result is then evaluated by experts and users. If necessary, the model can be iteratively adapted either on the level of the indicators or the weights.
Compared to alternative assessment routines, the advantages of the GIS-based modelling approach can be briefly summarized as follows:
- Transparency: the results of the assessment procedure can be traced back to the building blocks of the model; all parameters and weights are accessible – there’s no black box or subjective component (as e.g. in expert evaluations).
- Comparability: the model is the same for the whole road network; thus the results, even on a segment level, can be easily compared.
- Adaptability: due to the linear model design and the implementation of weights, the model can be adapted to any environment, data availability or user preferences. It is transferable and geographically scalable.
- Reproducibility: Once the model is compiled it can be integrated in automatic assessment workflows. This allows for short update intervals and employment in simulation routines.
If you wonder what this modelling approach can be used for “in the real world of bicyclists”, have a look at a really nice web application: www.radlkarte.eu . This routing platform is actually based on the described model. It’s quite innovative for at least two reasons. First, it is exclusively designed for bicyclists and has never been a car navigation system … The calculation of safe routes (for legal reasons they are called “empfohlene Route” = recommended routes) is a big deal, especially for kids, elderly people or families. Second, it successfully shows the applicability of a rather sophisticated GIS workflow: different data sets with different data models are combined (OpenStreetMap and authoritative data from the city administration), the routing works across a national border (Austria and Germany) and the architecture of the system allows for further adaptations (it is e.g. planned to implement personalized routing information).
For the risk assessment of a whole road network, specifically for bicyclists, GIS facilitates pretty innovative solutions!
While working on my conference paper for the AGIT symposium about using crowd sourced data for modeling and analysis purposes, I came across a brilliant blog post by Muki Haklay (see here ). His contribution basically points to the fact that all data are more or less biased (in terms of completeness and consistency) but data providers are not equally honest in admitting it. That’s nothing really new, and everyone with a professional GI background should be aware of this, but it was nicely condensed.
Interestingly, most studies dealing with the quality of crowd sourced data – in a GIS context most of the time OpenStreetMap data – use authoritative or commercial data sets as references. For some questions this approach is definitely useful; e.g. one can monitor the coverage of OSM data over time and see the project geographically expanding. But it has its limitations when one wants to deduce information about the quality of the data in terms of their spatial and especially attributive characteristics. Comparing a potentially biased data set with another potentially biased data set is definitely a tricky thing – even more when you draw global conclusions about the quality of the data.
In a project related to the routing platform www.radlkarte.eu I’ve used authoritative and crowd sourced road data sets for the same purpose: assessing the road network’s quality in terms of bicycle safety. In such a task the quality of the respective data sets becomes immediately obvious. Of course, OSM data partially suffer from attributive gaps, wrong classifications, simple mapping errors or heterogeneous attributes. But on the other hand they are, at least in my project area, spatially more accurate, more complete and above all more up to date.
Generally, the quality of authoritative (and commercial) road data tends to decrease in areas with low level roads or roads with limited access for motorized vehicles. In the context of bicycle traffic this is a major drawback because these are exactly the roads bicyclists prefer! Now, if you use authoritative or commercial data sets as reference in such a context you won’t necessarily be able to say anything about the quality of the crowd sourced data set! Determining the quality of data sets – no matter whether you have crowd sourced, commercial or authoritative data – heavily depends on the purpose the data sets are used for. Imagine you build a routing service for bicyclists on your data set. No bicyclist will use your service if a major link is missing. Perhaps the road is not traversable for cars and that’s why it is regarded as dispensable. But for bicyclists this connection is an important shortcut and might be the reason for using the bike instead of the car for their travel to work …
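The effect of a single missing link can be illustrated with a tiny routing sketch in Python. The network, the node names and the lengths are invented; the shortest-path search is a plain Dijkstra implementation and not the routing engine of any real service.

```python
import heapq


def shortest_distance(edges, start, goal):
    """Length of the shortest path in an undirected graph (Dijkstra)."""
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))

    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # outdated queue entry
        for neighbour, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")  # goal not reachable


# Hypothetical data set that only contains roads relevant for cars (km).
car_oriented = [("home", "junction", 2.0), ("junction", "work", 3.0)]
# The same network plus a path that is not traversable for cars.
with_shortcut = car_oriented + [("home", "work", 1.5)]

detour_km = shortest_distance(car_oriented, "home", "work")
shortcut_km = shortest_distance(with_shortcut, "home", "work")
print(detour_km, shortcut_km)
```

For a car-oriented purpose both data sets are equally good; for a bicycle routing service the first one is simply incomplete.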
A nice webpage for the comparison of authoritative and crowd sourced road data sets is www.basemap.at . It’s a web map tile service which is fed by authoritative data from Austria’s federal states and city administrations. The service claims to be up to date and most accurate. Well, this might be true for the high level road network. But “unfortunately” an OpenStreetMap rendering is provided as an alternative base map on the same webpage. And comparing these two base maps proves the provider’s claim to be wrong, at least when it comes to links which are essential for bicyclists. Here are two nice examples (screenshots from today):
Don’t get me wrong. The basemap.at project is to be appreciated. It’s a first and very important step towards opening the administrations’ data treasure to a wider audience. Nevertheless, the project (I guess involuntarily) helps to stimulate the discussion about data quality and reference data sets for quality assessment. Authoritative data sets are good, most often very good. But with regard to several special purposes, such as bicycle traffic, crowd sourced data sets might be even better.
A quality assessment of digital road network data sets can only be plausible when 1) the purpose of the data is defined and 2) potential biases in the reference data sets are considered.