Tagged: data

We need more data!

You have surely come across the claim that cycling is on the rise in cities all over Europe. However, if you look for the statistics behind it, you will be disappointed. Just try it and search the web for “modal split development” and “cycling”.

In their seminal paper “Data driven geography”, Miller and Goodchild state that “The context for geographic research has shifted from a data-scarce to a data-rich environment […]”. Given the huge amount of data generated by an unprecedented number of sensors, this observation is certainly true. Still, the story is a little different when it comes to the geography of cycling. Although the data volume is growing there as well – mainly due to the quantified-self movement and the vast number of fitness and tracking apps – we still cannot answer fundamental questions such as:

How many cyclists are on the road?

Where and when do they move through space?

How did the modal split develop over the past ten years?

The lack of adequate data and derived information is serious for a number of reasons:

  1. As long as the status quo of cycling cannot be described by valid data, the demand for supportive policies, cycling-friendly planning and funding does not become as obvious as it deserves to be. Cycling still receives comparatively little attention in public discourse, and I dare to argue that this invisibility is partly due to the absence of hard facts.
  2. Although there is still room for improvement, authorities do invest in cycling infrastructure and promotion. But in most cases they are unable to assess the effect of their interventions. Sometimes spot counts and surveys are carried out, but the systemic effects remain hidden in most cases. In 2015, Chris Rissel and colleagues provided a nice example of how local interventions show up in spot investigations but tend to have a rather low systemic effect.
  3. Without knowing when, where and which cyclists are on the road, it is hard to influence and manage cycling traffic efficiently. Even more important, as long as the reasons why people do not cycle, as well as local or temporal particularities, remain unknown, it is impossible to target these people and promote cycling among them. In other words, we desperately need qualitative survey data in addition to quantitative data, such as GPS trajectories, if we want to acquire an integrated picture of cycling mobility.

Interestingly, the situation has been anticipated at EU level for several years. In early 2017, Thérèse Steenberghen and colleagues published an extensive report on data availability for active modes. However, even at the country level, they diagnosed a lack of comparable data about walking and cycling, not to speak of the local level.

Yes indeed, we need more data. But before lots of data, which have minor relevance or do not contribute to answering the questions raised above, are acquired, fundamental issues need to be tackled:

Which kind of data do we need to get a holistic image of cycling mobility, to describe influential factors and to identify interdependencies between them?

Which data sources do already exist and how can additional data be efficiently acquired?

How can the availability and accessibility of data be increased in order to make them useable?

How can heterogeneous data be harmonized with regard to different data models, technical specifications and semantics?

What are efficient ways to establish monitoring systems in order to generate time series?

What are appropriate scale levels for data acquisition and analysis?

With regard to these issues, it becomes evident that we do not simply need more data, but more data that are relevant and, additionally, more data intelligence. We have therefore recently launched a 30-month research project called Bicycle Observatory, in which we aim to develop an integrated perspective on cycling mobility and to further differentiate between the very different preferences and behavior patterns among cyclists.
In order to achieve these research goals, we are currently evaluating existing data sources and will eventually complement them with additional ones. A special focus lies on qualitative data from social research. The idea is to connect them to quantitative data on the basis of a common geographic reference.
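
What "a common geographic reference" could mean in practice can be sketched in a few lines of Python. This is only an illustration and not the project’s actual method: the grid cell size, the function names `cell_of` and `join_by_cell` and the sample records are all assumptions made up for this example. GPS points and survey answers are assigned to the same grid cells, so that both kinds of data can be looked at together per cell.

```python
import math
from collections import defaultdict


def cell_of(lon, lat, cell_size=0.01):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(lon / cell_size), math.floor(lat / cell_size))


def join_by_cell(gps_points, survey_answers, cell_size=0.01):
    """Group GPS points and located survey answers by shared grid cell."""
    cells = defaultdict(lambda: {"gps_points": 0, "answers": []})
    for lon, lat in gps_points:
        cells[cell_of(lon, lat, cell_size)]["gps_points"] += 1
    for lon, lat, answer in survey_answers:
        cells[cell_of(lon, lat, cell_size)]["answers"].append(answer)
    return dict(cells)


# Hypothetical sample data (invented coordinates and answers).
gps_points = [(13.045, 47.801), (13.046, 47.802), (13.062, 47.811)]
survey_answers = [(13.044, 47.803, "feels unsafe at this junction")]

for cell, content in join_by_cell(gps_points, survey_answers).items():
    print(cell, content)
```

A regular grid is just the simplest common reference; road segments or census districts would serve the same purpose.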

Although the consortium covers a broad range of competencies and the partners bring in extensive networks, we are more than open to collaborations. Please drop me a line if you are interested in sharing your ideas, data, questions or examples!


OSMaxx: the easy access to OSM data

OpenStreetMap is much more than a free map of the world. It’s a huge geo-database, which is still growing and improving in quality. OpenStreetMap is a great project in many respects!
But because it is a community project to which basically everyone can contribute, it has some particularities that are rather uncommon in authoritative data sets. There, data are generated according to a predefined data standard. Thus, (in an ideal world) the data are consistent in terms of attribute structure and values. In contrast, attribute data in OpenStreetMap can exhibit a certain degree of (semantic) heterogeneity, misclassifications and errors. The OSM wiki helps a lot, but it is not binding.
Another particularity of OpenStreetMap is the data model. Coming from a GIS background, I was taught to represent spatial networks as a (planar) graph with edges and nodes. In the case of transportation networks, junctions are commonly represented by nodes and the segments between them by edges. OpenStreetMap is not designed this way. Without going into details, the effect of OSM’s data model is that ways are not necessarily split into separate segments at junctions. This doesn’t matter for mapping, but it does for network analysis such as routing!
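
To make the difference tangible, here is a minimal Python sketch of how ways could be cut into graph edges. It is not the routine used by any particular tool. It assumes that a way is simply a list of node ids and treats every node id that is used more than once as a junction; the function name `split_ways_at_junctions` and the sample ids are invented for this example.

```python
from collections import Counter


def split_ways_at_junctions(ways):
    """Cut ways (lists of node ids) into edges at nodes used more than once."""
    # A node id that occurs more than once is shared between ways
    # (or visited twice by a loop) and is treated as a junction here.
    usage = Counter(node for way in ways for node in way)
    edges = []
    for way in ways:
        if len(way) < 2:
            continue  # a single node cannot form an edge
        segment = [way[0]]
        for i in range(1, len(way)):
            node = way[i]
            segment.append(node)
            # Close the current edge at a junction or at the end of the way.
            if usage[node] > 1 or i == len(way) - 1:
                edges.append(segment)
                segment = [node]
    return edges


# Two hypothetical ways that meet at node 3, which lies mid-way on the first.
ways = [[1, 2, 3, 4], [5, 3, 6]]
print(split_ways_at_junctions(ways))
```

In the sample, the first way passes straight through node 3. Only after splitting does node 3 become the end point of four separate edges, which is what a routing engine needs.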

In 2014 I presented and published an approach that deals with attributive heterogeneity in OSM data. Later I joined forces with Stefan Keller from the University of Applied Sciences in Rapperswil, Switzerland, and we presented our work at the AAG annual meeting 2015 in Chicago.
Since then Stefan and his team have lifted our initial ideas of harmonized attribute data to an entirely different level. They formalized data cleaning routines, introduced subordinate attribute categories and developed an OSM export service, which generates real network graphs from OSM data. The result is just brilliant!


Two maps with very different scale made from the same data set.

The service can be accessed via osmaxx.hsr.ch. A login with an OSM account is required. Users can then choose whether to go with an existing excerpt or define an individual area of interest. In the latter case the area can be clipped on a map, and the export format (from Shapefiles to GeoPackage to SQLite DB) and spatial reference system can be chosen. The excerpt is then processed and published on a download server. At this stage I came across the only shortcoming of the service: you are not told that processing the excerpt can take up to several hours.
However, the rest of the service is just perfect. After “Hollywood has called” the processed data set can be downloaded from a web server.

OSMaxx interface.


The downloaded *.zip file contains three folders: data, static and symbology. The first contains the data in the chosen format. In the static folder all licence files and metadata can be found. The metadata are especially valuable, because they contain the entire OSMaxx schema documentation. This excellent piece of work, which is the “brain” of the service, is also available on GitHub. Those who are interested in data models and attribute structure should definitely have a look at it!
The symbology folder contains three QGIS map documents and a folder packed full with SVG map symbols. The QGIS map documents are optimized for three different scale levels. They can be used for the visualization of the data. I’ve tried them for a rather small dataset (500 MB ESRI File Geodatabase), but QGIS (2.16.3) always crashed. However, I think there is hardly any application context where the entire content of an OSM dataset needs to be visualized at once.

Of course, OSMaxx is not the first OSM export service. But besides the ease of use and the rich functionality (export format, coordinate system and level of detail), the attribute data cleaning and clustering are real assets. With this it is easy, for example, to map all shops in a town or all roads where motorized vehicles are banned. Using the native OSM data can make such a job quite cumbersome.
I have also tried to use the data as input for network analysis. Although the original OSM road data are transformed into a network dataset (ways are split into segments at junctions), the topology (connectivity) is invalid at several locations in the network. Before the data are used for routing etc., I would recommend a thorough data validation. For the detection of topological errors in a network see this post. Maybe a topology validation and correction routine can be implemented in a future version of OSMaxx.
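
One simple check of this kind is to look for parts of the network that are not connected to the rest. The Python sketch below is an assumption of how such a test could look, not a routine from OSMaxx. It takes edges as pairs of node ids and returns the connected components, largest first; anything beyond the first component is a candidate for a topological error. The name `connected_components` and the sample edges are invented for this example.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return the connected components of an undirected graph, largest first."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical network: edge (4, 5) is not connected to the main part.
edges = [(1, 2), (2, 3), (4, 5)]
main_part, *islands = connected_components(edges)
print("main network:", main_part)
print("isolated parts:", islands)
```

This only finds disconnected islands. Other errors, such as segments that cross without sharing a node, need geometric tests on top.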

In the current version the OSMaxx service is especially valuable for the design of maps that go beyond standard OSM renderings. But the pre-processed data are also suitable for all kinds of spatial analyses, as long as (network) topology doesn’t play a central role. Again, mapping and spatial analysis on the basis of OSM data were possible long before OSMaxx, but with this service it isn’t necessary to be an OSM expert, and thus I see big potential (from mapping to teaching) for this “intelligent” export service.

Mysterious bicycle routing …

This is only a quick note on a recent observation I made while using bicycle routing portals on the web. However, it shows very nicely how relevant data quality and the implemented model routines are. And because I’ve been struggling with these issues for quite a while now and things don’t necessarily change for the better, I’m curious about your ideas on the following examples.

Imagine an absolutely normal situation in your daily mobility routines. You are at location A and you need to go to location B. Because you are a good guy, you choose the bicycle as your preferred mode of transport. What do you do? Of course you consult a routing service on the web, either via your desktop browser or mobile app.
But which service do you trust, which recommendations are reliable and relevant to you? Give it a try.

  1. For many people the big elephant Google Maps is their first choice. Whether you like it or not, Google has made a big leap forward with their bicycle routing service.


  2. Because you love OpenStreetMap and the GIScience group at Heidelberg University did a great job, you try the bicycle version of OpenRouteService. What you get is what you already know from Google.


  3. If you consult another routing portal that is based on OSM data, you might get surprised. Naviki suggests the following route:


  4. So far we’ve tried a commercial service and two platforms which are fueled by crowd-sourced, open data. Let’s turn to authoritative data now. The goal of the federal routing service VAO is primarily the provision of a multi-modal routing service, with a focus on public transport. The bicycle version gives you this recommendation:


  5. The bicycle routing portal for the city and federal state of Salzburg, Radlkarte.info, is designed for the specific needs of utilitarian bicyclists. The data base is identical to the VAO service, but the result differs significantly.


The intention of this blog post is not to assess the quality (validity, reliability, relevance) of the routing recommendations as such. What I want to point out is that three different services with different data sources in the background produce exactly the same routing recommendation, whereas services that are built upon the same data produce significantly different suggestions. That’s really mysterious. And it tells me that the data and their quality are only one side of the coin. Obviously the parametrization of the routing engine and the implemented model routines have a huge impact on the result. By the way, I used the default settings for all five examples.
Following this argument, one can conclude that it is not so much about the data (they seem to be of adequate quality in all three cases), but about how well you know the users’ specific needs and preferences and turn this knowledge into appropriate models and services. Thus the next logical step is to offer users the possibility to influence the parametrization of the routing engine in order to get what they expect: routing recommendations that perfectly fit their preferences.
Do you know routing services on the web that allow for maximum personalization (not only pre-defined categories)? To what degree would users benefit from personalized routing? And finally, would bicyclists use it at all? Let me know what you think and share your ideas!
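
How much the parametrization alone can change a recommendation is easy to show with a toy example in Python. None of this is taken from the services above: the tiny graph, the edge attributes (length in metres and a traffic score between 0 and 1), the cost formula and the parameter `traffic_weight` are all assumptions for illustration. The routing itself is a plain Dijkstra search.

```python
import heapq


def best_route(graph, source, target, traffic_weight):
    """Dijkstra search with cost = length * (1 + traffic_weight * traffic)."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, length, traffic in graph.get(node, []):
            if neighbour not in settled:
                edge_cost = length * (1.0 + traffic_weight * traffic)
                heapq.heappush(queue, (cost + edge_cost, neighbour, path + [neighbour]))
    return None  # target not reachable


# Hypothetical network: a short main road A-B with heavy traffic
# and a longer detour via the quiet side street C.
graph = {
    "A": [("B", 1000, 1.0), ("C", 700, 0.0)],
    "C": [("B", 700, 0.0)],
}

print(best_route(graph, "A", "B", traffic_weight=0.0))  # shortest distance only
print(best_route(graph, "A", "B", traffic_weight=1.0))  # traffic is penalized
```

The data are identical in both calls. Only one parameter changes, and the recommendation flips from the main road to the detour. Exposing such weights to the user would be one way to approach personalized routing.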

#Polis15: mobility, sustainability and data

Returning from Brussels, I’ve been sitting on the train for 8 hours now, and because the ICE has been delayed since Stuttgart and I’m going to miss my connecting train in Munich, I’ll have another 3 hours* until I arrive in Salzburg. I’ve spent most of my “train time” wrestling with a research paper which I need to rework for resubmission. Now it’s time to do something else, for example reflecting on my 2 days at this year’s POLIS conference.

First things first: the conference was an awesome event at a very, very cool location. The conference organization was perfect. The same holds true for the opportunities to exchange, both face to face and in the Twitter sphere. The mix of participants from city authorities, researchers and practitioners resulted in a stimulating atmosphere with lots of inspiration, information and examples to learn from.

The overall topic of the conference was “Innovation in Transport for Sustainable Cities and Regions”. However, I’d say the conference (or to be fair, the sessions I attended) was very much about how better data could help to better understand the complex phenomenon of urban mobility and how these insights lead to better services (not only apps!) for citizens. Right from the first session on, the data topic was omnipresent: Dovile Adminaite from the ETSC pointed to the fact that risk calculations for vulnerable road users (VRU) are still hard to do because of the absence of sound exposure data. Well, this is a topic we have been working on for quite a while. And as I’ve learned today, a recently started H2020 project, FLOW, deals exactly with this issue.

In a very insightful workshop session, chaired by Yannis George from the Technical University of Athens, the data issue was at the center again. Alexandre Santacreu from TfL nicely showed how crucial the choice of exposure variables is for the interpretation of bicycle accidents. He came to the conclusion that only the distance travelled allows for sound risk calculations; inhabitants are crap, number of trips is tricky. Apart from the exposure variable, Alexandre elaborated on how the level of spatial aggregation determines the emerging risk patterns. My personal highlight in his presentation was the hexgrid map with disaggregated risk calculations for London – it reminded me of my own maps, which I recently presented in Hannover at the ICSC. The following presentation by Eric de Kievit (City of Amsterdam) also had a lot in common with what we have been doing for more than five years now. He presented a Safety Performance Indicator (SPI) which is used for the assessment of road networks. As Eric said, such modelling approaches are especially valuable when the data situation (accidents, exposure variables) is suboptimal. In turn – and we spent some time discussing this issue – it is hard to validate models and calculations in the absence of sound data. Finally, Véronique Feypell from the International Transport Forum presented the IRTAD database. Under the umbrella of the OECD data portal, safety-relevant data are collected in a standardized way and subsequently harmonized. I’m looking forward to the updated and improved data resource!
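
Why the choice of the exposure variable matters so much can be shown with a small calculation in Python. The numbers below are invented for illustration and are not taken from Alexandre’s presentation; the function name `risk_rate` and the two districts are assumptions as well. Risk is simply the number of accidents divided by the exposure, scaled to a convenient unit.

```python
def risk_rate(accidents, exposure, per=1_000_000):
    """Accidents per `per` units of exposure (e.g. per million km cycled)."""
    return accidents * per / exposure


# Hypothetical districts with equal population but different amounts of cycling.
districts = {
    "X": {"accidents": 30, "inhabitants": 10_000, "km_cycled": 3_000_000},
    "Y": {"accidents": 20, "inhabitants": 10_000, "km_cycled": 1_000_000},
}

for name, d in districts.items():
    per_inhabitants = risk_rate(d["accidents"], d["inhabitants"], per=1_000)
    per_distance = risk_rate(d["accidents"], d["km_cycled"])
    print(name, "per 1,000 inhabitants:", per_inhabitants,
          "| per million km:", per_distance)
```

Measured per inhabitant, district X looks more dangerous. Measured per kilometre cycled, it is district Y, because people there cycle far less. The accident counts are the same in both views; only the denominator changes the picture.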

What would a conference be these days without a discussion of smart cities? This one had it in the opening plenary session. Commissioner Jyrki Katainen mentioned the special role of cities as driving forces for growth and innovation. This is exactly where Commissioner Katainen linked smart cities to smart citizens who are engaged in life-long learning (to be honest, I’ve never connected UNIGIS with smart cities, but maybe we should think about it …). After the welcome addresses a panel dealt with several aspects and connotations of smart cities. A recurring statement was that the wheel should not be reinvented multiple times and that we don’t need more technology and more research; instead, island solutions must be merged in order to generate value. Well, I clearly see the argument, but I think we need much more research! Maybe not necessarily on technology, but definitely on the social and ethical implications of the digitalization of the human sphere!

The last session of the first conference day was dedicated to data as an asset. It was opened by a brilliant contribution from Madrid. Sergio Fernandez shared the experiences of EMT (Madrid’s PT operator) with a radical open data approach. They publish all generated data as open data and currently witness how these data fuel a bunch of newly developed, cool applications. The value generated by publishing data as Open (Government) Data was the take-home message of the presentation I gave in this session. In case you are interested, here are my slides:


The second conference day started with a firework of best practice examples at the interface of ICT and active mobility. I got especially excited by the Beat my Street project from London, which is tightly connected to the Switch Project. The idea behind the project is rather simple, but the impact is huge. What I take home as key for a successful implementation is the move from a pure public health project (although this is exactly what it is) to a participatory, integrated community project, with fun and not health as the main promotion argument.

This project from London maybe illustrates best what became evident throughout the conference: cities and regions do have the capacity to make cities livable places and they are the driving forces for societal and technological transformations towards sustainability. But they need to have visions and the organizational and financial environment that stimulates the big leaps forward.

On a personal level, I’ve learned that several ideas we’ve been working on would perfectly correspond to past or currently running projects. Thus I can only say that I’d be more than happy if we could participate and contribute in the future. Please, don’t hesitate to use the contact form, get connected on Twitter or simply have a look at our department’s website.

[Update: I’ve added my Twitter timeline as a Storify dashboard]

* While writing this blog post my last train for today got delayed for another hour – too bad!

A very brief AGIT & GI-Forum 2015 review

Last week the twin conferences AGIT and GI-Forum took place in Salzburg, Austria. Once again it was a very intensive but stimulating event with great conversations, new contacts, nice social events and of course the everlasting struggle to choose the right session from an extensive offer of attractive parallel tracks. Whereas the general tenor of the keynotes was the increasingly tight relation between GIS and IT, my personal conference focus lay on spatial modelling and analysis in the context of transportation.

Searching the web you’ll find lots of personal reviews (this one by Anita is a great example!) and social media snippets (#AGIT2015 #GIForum2015). Nevertheless here is a list of links you might find useful:

My conference week was dominated by the impressions from two keynotes I could attend (unfortunately I missed the other ones due to overlaps in the program) and my involvement in a double-session on transportation modelling (have a look at my recent post), the OpenStreetMap special forum and the track on Austria’s harmonized road graph, GIP.

In Tuesday’s keynote Ingo Simonis from OGC talked about the role of standards in the context of smart cities. His motivation to argue for establishing geospatial intelligence (… and with it, standards) in enormously fast-growing urban agglomerations is the correlation between size and opportunities/challenges: “The bigger a city, the more of everything is there.” A geospatial framework of connected devices is thereby regarded as part of sustainable solutions that turn these vibrant, urban hot spots into smart cities. As in nearly every presentation on smart cities, Songdo in South Korea served as role model and poster child of Ingo’s argumentation (a reference I personally find not that convincing – but this would be an entirely different discussion on liveable vs. smart cities).
What I found really intriguing were Ingo’s elaborations on the “social” aspect of standards. Until recently standards were more a bone-dry threat than anything else to me. But Ingo made a very important point: he illustrated how standards are, as he put it, the distilled wisdom of people with expertise in their respective field. In other words, standards don’t necessarily define in advance how things have to be done, but are recommendations or a framework for activities that are already established … Standards are about a common understanding and language of domain knowledge and practice.

The second keynote on Wednesday morning came right from the opposite end of the spectrum: the handling of large amounts of data, or rather, data streams. Manfred Hauswirth gave an inspiring overview of what is currently going on in the field of linked data and what the role of GIS is in the never-ending stream of data, semantic relations and interdependencies. He spoke of the internet of everything, where the most relevant thing (above all in terms of business models) is to extract useful information from data; something Manfred called a rather untapped resource. Four take-home messages made it into my notepad:

  1. Linking is the new (Is it really new? Actually this is how our human brains have worked for millennia) paradigm in the handling of data sets/streams.
  2. Data are increasingly dynamic. This is why the whole processing needs to be designed to be adaptable.
  3. As geonames are central to the semantic web, geospatial data and know-how are of great importance.
  4. Privacy is gone. The latter point was of course not revolutionary or new. But it was the first time I heard this statement explicitly and without any dilution in a keynote at a GI conference (probably because the keynote speaker has a background in computer science) – normally we hear flowery mantras such as: “GIS helps to make the earth a better place.” blablabla. Maybe the organizers of next year’s GI-Forum could invite a philosopher as keynote speaker, talking about the responsibility we have in science and IT!

As in the years before, a highlight of the German-speaking conference, the AGIT, was the OpenStreetMap special forum, organized by TraffiCon. This year I had the chance to contribute actively; it was a great honor to be invited to give a presentation on the suitability of OSM and OGD data for network modelling and analysis. Here are the slides of my presentation (sorry, in German), which I think are self-explanatory and don’t need any further comments:


Having spoken at the OSM special forum the day before, it was a somewhat exotic experience to give another presentation in a session dedicated to authoritative road data on Thursday morning. For 6 years now (preceded by several more years of preparatory projects), all administrative bodies in Austria have edited and managed their road-related data in the so-called Graphenintegrationsplattform, GIP (engl. harmonized road graph). This standard allows for nationwide applications and prevents cost-intensive data redundancies within administrations.
We’ve been working with GIP data in the context of bicycle routing for quite a while. Currently the web application http://www.radlkarte.info is based on authoritative road data. Over the last two years the quality of the GIP data has increased significantly. But still, there are some critical issues that become evident when the data are used in an operative environment. This is why we have developed several quality control routines that primarily consider topology and attributes. The latter are important for the (spatial) modelling approaches with which the data are interpreted and fitted into the specific application context. With this parallel approach – quality testing plus modelling – the reliability and robustness of the data could be significantly increased, as I demonstrated in my presentation:


Any comments and questions? I’m looking forward to reading and learning from you!

Some thoughts on GIS and transportation modelling

Transportation modelling is a well-established domain with dedicated experts and sophisticated software packages. Still, we thought it could be worth taking a closer look at it from an explicitly spatial perspective. This is why Gudrun and I organized a special session entitled “Spatial perspective on transportation modelling” at this year’s GI-Forum conference (http://gi-forum.org).
We had a session with five short presentations and an extended joint discussion and a workshop session. This very brief summary simply serves as a reminder of some of the major issues that were raised.

The paper session on Wednesday was a real personal highlight. Not only were the presentations inspiring, but the audience was big and active. We had presentations from various fields, covering quite a broad range of topics (all papers are online as open access):

1) Gudrun provided insights into a first version of an agent-based bicycle flow model, where she demonstrated how aggregated flows emerge from the individual behaviour of numerous agents in space and time. One of the major conclusions was that while the model as such seems to generate feasible results, the validation is rather tricky since the necessary data are hardly available.

2) Christoph gave an excellent presentation on how to link the abstract model space with the geographical space and the model steps with a temporal continuum. Additionally he presented his approach to speeding up the model performance when it contains routing functionalities. With an intelligent network simplification he was able to run the simulation 12 times faster than with the initial network graph.

3) Somehow connected to the preceding two presentations, Johannes gave an introduction to cognitive agents as counterparts of selfish agents, which are assumed in most routing and navigation applications. With regard to current transportation models, Johannes estimated that those models might be more accurate and thus more meaningful when “smart” agents are incorporated.

4) Leaving the field of agent-based models, Rita answered the question of what geographers could contribute to transportation modelling in a very beautiful (literally!) way. Working on the TAPAS traffic model, she emphasized the role of visualization for the validation and communication of the model results. Especially the spatial context of a map helps to make sense of what the model calculates and how it actually works.

5) In the last presentation of this session the winner of the AGEO student award, Daniel Steiner, presented parts of his master thesis, in which he worked with real-time data from public transit. What became very clear in this presentation was that it is hard to find PT companies that provide real-time data and that it is even harder to use these data in models and analyses because of quality issues.

In a second session, which was organized in a workshop format, three topics that had been raised in the presentations and the joint discussion were worked on further:


In the very active small working groups, it quickly turned out that we as geographers do have something to contribute to the domain of transportation modelling and that there is still a lot of work to do!
In the context of data for transportation models these points were – among others – briefly discussed:

  • There are lots of static data available, mostly following an established standard. Although the number of sensors is skyrocketing, their data are less likely to be accessible, at least in many parts of the world. Additionally there are numerous standards for all kinds of sensor data, which makes it cumbersome to integrate data from different sources in one and the same model. Besides measured data there are also calculated or estimated data, such as interpolations. For such data hardly any standard exists; most often these data are a kind of black box where you don’t know how they were generated.
  • The latter factor directly leads to the urgent need for sound metadata for transportation data and derived products. It is of crucial importance to know under which circumstances and for what purpose data were captured. For the interpretation of derived data (e.g. flow volumes) it is necessary to know how they were calculated etc. Without such information the reliability of modelling results suffers enormously.
  • An interesting observation was that whereas most often spatial data are used as inputs for transportation models, the models themselves are non-spatial, meaning that the relation between the model objects is abstract and not geographically defined.
  • Concerning the scale and aggregation level of data a rather pragmatic rule of thumb emerged: data availability, the availability of tools, processing power and the research question decide on what data are being used.

The group working on ABM and cognitive agents drafted a rather straightforward research agenda. The group started from three distinct characteristics of agent-based models: exploration of cause-effect relations, non-intuitive phenomena at system level, local scale. From there, the group identified three areas of research.

  • How to shift between scales and model types (top-down vs. bottom-up)?
  • How does ‘smart’ behaviour of cognitive agents impact traffic flows on a broader scale?
  • How can the performance issue be dealt with in a reasonable way?

The third group worked on the role of geovisualization and came up with a conclusion that is a well-established paradigm in the cartography community: maps and geovisualizations are not only for communicating results (one way), but serve as a capable interface for exploring and interacting with the data and the model. Besides, maps and map-related visualizations put transportation models into an explicit spatial context. Thus the model and the results can be related to the environment, which on the one hand can explain results and on the other hand generates new hypotheses for further investigations. At least two issues were regarded as yet unsolved:

  • How to determine the appropriate trade-off between complexity (information load) and simplicity in geovisualizations?
  • How to design visualization environments that are flexible and adaptable to facilitate real multi-perspective approaches?

Some of the aspects we were working on are documented on these flipcharts.


Of course there is a lot more to work on. And that’s exactly what we are going to do now. If you want to contribute or have comments on the few points raised here, just leave me a note. I’d be more than happy to learn from you and to extend the group of geographers and GIS experts who strive to contribute their spatial know-how to transportation models. Such an interdisciplinary approach is, from my point of view, especially valuable where established transportation models have fallen short so far, namely in the field of active transport.

Spatial modelling with OGD and OSM data

While Open Government Data are currently a big deal in the German-speaking countries, the OpenStreetMap project is celebrating its 10th anniversary. How these different data sources can be handled in spatial modelling approaches, and how they can even be used in combination, were the two major topics of a presentation I gave last Friday at a UNIGIS workshop in Salzburg.

Spatial modelling allows for interpreting and relating data for specific applications without necessarily manipulating them. Neglecting this option and building applications directly on databases can produce rather weird or even useless results. The reason is simple: data are generally captured for a certain purpose. Naturally, this purpose determines the data model, the attribute structure and the data maintenance. And these determining factors might diverge from the requirements of the intended application.

Data are like screws (or any other basic element) which can be used for various machines/final products. But how screws (and all the other elements) are arranged is not an inherent characteristic. A plan (model) is needed in order to get the intended product.

In the case of OGD, the published data are made available by different public agencies. For example, the responsible department is obliged by law to monitor air quality and, if necessary, to intervene. Different parameters are therefore sensed for this very purpose. When these data are published as OGD, one can, for example, use them to build a “health map”. But in such a case, directly visualizing the micrograms and ppm values of the sensed pollutants wouldn’t make much sense. The data need to be interpreted, aggregated, classified and related (in short, modelled) in order to fit the intended purpose of the map.
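To make the air-quality example a bit more tangible, here is a minimal Python sketch of such a modelling step: raw readings per station are aggregated to a mean and then classified into ordinal classes that a health map could display. The station names, readings, class breaks and labels are all invented for illustration; they are not official limit values and not taken from any real OGD portal.

```python
from bisect import bisect_right
from statistics import mean

# Hypothetical class breaks for a pollutant concentration in µg/m³.
# Illustrative values only, NOT official limit values.
CLASS_BREAKS = [20.0, 40.0, 80.0]
CLASS_LABELS = ["good", "moderate", "poor", "very poor"]


def classify_station(readings):
    """Aggregate raw readings to a mean and map it to an ordinal class."""
    if not readings:
        return None  # no data: leave the station unclassified
    average = mean(readings)
    # bisect_right counts how many breaks are <= the average,
    # which is exactly the index of the matching class label.
    return CLASS_LABELS[bisect_right(CLASS_BREAKS, average)]


# Invented sample readings standing in for published sensor values.
stations = {
    "station_a": [12.0, 18.0, 15.0],
    "station_b": [35.0, 50.0, 41.0],
    "station_c": [],
}

health_map = {name: classify_station(values) for name, values in stations.items()}
print(health_map)
```

The point is not the particular thresholds but the separation of concerns: the published data stay untouched, while the interpretation lives in the model and can be adapted to the purpose of the map.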
A similar mechanism holds true for data from the OpenStreetMap project. Originally the data were mapped for the purpose of building a free world map. Meanwhile the extent of the database has grown enormously and the data can be used for much more sophisticated applications than a “simple” world map. But again, if the data, and especially the attributes, which were originally collected for a specific purpose are used in any other context, they have to be processed and modelled.

When applications are built not on a single dataset that was originally created for a different purpose but on several datasets (e.g. because data availability ends at the border of an administrative unit), modelling is necessary anyway. As an example, I referred to our current work on the web application Radlkarte.
Here it was necessary to combine authoritative data (mainly published as OGD) with crowd-sourced data. Because of the fundamental differences between these data sources (concerning the data model, attribute structure, data quality and the responsibility for data management), evaluation and correction routines as well as an extensive modelling workflow had to be implemented. But, as was demonstrated in the presentation, this effort pays off significantly when the validity and plausibility of the results are examined.
Geographical information systems (GIS) are intuitive and high-performance environments for implementing such multi-stage workflows. They allow for data storage and management in spatial databases, provide modelling interfaces and offer immediate analysis and visualization capabilities.
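One small building block of such a workflow, translating source-specific attributes into a common model, could look roughly like the following Python sketch. It is not the actual Radlkarte implementation: the common road classes, the `road_code` field and the authoritative codes are invented for illustration. Only the OSM `highway` values are real tag values.

```python
# Real OSM `highway` tag values mapped to a hypothetical common road class.
OSM_HIGHWAY_TO_CLASS = {
    "cycleway": "cycle_path",
    "residential": "local_road",
    "primary": "main_road",
}

# Invented codes standing in for an authoritative (OGD) road classification.
OGD_CODE_TO_CLASS = {
    "C1": "cycle_path",
    "L2": "local_road",
    "M3": "main_road",
}


def harmonise(segment):
    """Translate one source-specific record into the common model."""
    attributes = segment["attributes"]
    if segment["source"] == "osm":
        road_class = OSM_HIGHWAY_TO_CLASS.get(attributes.get("highway"))
    elif segment["source"] == "ogd":
        road_class = OGD_CODE_TO_CLASS.get(attributes.get("road_code"))
    else:
        raise ValueError(f"unknown source: {segment['source']!r}")
    return {
        "id": segment["id"],
        "source": segment["source"],
        "road_class": road_class or "unknown",
        # Flag records the lookup could not interpret for manual evaluation.
        "needs_review": road_class is None,
    }


# Invented sample records from both kinds of sources.
segments = [
    {"id": 1, "source": "osm", "attributes": {"highway": "cycleway"}},
    {"id": 2, "source": "ogd", "attributes": {"road_code": "L2"}},
    {"id": 3, "source": "osm", "attributes": {"highway": "track"}},
]

network = [harmonise(segment) for segment in segments]
for record in network:
    print(record)
```

Records that cannot be mapped are kept and flagged rather than dropped, which mirrors the idea of evaluation and correction routines running alongside the modelling itself.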