While Open Government Data (OGD) are currently a big deal in the German-speaking countries, the OpenStreetMap project is celebrating its 10th anniversary. How these different data sources can be handled in spatial modelling approaches, and how they can even be used in combination, were the two major topics of a presentation I gave last Friday at a UNIGIS workshop in Salzburg.
Spatial modelling makes it possible to interpret and relate data for specific applications without necessarily manipulating them. Neglecting this option and building applications directly on databases can produce odd or even useless results. The reason is simple: data are generally captured for a certain purpose. Naturally, this purpose determines the data model, the attribute structure and the data maintenance. And these determining factors might diverge from the requirements of the intended application.
In the case of OGD, the published data are made available by different public agencies. For example, the responsible department is obliged by law to monitor air quality and, where necessary, to intervene efficiently. Thus different parameters are sensed for this very purpose. When these data are published as OGD, one can, for example, use them to build a “health map”. But in such a case a direct visualization of the micrograms and ppm values of the sensed pollutants wouldn’t make much sense. The data need to be interpreted, aggregated, classified and related (in short, modelled) in order to fit the intended purpose of the map.
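To make the "aggregate, then classify" idea a bit more tangible, here is a minimal Python sketch. It is only an illustration: the station names, the readings, the class boundaries and the labels are all made up for this example and are not official limit values or part of any real OGD dataset.

```python
from bisect import bisect_right
from statistics import mean

# Hypothetical class boundaries in µg/m³ (illustrative, NOT official limits).
BREAKS = [20.0, 40.0, 80.0]
LABELS = ["good", "moderate", "poor", "very poor"]


def classify(concentration):
    """Map a concentration to a class label using the boundaries above."""
    return LABELS[bisect_right(BREAKS, concentration)]


def model_station(readings):
    """Aggregate the raw readings to a mean value, then classify it."""
    return classify(mean(readings))


# Made-up sample readings per station, standing in for published sensor data.
stations = {
    "station_a": [12.0, 18.5, 15.2],
    "station_b": [45.0, 52.3, 61.1],
}

# The "modelled" result: one interpretable class per station.
health_map = {name: model_station(values) for name, values in stations.items()}
print(health_map)
```

The point is not the particular thresholds but the separation of concerns: the raw measurements stay untouched, and the interpretation for the map lives in a small, replaceable model.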
A similar mechanism holds true for data from the OpenStreetMap project. Originally the data were mapped for the purpose of building a free world map. Meanwhile the database has grown enormously and the data can be used for much more sophisticated applications than a “simple” world map. But again, if the data, and especially the attributes, which were originally collected for a specific purpose are used in another context, they have to be processed and modelled.
When applications are built not on a single dataset that was originally created for a different purpose, but on several such datasets (e.g. because the data availability ends at the border of an administrative unit), modelling becomes necessary in any case. As an example I referred to our current work in the context of the web application Radlkarte.
Here it was necessary to combine authoritative data (mainly published as OGD) with crowd-sourced data. Because of the fundamental differences between these data sources, concerning the data model, attribute structure, data quality and the responsibility for data management, evaluation and correction routines as well as an extensive modelling workflow had to be implemented. But, as was demonstrated in the presentation, this effort pays off significantly when the validity and plausibility of the results are examined.
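One small piece of such a workflow, mapping differently structured attributes onto a common model and flagging what cannot be mapped, could be sketched as follows. This is not the Radlkarte implementation. The `highway` key and its values follow common OpenStreetMap tagging, but the authoritative `road_class` codes, the target categories and the sample segments are assumptions invented for this example.

```python
# Hypothetical numeric road classes of an authoritative dataset.
AUTHORITY_CODES = {1: "cycle_path", 2: "quiet_road", 3: "main_road"}


def from_osm(tags):
    """Map OSM-style tags to a common category."""
    highway = tags.get("highway")
    if highway == "cycleway":
        return "cycle_path"
    if highway in {"residential", "living_street"}:
        return "quiet_road"
    if highway in {"primary", "secondary", "tertiary"}:
        return "main_road"
    return "unknown"


def from_authority(record):
    """Map the (hypothetical) authoritative road class to a common category."""
    return AUTHORITY_CODES.get(record.get("road_class"), "unknown")


def harmonize(segments):
    """Bring segments from both sources into one model and flag gaps."""
    result = []
    for seg in segments:
        mapper = from_osm if seg["source"] == "osm" else from_authority
        category = mapper(seg["attributes"])
        result.append({
            "id": seg["id"],
            "source": seg["source"],
            "category": category,
            # Simple evaluation step: unmapped segments need manual review.
            "needs_review": category == "unknown",
        })
    return result


# Made-up sample segments from both kinds of sources.
segments = [
    {"id": 1, "source": "osm", "attributes": {"highway": "cycleway"}},
    {"id": 2, "source": "authority", "attributes": {"road_class": 3}},
    {"id": 3, "source": "osm", "attributes": {"highway": "track"}},
]
harmonized = harmonize(segments)
for row in harmonized:
    print(row)
```

Keeping one mapping function per source means that a change in either data model only touches that source's mapping, while everything built on the common categories stays the same.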
Geographical information systems (GIS) are intuitive and powerful environments for implementing such multi-stage workflows. They allow for data storage and management in spatial databases, provide modelling interfaces and offer immediate analysis and visualization capabilities.