The number of data sets published as Open Data (OD) and Open Government Data (OGD) is constantly growing. That’s incredibly cool, because you can do analyses that were impossible a few years ago. Today I’d like to show you how to use building footprints from OpenStreetMap (OSM) and census data from an OGD portal to generate a population grid at any spatial resolution.
Here is why it’s worth going through a few analysis steps instead of using what’s available anyway. At least in Austria (I know, the situation is quite different in the US), nationwide census data are only freely available at the level of municipalities. But the population is usually not evenly distributed within these rather arbitrarily defined administrative units, especially in large, rural municipalities. Instead, it is more or less spatially clustered.
For many analyses, population data at the level of municipalities are way too coarse. Take, for example, the calculation of service areas for central facilities in order to estimate their potential coverage (“How many people live within 5 driving minutes?” etc.). Until recently, you were forced to buy expensive statistical data from the federal bureau of statistics, Statistik Austria, in order to answer such questions. What you get there are aggregated census data in 250, 500 or 1000 meter raster grids.
Fortunately, enough data are published today as OD and OGD to bypass this limitation. Of course, the population raster resulting from the approach presented below is only an approximation (similar to dasymetric maps). But for a first estimate it’s enough, and it’s free!
Here is how you can generate disaggregated population grids based on OSM data and demographic OGD:
- Download administrative boundaries, including available census data. For Austria you’ll find everything via the national OGD portal.
- Download building footprints from OpenStreetMap. I prefer QGIS and the QuickOSM plugin for this task, because OSM data are immediately converted to a geospatial dataset (e.g. Shapefile).
- Reproject all datasets into a projected coordinate system; areas are then calculated directly in square meters, which is more convenient.
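- Among all features tagged building = *, select the footprints that are not used for residential purposes and remove them from your analysis layer.
- Calculate each building’s share (r) of the total building footprint area: the footprint area of the building divided by the sum of all remaining footprint areas within the same administrative unit.
- For each administrative unit, select all buildings within it and multiply the unit’s population by each building’s share. This gives an estimated population per building.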
- Generate a regular grid, which covers the entire area (MMQGIS plugin for QGIS, hexgrid script for ArcGIS).
- Assign the estimated population of the building footprints to the grid cells and sum it up per cell.
- Done. What you have is a rough estimation of the population distribution.
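The arithmetic behind the last steps can be sketched in a few lines of plain Python. This is only a minimal illustration, not the QGIS or ArcGIS workflow itself: the sample buildings, the unit names and the function names `disaggregate` and `to_grid` are made up for this example, the grid uses square cells instead of hexagons, and each building is assigned to a cell by its centroid, which is a simplification.

```python
from collections import defaultdict
from math import floor

# Hypothetical sample data: residential footprints with their area (m²),
# the administrative unit they lie in, and a centroid in projected meters.
buildings = [
    {"id": 1, "unit": "A", "area": 100.0, "x": 120.0, "y": 80.0},
    {"id": 2, "unit": "A", "area": 300.0, "x": 620.0, "y": 90.0},
    {"id": 3, "unit": "B", "area": 200.0, "x": 130.0, "y": 700.0},
]
# Hypothetical census population per administrative unit.
population = {"A": 400, "B": 50}


def disaggregate(buildings, population):
    """Estimate the population per building from its footprint share r."""
    total_area = defaultdict(float)
    for b in buildings:
        total_area[b["unit"]] += b["area"]

    estimates = {}
    for b in buildings:
        # r = footprint area / total footprint area in the same unit
        r = b["area"] / total_area[b["unit"]]
        estimates[b["id"]] = r * population[b["unit"]]
    return estimates


def to_grid(buildings, estimates, cell_size):
    """Sum building estimates per square cell, keyed by (column, row)."""
    grid = defaultdict(float)
    for b in buildings:
        # Assumption: a building belongs to the cell containing its centroid.
        cell = (floor(b["x"] / cell_size), floor(b["y"] / cell_size))
        grid[cell] += estimates[b["id"]]
    return dict(grid)


estimates = disaggregate(buildings, population)
grid = to_grid(buildings, estimates, cell_size=500.0)
```

Because every building receives a share of its own unit’s population, the cell values add up to the total census population again. That makes a handy sanity check after the real GIS run, too.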
Although the results are fairly reliable, at least two issues negatively affect them. First, building footprints don’t account for multi-storey buildings. Theoretically the number of storeys can be tagged in OpenStreetMap, but this is hardly ever done. Second, data inaccuracies bias the result. In OSM many buildings are not adequately tagged (e.g. commercial buildings should be tagged as such) and, even worse, some buildings are not mapped yet. Nevertheless, for many questions the approximation is sufficient.
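Where storeys are tagged, one possible refinement is to weight each footprint by its number of levels before calculating the shares. The sketch below assumes the tags of a building are available as a dictionary and reads the OSM key `building:levels`; the function name `floor_area` and the fallback of one level for untagged or unreadable values are my own assumptions, not part of the workflow above.

```python
def floor_area(building, default_levels=1):
    """Approximate floor area: footprint area times tagged number of levels."""
    raw = building.get("tags", {}).get("building:levels")
    try:
        levels = float(raw)
    except (TypeError, ValueError):
        # Tag is missing or not a number: fall back to the default.
        levels = default_levels
    if not levels >= 1:
        # Also catches NaN and implausible values below one level.
        levels = default_levels
    return building["area"] * levels


tagged = {"area": 100.0, "tags": {"building:levels": "3"}}
untagged = {"area": 100.0, "tags": {}}
print(floor_area(tagged), floor_area(untagged))
```

Using this value instead of the plain footprint area shifts population towards taller buildings, but only as far as the tagging allows.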
This simple piece of analysis demonstrates the power of GIS on the one hand and the large benefit of Open (Government) Data on the other. Try it yourself. I’m looking forward to reading about your experience!