Using OSM building footprints to disaggregate OGD census data

The number of data sets published as Open Data (OD) and Open Government Data (OGD) is constantly growing. That’s incredibly cool, because you can do analyses that were impossible a few years ago. Today I’d like to show you how you can use building footprints from OpenStreetMap and census data from an OGD portal to generate a population grid with any spatial resolution.

Here is why it’s worth going through a few analysis steps instead of using what’s available anyway. At least in Austria (I know, the situation is quite different in the US), nationwide census data are freely available only at the level of municipalities. Now, everyone knows that the population is rarely evenly distributed within rather arbitrarily defined administrative units, especially in large, rural municipalities. Instead, the population is more or less spatially clustered.
For many analyses, population data at the municipality level are way too coarse. Take, for example, the calculation of service areas for central facilities in order to estimate the potential coverage (“How many people live within a 5-minute drive?” etc.). Until recently you were forced to buy expensive statistical data from the federal bureau of statistics, Statistik Austria, in order to answer such questions. What you get there are aggregated census data in 250, 500 or 1000 meter raster grids.
Fortunately, enough data are published today as OD and OGD to bypass this limitation. Of course, the population raster resulting from the approach presented below is only an approximation (similar to dasymetric maps). But for a first estimate it’s enough, and it’s free!

Basic idea behind the disaggregation approach. Population data are distributed proportionally to the location and size of the building footprints.

Here is how you can generate disaggregated population grids based on OSM data and demographic OGD:

Disaggregated census data for the city of Braunau (Upper Austria).

  1. Download administrative boundaries, including available census data. For Austria you’ll find everything via the national OGD portal.
  2. Download building footprints from OpenStreetMap. I prefer QGIS and the QuickOSM plugin for this task, because OSM data are immediately converted to a geospatial dataset (e.g. Shapefile).
  3. Transfer all datasets into a projected coordinate system; the calculation of areas is more convenient this way.
  4. Select all building footprints that are not used for residential purposes (based on the building = * tag) and remove them from your analysis layer.
  5. Calculate the share (r_{i}) of the total building footprint area for each building: r_{i}=\frac{a_{i}}{\sum_{j=1}^{n}a_{j}}, where a_{i} is the footprint area of building i and n is the number of buildings in the same administrative unit.
  6. For each administrative unit, select all buildings within it and multiply the unit’s population by the share of each building.
  7. Generate a regular grid that covers the entire area (MMQGIS plugin for QGIS, hexgrid script for ArcGIS).
  8. Assign the estimated population data of the building footprints to each grid cell.
  9. Done. What you have is a rough estimation of the population distribution.
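
To make steps 4, 5, 6 and 8 a bit more concrete, here is a minimal sketch in plain Python. It is not the QGIS workflow described above, just the same arithmetic on toy data. Everything in it is an assumption for illustration: the dictionary layout of a building (unit, building tag, footprint area, centroid coordinates in a projected system), the function names, the small set of non-residential tag values, the square grid, and the choice to assign each building to the cell that contains its centroid.

```python
from collections import defaultdict
from math import floor

# Illustrative subset of OSM "building" values treated as non-residential (assumption).
NON_RESIDENTIAL = {"commercial", "industrial", "retail", "warehouse", "church", "garage"}


def keep_residential(buildings):
    """Step 4: drop footprints whose building tag marks a non-residential use."""
    return [b for b in buildings if b.get("building") not in NON_RESIDENTIAL]


def disaggregate(buildings, unit_population):
    """Steps 5 and 6: split each unit's population by footprint-area share."""
    total_area = defaultdict(float)
    for b in buildings:
        total_area[b["unit"]] += b["area"]
    result = []
    for b in buildings:
        share = b["area"] / total_area[b["unit"]]  # r_i = a_i / sum of a_j in the unit
        result.append({**b, "population": share * unit_population[b["unit"]]})
    return result


def aggregate_to_grid(buildings, cell_size):
    """Step 8: sum building populations per square cell, keyed by (column, row)."""
    grid = defaultdict(float)
    for b in buildings:
        cell = (floor(b["x"] / cell_size), floor(b["y"] / cell_size))
        grid[cell] += b["population"]
    return dict(grid)


# Toy data (hypothetical): two municipalities, areas in square meters,
# centroid coordinates in meters.
buildings = [
    {"unit": "A", "building": "house", "area": 100.0, "x": 10.0, "y": 10.0},
    {"unit": "A", "building": "apartments", "area": 300.0, "x": 260.0, "y": 20.0},
    {"unit": "A", "building": "industrial", "area": 900.0, "x": 40.0, "y": 40.0},
    {"unit": "B", "building": "yes", "area": 50.0, "x": 270.0, "y": 30.0},
]
unit_population = {"A": 400, "B": 20}

residential = keep_residential(buildings)
estimated = disaggregate(residential, unit_population)
grid = aggregate_to_grid(estimated, cell_size=250.0)
print(grid)
```

Because the shares within each unit sum to 1, the grid total equals the sum of the municipal populations; only the spatial distribution changes. A building that straddles a cell border is counted entirely in the cell of its centroid here, which is a simplification.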

Although the results are reasonably reliable, at least two issues affect them negatively. First, building footprints don’t account for multi-storey buildings. In theory, the number of storeys can be tagged in OpenStreetMap, but this is hardly ever done. Second, data inaccuracies bias the result. In OSM, many buildings are not adequately tagged (e.g. commercial buildings should be tagged as such) and, even worse, some buildings are not mapped yet. Nevertheless, for many questions the approximation is sufficient.
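
Where storeys are tagged, one conceivable refinement is to weight the footprint area by the number of levels before computing the shares. This is not part of the workflow above, only a hedged sketch: the function name and the fallback of one level for untagged or unparsable values are assumptions, and the OSM key used is building:levels.

```python
import math


def effective_area(footprint_area, tags, default_levels=1.0):
    """Footprint area multiplied by the tagged number of levels.

    Falls back to default_levels (an assumption) when the building:levels
    tag is missing, not numeric, or not a positive finite number.
    """
    raw = tags.get("building:levels")
    try:
        levels = float(raw)
    except (TypeError, ValueError):
        levels = default_levels
    if not (math.isfinite(levels) and levels > 0):
        levels = default_levels
    return footprint_area * levels


print(effective_area(100.0, {"building:levels": "3"}))
print(effective_area(100.0, {}))
```

Since most buildings lack the tag, this mostly leaves the plain footprint area in place, so the bias described above remains for untagged multi-storey buildings.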

Building footprints (from OSM) are used for the disaggregation of freely available census data (OGD).

This simple piece of GIS analysis demonstrates the power of GIS on the one hand and the large benefit of Open (Government) Data on the other. Try it yourself – I’m looking forward to reading about your experience!

