Import OpenStreetMap data to PostgreSQL with Imposm3



Introduction

If you want to import OpenStreetMap data to PostgreSQL (+ PostGIS) database two popular tools are osm2pgsql and imposm3.

Both were designed to prepare data for rendering although some time ago osm2pgsql was upgraded with scripting capabilities that go beyond simple mapping files that specify what tables to create and what objects should be inserted there (filter by tags).

There is also osmosis if you want a - more or less - copy of OpenStreetMap database.

imposm3 development was put on hold due to lack of funding while osm2pgsql is actively developed. This makes it questionable choice to use imposm3 but it does have some nice qualities that can make it better for some projects.

In my opinion it’s a very good tool that is useful when:

Subjective evaluation:

Pros:

Cons:

Setup is very simple. You download the executable, prepare config and run the executable to import the data to the database. Then you can run the executable with a different flag/command to automatically download and apply OSM data updates from replication log or even set it up as a service/daemon that will run in the background and keep your data updated.

No dependencies, no need for external tools - like osmosis that is usually paired with osm2pgsql - just a nice single executable.

I compared performance of both tools long time ago and probably osm2pgsql got faster since then but for me import of some specific data for Poland was about 45 minutes for osm2pgsql and 15 minutes for imposm3.

Example usage

I’ll list commands make an import of some data. This requires Linux based OS (WSL works well). I’ll launch docker container with PostgreSQL database but you can use any other way to set up a database.

First let’s make a working directory:

mkdir imposm-test
cd imposm-test

Then let’s download the binary executable:

wget https://github.com/omniscale/imposm3/releases/download/v0.11.1/imposm-0.11.1-linux-x86-64.tar.gz
tar -xvf imposm-0.11.1-linux-x86-64.tar.gz

We’ll download some small OSM extract as well as geojson file that will limit the import area:

wget http://download.geofabrik.de/europe/poland/opolskie-latest.osm.pbf
wget https://github.com/openstreetmap-polska/gugik2osm/raw/main/imposm3/poland.geojson

For this example we’ll import buildings and roads. Full docs are in https://imposm.org/docs/imposm3/latest/mapping.html

Create YAML file mapping.yaml with your preferred text editor.

tags:
  load_all: true
areas:
  area_tags: [building, landuse, leisure, natural, aeroway, amenity, shop, "building:part", boundary, historic, place, "area:highway", craft, office, public_transport, tourism, allotments, club, "demolished:building", "abandoned:building", healthcare, industrial, residential]
  linear_tags: [highway, barrier, route]
tables:
  buildings:
    type: polygon
    columns:
    - name: osm_id
      type: id
    - name: geometry
      type: geometry
    - key: building
      name: building
      type: string
    - key: building:levels
      name: levels
      type: string
    - key: roof:shape
      name: roof_shape
      type: string
    - key: building:flats
      name: flats_number
      type: string
    - key: building:levels:underground
      name: levels_underground
      type: string
    - key: height
      name: height_above_ground
      type: string
    mapping:
      building: [__any__]
  roads:
    type: linestring
    columns:
    - name: osm_id
      type: id
    - key: highway
      name: highway
      type: string
    - name: tags
      type: hstore_tags
    - name: geometry
      type: geometry
    filters:
      reject:
        area: ["yes"]
    mapping:
      highway: [__any__]

Create config.json with a text editor.

Here’s an example, adjust paths and connections details accordingly.

srid means EPSG code number (i.e. id of CRS). It can be 4326 (geometry uses degrees as unit) or 3857 (units are meters).

{
    "cachedir": "/home/tt/imposm-test/imposm_cache/",
    "connection": "postgis://postgres:1234@localhost:25434/gis",
    "limitto":  "/home/tt/imposm-test/poland.geojson",
    "mapping":  "/home/tt/imposm-test/mapping.yaml",
    "srid": 4326,
    "diffdir":  "/home/tt/imposm-test/imposm_diff/"
}

If you don’t need the data to be extremely fresh you can set update interval to be 1 hour by adding these lines to config.json

    "replication_url": "https://planet.openstreetmap.org/replication/hour/",
    "replication_interval": "1h"

If you want to create temporary PostgreSQL database using docker run this command:

docker run --name "postgis25434" --shm-size=4g -e MAINTAINANCE_WORK_MEM=1024MB -p 25434:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASS=1234 -e POSTGRES_DBNAME=gis -d -t kartoza/postgis

Now let’s run data import:

./imposm-0.11.1-linux-x86-64/imposm import -config ./config.json -read ./opolskie-latest.osm.pbf -overwritecache -write -diff -deployproduction

Note: When running import again existing data will be moved to backup schema.

All options are described in more details in the docs.

You can connect to the database and query the data using software like QGIS or just psql e.g.:

psql -h localhost -p 25434 -U postgres -d gis

To update the data run the following command:

./imposm-0.11.1-linux-x86-64/imposm run -config ./config.json

This will run continuously until you stop it with CTR+C.

Making systemd service

Example of making systemd service on Ubuntu.

Let’s create a script (you can name it: “imposm_run.sh”) that will launch data updating process:

#!/bin/bash

<specify directory>/imposm-0.11.1-linux-x86-64/imposm run -config <specify directory>/config.json

Our service will start after PostgreSQL service starts. When using external database e.g. RDS that would not be desired. In such case remove After=postgresql.service from service file.

Create “imposm.service” file:

[Unit]
Description=IMPOSM3 service that keeps your copy of OpenStreetMap data updated.
After=network.target
After=postgresql.service
StartLimitIntervalSec=0

[Service]
Type=simple
Restart=always
RestartSec=5
User=<specify user>
ExecStart=/bin/bash "<specify directory>/imposm_run.sh"

[Install]
WantedBy=multi-user.target

Copy service file to correct location, set its permissions, and reload service daemon:

sudo cp <specify directory>/imposm.service /etc/systemd/system/imposm.service
sudo chmod 600 /etc/systemd/system/imposm.service
sudo systemctl daemon-reload

Now we can use these commands to start and stop our service:

sudo service imposm start
sudo service imposm stop

Tips for production use

Process that updates our database generates some data that stays on disk. If left without supervision this will grow in size indefinitely.

You can add these commands in cron to keep disk usage low.

# deleting osm replication files downloaded more than 7 days ago
find <specify directory>/imposm_diff/0* -mtime +7 -type f -delete
# remove empty directories
find <specify directory>/imposm_diff/0* -empty -type d -delete
# most of imposm logs are printed to system log by default
# you can clean it up with this command
journalctl --vacuum-time=10d

Additionally I would recommend to run VACUUM ANALYZE in Postgres periodically.

After service has been running for a long time shutting it down temporarily and “defragmenting” data by running VACUUM FULL then ANALYZE may improve disk usage and performance.