Geo-Referencing and Digitization of Scanned Maps


This sample guide explains the steps for automatically geo-referencing and digitizing scanned maps. Some of the important terms used in this guide are as follows:

  • Scanned Map: The digital, scanned copy of a printed scientific map.
  • Geo-referencing: The process of assigning real-world coordinates to the pixels of the scanned map.
  • Digitizing: The process of converting geo-referenced data to a digital vector format (shapefile).
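To make the idea of geo-referencing concrete, the sketch below maps a pixel position to world coordinates for the simplest case: a north-up image with a known top-left corner and a constant pixel size. The `GeoTransform` type and `pixel_to_world` function are hypothetical helpers written for illustration; they are not part of arcgis.learn, and the guide's method may estimate a more general transform from control points.

```python
from typing import NamedTuple, Tuple


class GeoTransform(NamedTuple):
    """North-up affine transform (hypothetical helper, not an arcgis.learn type)."""
    origin_x: float      # world x (e.g. longitude) of the image's top-left corner
    origin_y: float      # world y (e.g. latitude) of the image's top-left corner
    pixel_width: float   # world units covered by one pixel along a row
    pixel_height: float  # world units covered by one pixel along a column (positive)


def pixel_to_world(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Convert a (col, row) pixel position to (x, y) world coordinates."""
    x = gt.origin_x + col * gt.pixel_width
    # Row indices grow downward while world y grows upward, hence the minus sign.
    y = gt.origin_y - row * gt.pixel_height
    return x, y


# Assumed example: a 360 x 180 pixel world map at one degree per pixel.
world = GeoTransform(origin_x=-180.0, origin_y=90.0, pixel_width=1.0, pixel_height=1.0)

print(pixel_to_world(world, 0, 0))     # (-180.0, 90.0)
print(pixel_to_world(world, 180, 90))  # (0.0, 0.0)
```

Once such a transform is known for a scanned map, every pixel of a species' region can be placed on the search region image.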

Valuable geospatial information is contained in a wide variety of maps that are available only as images. Unfortunately, this data cannot be analyzed until it is digitized. The conventional approach is to extract the data manually and store it in a digital format. This guide details a method that uses computer vision techniques to automatically extract details from a set of scanned maps, find reliable feature points (control points), register each map to the part of the world it depicts, and ultimately overlay the geospatial data shown in these maps onto the search region image (for instance, a world map).

This guide walks step by step through six APIs that together geo-reference and digitize a scanned map image onto a search region image (for instance, a world map).

Data Used

The data used in this guide are scanned maps extracted from a series of handbooks called Mammals of the World, which documents species across the world, including the regions where each species is found. We extracted the map images from the scanned books in order to geo-reference and digitize each species' region on the search region image (for instance, a world map).

Below are some sample scanned maps, each showing the region where a species is found:

(i) Bandicota bengalensis (Bandicoot Rat)
(ii) Saccostomus campestris (Southern African Pouched Mouse)
(iii) Tylomys watsoni (Watson’s Climbing Rat)

Implementation in arcgis.learn

Let's see how scanned maps are geo-referenced and digitized with arcgis.learn.


Import the ScannedMapDigitizer class from the arcgis.learn module.

In [ ]:
from arcgis.learn import ScannedMapDigitizer

Object initialization

Below are the parameters to be passed into ScannedMapDigitizer:

  • input_folder: The path to the folder that contains scanned map images

  • output_folder: The path to the folder where intermediate results and final outputs are written, named after the corresponding image

Note: Intermediate results are the outputs of the methods described below; each one is required as input by the steps that follow.

In [ ]:
smd = ScannedMapDigitizer("path_to_scanned_maps", "path_to_output_folder")

Create Masks

This method extracts binary mask images from the scanned maps, keeping the regions whose colors match the color inputs supplied by the user.
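The idea behind a color-based binary mask can be illustrated with a minimal sketch. The `color_mask` function below is a hypothetical helper, not the ScannedMapDigitizer implementation: it assumes the image is a nested list of RGB tuples and that a pixel matches when every channel lies within a tolerance `delta` of the target color. Both the function name and the tolerance rule are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def color_mask(image: List[List[Pixel]], color: Sequence[int], delta: int) -> List[List[int]]:
    """Return 255 where every channel is within `delta` of `color`, else 0."""
    return [
        [
            255 if all(abs(c - t) <= delta for c, t in zip(pixel, color)) else 0
            for pixel in row
        ]
        for row in image
    ]


# Synthetic 2 x 2 image: two greenish pixels, one white, one black.
image = [
    [(115, 178, 115), (255, 255, 255)],
    [(120, 170, 110), (0, 0, 0)],
]

# Assumed target color and tolerance for the species' region.
mask = color_mask(image, color=(115, 178, 115), delta=10)
print(mask)  # [[255, 0], [255, 0]]
```

A real scanned map would typically be handled with an array or image library, and the resulting mask cleaned up before later steps; this sketch only shows the thresholding rule.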