ArcGIS API for Python

Finetuning Pre-trained Building Footprint Model

  • 🔬 Data Science
  • 🥠 Deep Learning and Instance Segmentation

Introduction

ArcGIS Living Atlas hosts a variety of pre-trained models. These models work well in the geographies their training data was exported from, but they may not perform as well in other geographies.

However, we can improve the performance of these models on different geographies by finetuning them on our own training data. Compared to training a similar model from scratch, finetuning saves time, is less computationally intensive, and usually produces more accurate results.

In this workflow, we will perform three broad steps.

  • Load the training data
  • Finetune a pre-trained model
  • Deploy the model and extract footprints

This workflow requires the deep learning dependencies to be installed. The documentation outlines how to install them and set up an appropriate environment.

Load training data

In [1]:
from arcgis.gis import GIS
gis = GIS('home')
portal = GIS('https://pythonapi.playground.esri.com/portal')
In [2]:
training_data = gis.content.get('5351aca735604197ac8d8ede45f6cc4b')
training_data
Out[2]:
building_footprints_kuwait_osm_sample
Image Collection by api_data_owner
Last Modified: August 10, 2021
In [5]:
# Download the zipped training data; download() returns the local file path
filepath = training_data.download(file_name=training_data.name)
In [12]:
import zipfile
from pathlib import Path

# Extract the archive next to the downloaded zip file
with zipfile.ZipFile(filepath, 'r') as zip_ref:
    zip_ref.extractall(Path(filepath).parent)
In [17]:
data_path = Path(filepath).parent / 'building_footprints'
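
The data path above relies on the archive extracting to a folder named building_footprints. As a minimal sketch, assuming only the standard library, the helper below lists the top-level entries of a zip file so that the folder name can be confirmed before it is hard-coded. The name top_level_entries is hypothetical and is not part of the ArcGIS API for Python.

```python
import zipfile
from pathlib import PurePosixPath


def top_level_entries(zip_path):
    """Return the sorted, de-duplicated top-level names inside a zip archive.

    Hypothetical helper for illustration; not part of arcgis.
    """
    with zipfile.ZipFile(zip_path, "r") as zf:
        # Zip member names always use forward slashes, so PurePosixPath
        # splits them consistently on every platform.
        return sorted(
            {PurePosixPath(name).parts[0] for name in zf.namelist() if name}
        )
```

Calling top_level_entries(filepath) on the downloaded archive should include the folder used for data_path if the assumption about the archive layout holds.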
In [6]:
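from arcgis.learn import prepare_data

# Build the data object from the extracted chips:
# 16 chips per batch, each chip cropped to 400 x 400 pixels
data = prepare_data(data_path,
                    batch_size=16,
                    chip_size=400)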
Please check your dataset. 3 images dont have the corresponding label files.
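
The message above reports that some image chips have no matching label file. A rough way to see which chips are affected is sketched below. It assumes that the exported data contains an images folder and a labels folder, and that each label file shares its file stem with the corresponding image chip, possibly inside a per-class subfolder. These layout details and the name chips_missing_labels are assumptions made for illustration, and this is not how prepare_data performs its own check.

```python
from pathlib import Path


def chips_missing_labels(images_dir, labels_dir):
    """Return sorted image file names whose stem has no matching label file.

    Hypothetical helper for illustration; assumes label files share the
    image's file stem and may sit in subfolders of labels_dir.
    """
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    # Collect the stems of all label files, searching subfolders as well.
    label_stems = {p.stem for p in labels_dir.rglob("*") if p.is_file()}
    # Keep the images whose stem never appears among the label stems.
    return sorted(
        p.name
        for p in images_dir.iterdir()
        if p.is_file() and p.stem not in label_stems
    )
```

Under those assumptions, chips_missing_labels(data_path / 'images', data_path / 'labels') would list the chips to inspect or re-export.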

Visualize training data

To get a sense of what the training data looks like, use the show_batch() method to randomly pick a few training chips and visualize them. The chips are overlaid with masks representing the building footprints in each image chip.

In [20]:
data.show_batch(rows=4)