Detecting settlements using supervised classification and deep learning

  • 🔬 Data Science
  • 🥠 Deep Learning and pixel-based classification

Introduction

Deep-Learning methods tend to perform well with high amounts of data as compared to machine learning methods which is one of the drawbacks of these models. These methods have also been used in geospatial domain to detect objects [1,2] and land use classification [3] which showed favourable results, but labelled input satellite data has always been an effortful task. In that regards, in this notebook we have attempted to use the supervised classification approach to generate the required volumes of data which after cleaning was used to come through the requirement of larger training data for Deep Learning model.

For this, we have considered detecting settlements for Saharanpur district in Uttar Pradesh, India. Settlements have their own importance to urban planners and monitoring them temporally can lay the foundation to design urban policies for any government.

Necessary imports

%matplotlib inline

import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt

import arcgis
from arcgis.gis import GIS
from arcgis.geocoding import geocode
from arcgis.learn import prepare_data, UnetClassifier
from arcgis.raster.functions import clip, apply, extract_band, colormap, mask, stretch
from arcgis.raster.analytics import train_classifier, classify, list_datastore_content

Connect to your GIS

gis = GIS(url='https://pythonapi.playground.esri.com/portal', username='arcgis_python', password='amazing_arcgis_123')

Get the data for analysis

Search for Multispectral Landsat layer in ArcGIS Online. We can search for content shared by users outside our organization by setting outside_org to True.

landsat_item = gis.content.search('title:Multispectral Landsat', 'Imagery Layer', outside_org=True)[0]
landsat = landsat_item.layers[0]
landsat_item
Multispectral Landsat
Landsat 8 OLI, 30m multispectral and multitemporal 8-band imagery, with on-the-fly renderings and indices. This imagery layer is sourced from the Landsat on AWS collections and is updated daily with new imagery.Imagery Layer by esri_livingatlas
Last Modified: April 24, 2019
0 comments, 2 views

Search for India State Boundaries 2018 layer in ArcGIS Online. This layer has all the District boundaries for India at index - 2.

boundaries = gis.content.search('India District Boundaries 2018', 'Feature Layer', outside_org=True)[2]
boundaries
India District Boundaries 2018
This layer shows the District level boundaries of India in 2018. The boundaries are optimized to improve Data Enrichment analysis performance.Feature Layer Collection by esri_livingatlas
Last Modified: April 24, 2019
0 comments, 0 views
district_boundaries = boundaries.layers[2]
district_boundaries
<FeatureLayer url:"https://pythonapi.playground.esri.com/portal/sharing/servers/12fc8ff238e7420ea43906a9a33a9aab/rest/services/IND_Boundaries_2018/FeatureServer/2">
m = gis.map('Saharanpur, India')
m.add_layer(district_boundaries)
m.legend = True

As this notebook is to detect settlements for Saharanpur district, you can filter the boundary for Saharanpur.

area = geocode("Saharanpur, India", out_sr=landsat.properties.spatialReference)[0]
landsat.extent = area['extent']

We want to detect settlements for Saharanpur district, so we will apply a query to the boundary layer, by setting "OBJECTID = 09132". The code below brings imagery and feature layer on the same extent.

saharanpur = district_boundaries.query(where='ID=09132')   # query for Saharanur district boundary
saharanpur_geom = saharanpur.features[0].geometry          # Extracting geometry of Saharanpur district boundary   
saharanpur_geom['spatialReference'] = {'wkid':4326}        # Set the Spatial Reference 
saharanpur.features[0].extent = area['extent']             # Set the extent

Get the training points for training the classifier. These are 212 points in total marked against 5 different classes namely Urban, Forest, Agricultural, Water and Wasteland. These points are marked using ArcGIS pro and pulished on the gis server.

data = gis.content.search('classificationPointsSaharanpur', 'Feature Layer')[0]
data
classificationPointsSaharanpur
Feature Layer Collection by api_data_owner
Last Modified: January 21, 2020
0 comments, 3 views

Filter satellite Imagery based on cloud cover and time duration

In order to produce good results, it is important to select cloud free imagery from the image collection for a specified time duration. In this example we have selected all the images captured between 1 October 2018 to 31 December 2018 with cloud cover less than or equal to 5% for Saharanpur region.

selected = landsat.filter_by(where="(Category = 1) AND (cloudcover <=0.05)",
                             time=[datetime(2018, 10, 1), 
                                   datetime(2018, 12, 31)],
                             geometry=arcgis.geometry.filters.intersects(area))
df = selected.query(out_fields="AcquisitionDate, GroupName, CloudCover, DayOfYear", 
                    order_by_fields="AcquisitionDate").sdf
df['AcquisitionDate'] = pd.to_datetime(df['AcquisitionDate'], unit='ms')
df
OBJECTIDAcquisitionDateGroupNameCloudCoverDayOfYearShape_LengthShape_AreaSHAPE
023607482018-10-02 05:18:12LC81460392018275LGN00_MTL0.0411275863211.7509254.652535e+10{'rings': [[[8842440.179299999, 3627905.796800...
123607492018-10-02 05:18:36LC81460402018275LGN00_MTL0.0007275851821.8089284.530870e+10{'rings': [[[8799651.220800001, 3442773.486500...
223652292018-10-09 05:24:26LC81470392018282LGN00_MTL0.0235282864249.0054564.664003e+10{'rings': [[[8670522.289099999, 3627673.343800...
323652302018-10-09 05:24:50LC81470402018282LGN00_MTL0.0104282852038.7096614.533114e+10{'rings': [[[8627077.993299998, 3442759.749600...
423637652018-10-25 05:24:31LC81470392018298LGN00_MTL0.0034298864003.4800544.661440e+10{'rings': [[[8668674.6948, 3627854.217100002],...
523637662018-10-25 05:24:55LC81470402018298LGN00_MTL0.0084298852052.9263724.533265e+10{'rings': [[[8625255.243900001, 3442931.826300...
623747792018-10-27 05:12:34LC81450402018300LGN00_MTL0.0003300851500.2963164.527314e+10{'rings': [[[8969606.0125, 3442791.7321000025]...
723823622018-11-10 05:24:34LC81470392018314LGN00_MTL0.0009314863994.2765194.661226e+10{'rings': [[[8669263.2619, 3627728.654100001],...
823839682018-11-12 05:12:36LC81450402018316LGN00_MTL0.0001316851603.2540734.528346e+10{'rings': [[[8969738.007399999, 3442925.248800...
923913172018-11-26 05:24:34LC81470392018330LGN00_MTL0.0031330863991.4153604.661195e+10{'rings': [[[8669961.7306, 3627710.4842000008]...
1023913192018-11-26 05:24:58LC81470402018330LGN00_MTL0.0106330851938.3906264.532084e+10{'rings': [[[8626659.587900002, 3442921.529799...
1123925312018-12-05 05:18:45LC81460402018339LGN00_MTL0.0005339851973.1996214.532425e+10{'rings': [[[8799698.1228, 3442772.1602], [874...
1223977332018-12-14 05:12:32LC81450402018348LGN00_MTL0.0100348851383.8576614.526085e+10{'rings': [[[8972363.073399998, 3442845.388999...
1324011232018-12-21 05:18:44LC81460402018355LGN00_MTL0.0222355851916.7412764.531834e+10{'rings': [[[8800469.5092, 3442744.7710999995]...
1424011772018-12-28 05:24:31LC81470392018362LGN00_MTL0.0019362863720.2249064.658332e+10{'rings': [[[8671754.256099999, 3627529.248599...
1524036902018-12-30 05:12:33LC81450402018364LGN00_MTL0.0086364851598.7682184.528268e+10{'rings': [[[8972212.594099998, 3442761.256499...

We can now select the image dated 02 October 2018 from the collection using its OBJECTID which is "2360748"

saharanpur_image = landsat.filter_by('OBJECTID=2360748') # 2018-10-02 
m = gis.map('Saharanpur, India')
m.add_layer(apply(saharanpur_image, 'Natural Color with DRA'))
m.add_layer(saharanpur)
m.legend = True
m

As our area of interest is larger and can not be enclosed within the tile we have chosen, let's make a mosaic of two images to enclose the AOI. Mosaicking two rasters with OBJECTID - 2360748 and 2365229 using predefined mosaic_by function.

landsat.mosaic_by(method="LockRaster",lock_rasters=[2360748,2365229])
mosaic = apply(landsat, 'Natural Color with DRA')
mosaic
<ImageryLayer url:"https://landsat2.arcgis.com/arcgis/rest/services/Landsat/MS/ImageServer">

We will clip the mosaicked raster using the boundary of Saharanpur which is our area of interest using clip function.

saharanpur_clip = clip(landsat, saharanpur_geom)
saharanpur_clip.extent = area
arcgis.env.analysis_extent = area
saharanpur_clip
<ImageryLayer url:"https://landsat2.arcgis.com/arcgis/rest/services/Landsat/MS/ImageServer">

Visualize the mosaic and training points on map

m = gis.map('Saharanpur, India')
m.add_layer(mosaic)
m.add_layer(saharanpur)
m.add_layer(data)
m.legend = True
m

Classification

With the Landsat layer and the training samples, we are now ready to train a support vector machine (SVM) model using train_classifier function.

classifier_definition = train_classifier(input_raster=saharanpur_clip, 
                                         input_training_sample_json=data.layers[0].query().to_json, 
                                         classifier_parameters={"method":"svm",
                                                                "params":{"maxSampleClass":100}},
                                         gis=gis)

Now we have the model, we are ready to apply the model to a landsat image of Saharanpur district to make prediction. This classifier will classify each pixel of the image into 5 different classes namely Urban, Forest, Agricultural, Water and Wasteland. We will use predefined classify function to perform the classification.

classified_output = classify(input_raster=saharanpur_clip, 
                             input_classifier_definition=classifier_definition)

Now let's visualize the classification result based on the colormap defined below.

cmap = colormap(classified_output.layers[0],
                 colormap=[[0, 255, 0, 0],
                          [1, 15, 242, 45],
                          [2, 255, 240, 10],
                          [3, 0, 0, 255],
                          [4, 176, 176, 167]])
map2 = gis.map('Saharanpur, India')
map2.add_layer(cmap)
map2.legend = True
map2

Here, '0', '1', '2', '3' and '4' represents pixels from 'Urban', 'Forest', 'Agricultural', 'Water' and 'Wasteland' class respectively.

We can see that our model is performing reasonably well and is able to clearly identify the urban settlements while comparing the prediction with the Landsat image.

Mask settlements out of the classified raster

We have a classified raster which has each of its pixel classified against 5 different classes. But we require the pixels belonging to Urban class only, so we will mask all the pixels belonging to that class.

classified_output = gis.content.search('Classify_AS81ZD')[0]
classified_output

As we are concerned only with detecting settlement pixels, we will mask out others from the classified raster by selecting the pixels with a class value of '0' which represents 'Urban' class.

masked = colormap(mask(classified_output.layers[0],
                       no_data_values =[],
                       included_ranges =[0,0]),
                  colormap=[[0, 255, 0, 0]],
                  astype='u8')
masked.extent = area
masked

We can save the masked raster containing the settlements to an imagery layer. This uses distributed raster analytics and performs the processing at the source pixel level and creates a new raster product.

masked = masked.save()

We can now Convert the masked pixels to polygons using 'to_features' function.

urban_item = masked.layers[0].to_features(output_name='masked_polygons' + str(datetime.now().microsecond) ,
                                          output_type='Polygon',
                                          field='Value',
                                          simplify=True,
                                          gis=gis)
urban_layer = urban_item.layers[0]
m2 = gis.map('Saharanpur, India')
m2.legend = True
m2.add_layer(urban_item)
m2

The settlement polygons have many false positives - mainly there are areas near river bed that are confused with settlements.

These false positives were removed by post processing using ArcGIS Pro. The result obtained after cleaning up the supervised classification are shown below, which are the final detected settlements.

settlement_poly = gis.content.search('settlement_polygons')[0]
settlement_poly

Deep Learning

The deep learning approach for classification needs more training data. We will use the results that we obtained after cleaning up false positives from SVM predictions above, and exported training chips against each of those polygons using ArcGIS pro to train a deep learning model.

Export training data

Export training data using 'Export Training data for deep learning' tool, detailed documentation here

  • Set Landsat imagery as 'Input Raster'.
  • Set a location where you want to export the training data, it can be an existing folder or the tool will create that for you.
  • Set the settlement training polygons as input to the 'Input Feature Class Or Classified Raster' parameter.
  • In the option 'Input Mask Polygons' we can set a mask layer to limit the tool to export training data for only those areas which have buildings in it, we created one by generating a grid of 200m by 200m on Urban Polygons layer's extent and dissolving only those polygons which contained settlements to a single multipolygon feature.
  • 'Tile Size X' & 'Tile Size Y' can be set to 224.
  • Select 'Classified Tiles' as the 'Meta Data Format' because we are training an 'Unet Model'.
  • In 'Environments' tab set an optimum 'Cell Size'. For this example, as we have performing the analysis on Landsat8 imagery, we used 30 cell size which meant 30m on a project coordinate system.

Prepare data

We would specify the path to our training data and a few hyper parameters.

  • path: path of folder containing training data.
  • batch_size: No of images your model will train on each step inside an epoch, it directly depends on the memory of your graphic card. 8 worked for us on a 11GB GPU.
  • imagery_type: It is a mandatory input to enable model for multispectral data processing. It can be "landsat8", "sentinel2", "naip", "ms" or "multispectral".
import os
from pathlib import Path

gis = GIS('home')
training_data = gis.content.get('b9602a576cd54bc6a8e0279a290537af')
training_data
detecting_settlements_using_supervised_classification_and_deep_learning
Image Collection by api_data_owner
Last Modified: September 03, 2020
0 comments, 1 views
filepath = training_data.download(file_name=training_data.name)
import zipfile
with zipfile.ZipFile(filepath, 'r') as zip_ref:
    zip_ref.extractall(Path(filepath).parent)
data_path = Path(os.path.join(filepath.split('.')[0]))
# Prepare Data
data = prepare_data(data_path,
                    batch_size=8,
                    imagery_type='landsat8')

For our model, we want out of the 11 bands in Landsat 8 image, weights for the RGB bands to be initialized using ImageNet weights whereas the other bands should have their weights initialized using the red band values. We can do this by setting the "type_init" parameter in the environment as "red band".

arcgis.env.type_init = 'red_band'

Visualize training data

The code below shows a few samples of our data with the same symbology as in ArcGIS Pro.

  • rows: No. of rows we want to see the results for.
  • alpha: controls the opacity of labels (Classified imagery) over the background imagery
  • statistics_type: for dynamic range adjustment
data.show_batch(rows=2, alpha=0.7, statistics_type='DRA')
<Figure size 1080x720 with 6 Axes>

Here, the light blue patches show the labelled settlements.

Load model architecture

We will be using U-net [4], one of the well recognized image segmentation algorithm, for our land cover classification. U-Net is designed like an auto-encoder. It has an encoding path (“contracting”) paired with a decoding path (“expanding”) which gives it the “U” shape. However, in contrast to the autoencoder, U-Net predicts a pixelwise segmentation map of the input image rather than classifying the input image as a whole. For each pixel in the original image, it asks the question: “To which class does this pixel belong?”. U-Net passes the feature maps from each level of the contracting path over to the analogous level in the expanding path. These are similar to residual connections in a ResNet type model, and allow the classifier to consider features at various scales and complexities to make its decision.

# Create Unet Model
model = UnetClassifier(data)

Train the model

By default, the U-Net model uses ImageNet weights which are trained on 3 band imagery but we are using the multispectral support of ArcGIS API for Python to train the model for detecting settlements on 11 band Landsat 8 imagery, so unfreezing will make the backbone trainable so that it can learn to extract features from multispectral data.

Before training the model, let's unfreeze it to train it from scratch.

model.unfreeze()

Learning rate is one of the most important hyperparameters in model training. Here, we explore a range of learning rate to guide us to choose the best one.

# Find Learning Rate
model.lr_find()
<Figure size 432x288 with 1 Axes>
0.0002754228703338166

Based on the learning rate plot above, we can see that the learning rate suggested by lr_find() for our training data is 0.0002754228703338166. We can use it to train our model.

We would fit the model for 100 epochs. One epoch mean the model will see the complete training set once.

# Training
model.fit(epochs=100, lr=0.00022908676527677726);
epochtrain_lossvalid_lossaccuracytime
01.3725140.5018060.92385300:17
10.7857020.2427980.92478600:14
20.5059670.1736970.95414900:14
30.3658350.1703690.96038000:15
40.2734820.1568400.96127900:15
50.2170280.1216260.95921500:15
60.1705340.0967310.96884400:15
70.1406860.0905050.97298400:17
80.1237340.0896140.96622200:17
90.1093040.0880850.96615700:17
100.1059200.0736590.97594500:17
110.0935540.0662350.97434300:17
120.0850370.0657090.97359600:17
130.0896340.0918280.97374600:17
140.0932210.0945440.96769700:17
150.0998590.0935860.96551800:18
160.0916740.0567630.98125600:16
170.0873820.0584760.97768600:16
180.0809620.0513070.98128500:16
190.0762810.0587400.97665100:16
200.0719870.0510910.98084900:17
210.0763980.0557130.97818800:16
220.0734600.0542500.98125300:16
230.0737230.0611810.97603700:16
240.0708140.0501840.98220800:16
250.0684050.0507810.98230400:16
260.0686760.0526280.98051500:17
270.0684460.0593680.97980100:16
280.0667320.0524060.98160700:16
290.0693640.0575140.98175000:17
300.0692800.0943940.96849300:16
310.0719620.0703430.97334400:16
320.0679150.0538730.98142700:16
330.0680140.0690220.97878300:17
340.0701440.0536390.98015900:17
350.0686990.0502120.98187000:17
360.0629600.0487930.98204200:16
370.0597890.0486750.98147000:17
380.0670270.0545670.97993600:17
390.0671930.0497690.98248900:17
400.0652550.0531060.97978700:17
410.0641920.0594770.97726800:18
420.0635010.0543790.98135100:18
430.0619190.0529280.98094500:17
440.0583380.0490310.98140700:17
450.0580450.0503860.98306000:16
460.0591590.0509750.98099200:16
470.0579460.0496330.98147500:17
480.0556350.0461480.98293500:17
490.0548050.0465910.98251200:17
500.0553990.0486540.98292400:18
510.0610260.0497130.98298900:17
520.0603420.0603450.98241900:17
530.0579110.0473780.98225400:17
540.0572080.0452880.98334600:17
550.0566030.0531260.98021200:16
560.0554580.0470080.98335200:16
570.0550160.0475030.98346600:16
580.0558640.0462390.98354500:16
590.0536600.0442200.98362500:16
600.0538310.0505460.98032100:16
610.0535390.0464880.98356700:16
620.0523270.0445330.98385400:16
630.0521480.0459450.98277000:16
640.0532290.0450450.98330200:17
650.0548150.0505800.98126100:17
660.0548480.0498360.98100400:17
670.0537700.0455600.98332900:17
680.0510000.0440230.98356700:17
690.0506620.0445350.98367900:16
700.0508170.0433980.98390300:17
710.0504610.0429220.98402400:17
720.0514080.0448300.98366800:17
730.0499050.0476970.98220400:16
740.0501020.0455140.98353800:17
750.0499530.0472180.98230700:16
760.0494590.0453150.98330700:16
770.0489640.0437590.98387800:16
780.0486360.0447200.98342500:16
790.0478180.0449030.98335200:16
800.0477090.0455420.98300300:16
810.0489660.0450460.98337400:17
820.0475630.0463670.98253700:17
830.0482920.0440360.98385700:16
840.0471850.0442300.98384900:16
850.0459700.0439590.98344800:17
860.0472500.0434830.98400200:17
870.0474000.0440480.98378400:17
880.0470040.0437710.98381900:17
890.0489610.0444200.98345400:17
900.0489940.0441710.98350600:17
910.0486430.0442210.98358000:17
920.0480480.0442520.98366100:16
930.0460490.0439510.98367600:17
940.0465170.0437980.98375500:17
950.0455870.0437570.98371700:16
960.0470890.0436930.98374200:17
970.0458980.0437100.98384000:17
980.0447070.0437010.98382700:16
990.0449150.0438230.98385900:17
model.mIOU()
100.00% [3/3 00:23<00:00]
{'0': 0.9821453002296315, '1': 0.7511354937467614}

Here, we can see loss on both training and validation set decreasing, which shows how well the model generalizes on unseen data thereby preventing overfitting. Also, as we can see our model here is able to classify pixels to settlement or non settlement with an accuracy 0.98. Once we are satisfied with the accuracy, we can save the model.

Save the model

We would now save the model which we just trained as a 'Deep Learning Package' or '.dlpk' format. Deep Learning package is the standard format used to deploy deep learning models on the ArcGIS platform. UnetClassifier models can be deployed by the tool 'Classify Pixels Using Deep Learning' available in ArcGIS Pro as well as ArcGIS Enterprise. For this sample, we will be using this model in ArcGIS Pro to extract settlements.

We will use the save() method to save the model and by default it will be saved to a folder 'models' inside our training data folder itself.

model.save('100e')

Load an intermediate model to train it further

If we need to further train an already saved model we can load it again using the code below and repeat the earlier mentioned steps of training.

# Load Model from previous saved files 
# model.load('100e')

Preview results

The code below will pick a few random samples and show us ground truth and respective model predictions side by side. This allows us to validate the results of your model in the notebook itself. Once satisfied we can save the model and use it further in our workflow.

model.show_results(rows=2)
<Figure size 720x720 with 4 Axes>

Deploy model and extract settlements

The model saved in previous step can be used to extract classified raster using 'Classify Pixels Using Deep Learning' tool. Further the classified raster is regularised and finally converted to a vector Polygon layer. The regularisation step uses advanced ArcGIS geoprocessing tools to remove unwanted artifacts in the output.

Generate a classified raster

In this step, we will generate a classified raster using 'Classify Pixels Using Deep Learning' tool available in both ArcGIS Pro and ArcGIS Enterprise.

  • Input Raster: The raster layer from which we want to extract settlements from.
  • Model Definition: It will be located inside the saved model in 'models' folder in '.emd' format.
  • Padding: The 'Input Raster' is tiled and the deep learning model classifies each individual tile separately before producing the final 'Output Classified Raster'. This may lead to unwanted artifacts along the edges of each tile as the model has little context to predict accurately. Padding as the name suggests allows us to supply some extra information along the tile edges, this helps the model to predict better.
  • Cell Size: Should be close to at which we trained the model, we specified that at the Export training data step .
  • Processor Type: This allows to control whether the system's 'GPU' or 'CPU' would be used in to classify pixels, by 'default GPU' will be used if available.
  • Parallel Processing Factor: This allows us to scale this tool, this tool can be scaled on both 'CPU' and 'GPU'. It specifies that the operation would be spread across how many 'cpu cores' in case of cpu based operation or 'no of GPU's' in case of GPU based operation.

Output of this tool will be in form of a 'classified raster' containing both background and Settlements. The pixels having background class are shown in green color and pixels belonging to settlement class are in purple color.

                                        A subset of preditions by our Model

Change the symbology of the raster by setting the second class of the classified pixels to 'No color'.

After this step, we will get a masked raster which will look like the screenshot below.

We can now run 'Raster to Polygon' tool in ArcGIS Pro to convert this masked raster to polygons.

Results obtained from Deep Learning approach are ready to be analyzed.

             Deep Learning Results

Conclusion

This notebook shows how traditional algorithms like Support Vector Machines can be used to generate the necessary quality data for Deep Learning models which after cleaning can serve as the training data required by a DL model for generating accurate results at a much larger extent.

References

[1] https://developers.arcgis.com/python/sample-notebooks/detecting-swimming-pools-using-satellite-image-and-deep-learning/

[2] https://developers.arcgis.com/python/sample-notebooks/extracting-building-footprints-from-drone-data/

[3] https://developers.arcgis.com/python/sample-notebooks/land-cover-classification-using-unet/

[4] Olaf Ronneberger, Philipp Fischer, Thomas Brox: U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015; arXiv:1505.04597.

Your browser is no longer supported. Please upgrade your browser for the best experience. See our browser deprecation post for more details.