Detecting settlements using supervised classification and deep learning¶
- 🔬 Data Science
- 🥠 Deep Learning and pixel-based classification
Introduction¶
Deep learning methods tend to outperform classical machine learning methods when large amounts of training data are available, but this dependence on large labelled datasets is also one of their main drawbacks. These methods have been used in the geospatial domain for object detection [1,2] and land use classification [3] with favourable results, but labelling satellite imagery has always been laborious. In this notebook, we therefore use a supervised classification approach to generate the required volume of labelled data, which, after cleaning, is used to meet the deep learning model's need for a larger training set.
For this, we detect settlements in the Saharanpur district of Uttar Pradesh, India. Settlements are important to urban planners, and monitoring them over time can lay the foundation for designing urban policies for any government.
Necessary imports¶
%matplotlib inline
import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt
import arcgis
from arcgis.gis import GIS
from arcgis.geocoding import geocode
from arcgis.learn import prepare_data, UnetClassifier
from arcgis.raster.functions import clip, apply, extract_band, colormap, mask, stretch
from arcgis.raster.analytics import train_classifier, classify, list_datastore_content
Connect to your GIS¶
gis = GIS(url='https://pythonapi.playground.esri.com/portal', username='arcgis_python', password='amazing_arcgis_123')
Get the data for analysis¶
Search for the Multispectral Landsat layer in ArcGIS Online. Setting outside_org to True includes content shared by users outside our organization in the search.
landsat_item = gis.content.search('title:Multispectral Landsat', 'Imagery Layer', outside_org=True)[0]
landsat = landsat_item.layers[0]
landsat_item
Search for the India District Boundaries 2018 layer in ArcGIS Online. Within this item, the layer at index 2 contains all the district boundaries for India.
boundaries = gis.content.search('India District Boundaries 2018', 'Feature Layer', outside_org=True)[2]
boundaries
district_boundaries = boundaries.layers[2]
district_boundaries
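Selecting a layer by position assumes the item's layer order never changes. As a hedged alternative, the sketch below looks a layer up through its `properties.name` instead. The helper `find_layer_by_name`, the stand-in layer objects, and the layer names are hypothetical and only for illustration; the actual layer names in the boundaries item are not stated in this notebook and would need to be checked first.

```python
from types import SimpleNamespace


def find_layer_by_name(layers, name):
    """Return the first layer whose properties.name equals `name` (case-insensitive)."""
    wanted = name.strip().lower()
    for layer in layers:
        if layer.properties.name.strip().lower() == wanted:
            return layer
    raise LookupError(f"No layer named {name!r}")


# Stand-ins for an item's layers; these names are assumptions, not the real item's.
demo_layers = [
    SimpleNamespace(properties=SimpleNamespace(name="Country")),
    SimpleNamespace(properties=SimpleNamespace(name="State")),
    SimpleNamespace(properties=SimpleNamespace(name="District")),
]

district_layer = find_layer_by_name(demo_layers, "district")
print(district_layer.properties.name)
```

With a live connection, `boundaries.layers` could be passed in place of `demo_layers`, provided the name being searched for matches what the service actually reports.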
m = gis.map('Saharanpur, India')
m.add_layer(district_boundaries)
m.legend = True
Because this notebook detects settlements only for Saharanpur district, we first geocode Saharanpur and restrict the extent of the imagery layer to it.
area = geocode("Saharanpur, India", out_sr=landsat.properties.spatialReference)[0]
landsat.extent = area['extent']
We want to detect settlements for Saharanpur district, so we query the boundary layer with the where clause "ID = 09132". The code below brings the imagery and the feature layer to the same extent.
saharanpur = district_boundaries.query(where='ID=09132') # query for Saharanpur district boundary
saharanpur_geom = saharanpur.features[0].geometry # Extracting geometry of Saharanpur district boundary
saharanpur_geom['spatialReference'] = {'wkid':4326} # Set the Spatial Reference
saharanpur.features[0].extent = area['extent'] # Set the extent
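To make the notion of an extent concrete, the sketch below computes the bounding envelope of a polygon given in Esri JSON form (a dict with a `rings` list), which is the kind of `xmin`/`ymin`/`xmax`/`ymax` dictionary assigned as an extent above. The helper `rings_envelope` and the demo coordinates are hypothetical; they are not the actual Saharanpur boundary, and the notebook itself takes the extent from the geocoding result rather than computing it this way.

```python
def rings_envelope(geometry):
    """Compute the bounding extent of an Esri JSON polygon (a dict with 'rings')."""
    # Index into each point so that optional z/m values are ignored.
    xs = [pt[0] for ring in geometry["rings"] for pt in ring]
    ys = [pt[1] for ring in geometry["rings"] for pt in ring]
    return {
        "xmin": min(xs),
        "ymin": min(ys),
        "xmax": max(xs),
        "ymax": max(ys),
        "spatialReference": geometry.get("spatialReference"),
    }


# Made-up square polygon in WGS84 (wkid 4326), for illustration only.
demo_geom = {
    "rings": [[[77.1, 29.6], [77.9, 29.6], [77.9, 30.4], [77.1, 30.4], [77.1, 29.6]]],
    "spatialReference": {"wkid": 4326},
}

envelope = rings_envelope(demo_geom)
print(envelope)
```

An extent is only meaningful together with its spatial reference, so the envelope carries over the one declared on the input geometry.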
Get the training points for training the classifier. There are 212 points in total, marked against 5 different classes: Urban, Forest, Agricultural, Water and Wasteland. These points were marked using ArcGIS Pro and published on the GIS server.
data = gis.content.search('classificationPointsSaharanpur', 'Feature Layer')[0]
data
Filter satellite Imagery based on cloud cover and time duration¶
To produce good results, it is important to select cloud-free imagery from the image collection for a specified time period. In this example, we select all the images captured between 1 October 2018 and 31 December 2018 with cloud cover less than or equal to 5% for the Saharanpur region.
selected = landsat.filter_by(where="(Category = 1) AND (cloudcover <=0.05)",
                             time=[datetime(2018, 10, 1),
                                   datetime(2018, 12, 31)],
                             geometry=arcgis.geometry.filters.intersects(area))
df = selected.query(out_fields="AcquisitionDate, GroupName, CloudCover, DayOfYear",
                    order_by_fields="AcquisitionDate").sdf
df['AcquisitionDate'] = pd.to_datetime(df['AcquisitionDate'], unit='ms')
df
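The filtering above runs on the image service. As a rough offline illustration of the same logic, the sketch below applies the date window and the 5% cloud cover threshold to a handful of scene records and then picks one scene. The records, their OBJECTID values, and the "least cloudy, then earliest" selection rule are assumptions made for this example; they are not the service's data or necessarily how the image used in this notebook was chosen. The only detail taken from the notebook is that acquisition dates arrive as epoch milliseconds.

```python
from datetime import datetime, timezone


def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def from_epoch_ms(ms):
    """Convert epoch milliseconds to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical scene records standing in for the rows of the query result.
scenes = [
    {"OBJECTID": 1, "AcquisitionDate": to_epoch_ms(utc(2018, 9, 16)), "CloudCover": 0.01},
    {"OBJECTID": 2, "AcquisitionDate": to_epoch_ms(utc(2018, 10, 2)), "CloudCover": 0.02},
    {"OBJECTID": 3, "AcquisitionDate": to_epoch_ms(utc(2018, 11, 3)), "CloudCover": 0.40},
    {"OBJECTID": 4, "AcquisitionDate": to_epoch_ms(utc(2018, 12, 5)), "CloudCover": 0.05},
]

start, end = utc(2018, 10, 1), utc(2018, 12, 31)
max_cloud = 0.05  # CloudCover is a fraction, so 0.05 means 5%

# Keep scenes inside the time window with acceptable cloud cover.
candidates = [
    s for s in scenes
    if start <= from_epoch_ms(s["AcquisitionDate"]) <= end
    and s["CloudCover"] <= max_cloud
]

# Assumed selection rule: least cloudy first, earliest date as tie-breaker.
candidates.sort(key=lambda s: (s["CloudCover"], s["AcquisitionDate"]))
best = candidates[0]
print(best["OBJECTID"], from_epoch_ms(best["AcquisitionDate"]).date())
```

In practice the same ranking could be applied directly to the data frame returned by the query, using its CloudCover and AcquisitionDate columns.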
We can now select the image dated 2 October 2018 from the collection using its OBJECTID, which is "2360748".
saharanpur_image = landsat.filter_by('OBJECTID=2360748') # 2018-10-02
m = gis.map('Saharanpur, India')
m.add_layer(apply(saharanpur_image, 'Natural Color with DRA'))
m.add_layer(saharanpur)
m.legend = True
m