Landcover mapping using hyperspectral imagery and deep learning

Introduction

Generally, multispectral imagery is preferred for Land Use Land Cover (LULC) classification due to its high temporal resolution and broad spatial coverage. With advances in remote sensing technology, hyperspectral imagery has also become a good option for LULC classification due to its high spectral resolution. The main difference between multispectral and hyperspectral imagery is the number of bands and how narrow those bands are. One advantage hyperspectral sensors have over multispectral sensors is the ability to differentiate within classes: because their rich spectral information produces a unique spectral curve for each class, hyperspectral sensors can distinguish, for example, between individual tree or crop species.

Multispectral imagery

Hyperspectral imagery

In this notebook, we will use hyperspectral data to train a deep learning model and see whether it can extract subclasses of two LULC classes: developed areas and forests. Hyperion imagery is used in the current analysis to classify the types of forests and developed areas. The data can be downloaded from USGS Earth Explorer.

Hyperion data preparation

The Earth Observing-1 (EO-1) satellite was launched November 21, 2000 as a one-year technology demonstration/validation mission. After the initial technology mission was completed, NASA and the USGS agreed to the continuation of the EO-1 program as an Extended Mission. The EO-1 Extended Mission is chartered to collect and distribute Hyperion hyperspectral and Advanced Land Imager (ALI) multispectral products according to customer tasking requests.

Hyperion collects 220 unique spectral channels ranging from 0.357 to 2.576 micrometers with a 10-nm bandwidth. The instrument operates in a pushbroom fashion, with a spatial resolution of 30 meters for all bands. The standard scene width is 7.7 kilometers.

Download Hyperion imagery

Log in to Earth Explorer using your USGS credentials. Then, select the Address/Place option in the Geocoding Method dropdown and enter the name or address of the area of interest.

Draw a polygon over the area of interest and change the Cloud Cover Range to 0% - 10%. Click on Results in the bottom left.

Next, expand EO-1, select EO-1 Hyperion, and click on Results.

Download the data from the product list.

Pre-processing of Hyperion imagery

Remove bad bands

Not all bands are useful for analysis. Bad bands are primarily water vapor absorption bands that carry information about the atmosphere rather than the land surface and cause spikes in the reflectance curve. The bad bands of the Hyperion L1R product are listed below; a small bookkeeping sketch follows the list.

1 to 7 - Not illuminated

58 to 78 - Overlap region

120 to 132 - Water vapor absorption band

165 to 182 - Water vapor absorption band

185 to 187 - Identified by Hyperion bad band list

221 to 224 - Water vapor absorption band

225 to 242 - Not illuminated

Bands with vertical striping - (8-9, 56-57, 78-82, 97-99, 133-134, 152-153, 188, 213-216, 219-220)
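
The band ranges above can be encoded in Python to keep track of which bands remain usable. Below is a minimal bookkeeping sketch, assuming the 242-band numbering of the L1R product; it is for orientation only, and the composite definitions in the next step follow the original workflow.

# Sketch: derive the usable Hyperion L1R bands by removing the
# bad-band ranges listed above.
bad_ranges = [
    (1, 7),      # not illuminated
    (58, 78),    # overlap region
    (120, 132),  # water vapor absorption
    (165, 182),  # water vapor absorption
    (185, 187),  # Hyperion bad band list
    (221, 224),  # water vapor absorption
    (225, 242),  # not illuminated
    # bands with vertical striping
    (8, 9), (56, 57), (78, 82), (97, 99), (133, 134),
    (152, 153), (188, 188), (213, 216), (219, 220),
]

bad_bands = {b for lo, hi in bad_ranges for b in range(lo, hi + 1)}
good_bands = [b for b in range(1, 243) if b not in bad_bands]
print(f'{len(good_bands)} usable bands remain')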

Create composite rasters

Two sets of composite rasters were created using the Composite Bands function. The band ranges of the two raster composites are as follows:

VNIR (visible and near-infrared) composite - bands 9 to 55

SWIR composite - bands 82-96, 100-119, 135-164, 183-184, 189-212, 217-218

Click on the Imagery tab -> click Raster Functions -> search for "Composite Bands" -> click the browse button and select the bands -> after selecting the bands, click Create new layer.

Convert DN values to at-sensor radiance

Scale factors will be used to convert the pixel DN value to at-sensor radiance. A scale factor of 40 will be used for the VNIR composite raster and a scale factor of 80 will be used for the SWIR composite. The rasters will be divided by the scale factor using the Raster Calculator tool in ArcGIS Pro.
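
For a scripted alternative to the Raster Calculator, the same division can be expressed with arcpy map algebra. This is a minimal sketch assuming a Spatial Analyst license; the layer names vnir_composite and swir_composite and the output paths are hypothetical.

import arcpy
from arcpy.sa import Raster

arcpy.CheckOutExtension('Spatial')

# Divide each composite by its scale factor to convert DN to at-sensor radiance
vnir_radiance = Raster('vnir_composite') / 40.0  # VNIR scale factor
swir_radiance = Raster('swir_composite') / 80.0  # SWIR scale factor

vnir_radiance.save(r'C:\data\vnir_radiance.tif')  # hypothetical output paths
swir_radiance.save(r'C:\data\swir_radiance.tif')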

Create combined composite raster

Combine the VNIR and SWIR radiance composite rasters using the "Composite Bands" function.
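
The same step can be scripted with the Composite Bands geoprocessing tool. A minimal sketch, assuming the radiance rasters saved in the previous step (paths are hypothetical):

import arcpy

# Stack the VNIR and SWIR radiance rasters into a single multiband raster
arcpy.management.CompositeBands(
    r'C:\data\vnir_radiance.tif;C:\data\swir_radiance.tif',
    r'C:\data\combined_composite.tif')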

Eliminate NoData pixels

Create null raster

The output raster has a pixel value of 0 for NoData, which is shown as a black pixel. These NoData pixels can be eliminated using the Set Null tool; a scripted sketch follows the parameter list below.

Use the following parameters:

Raster: combined_composite

False raster = 0

Cellsize Type = Max Of

Extent Type = Intersection Of
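
An approximately equivalent scripted version uses the Spatial Analyst SetNull function, which replaces pixels matching a query with NoData. A minimal sketch, assuming the combined composite layer is named combined_composite; the output path is hypothetical.

import arcpy
from arcpy.sa import SetNull

arcpy.CheckOutExtension('Spatial')

# Where the pixel value is 0, output NoData; elsewhere keep the original value
null_raster = SetNull('combined_composite', 'combined_composite', 'VALUE = 0')
null_raster.save(r'C:\data\null_raster.tif')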

Clip

The Clip tool will be used to eliminate instances of NoData from the combined composite raster, using the null raster created in the previous step; a scripted sketch follows the parameter list below.

The input parameters are as follows:

Raster: combined_composite

Clipping Type: Outside

Clipping Geometry/Raster: null_raster
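
A roughly analogous scripted approach uses the Clip geoprocessing tool with the null raster as the clipping template. This sketch is one possible equivalent of the raster-function workflow above, not an exact reproduction; the output path is hypothetical.

import arcpy

# Clip the combined composite to the extent of the null raster, treating 0 as NoData
arcpy.management.Clip(
    'combined_composite', '#',
    r'C:\data\clip_combined_composite.tif',
    in_template_dataset='null_raster',
    nodata_value='0')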

Principal Component Analysis

Principal component analysis can be used to determine the comparatively higher-information bands among the Hyperion bands retained above. The Principal Components tool performs principal component analysis (PCA) on a set of raster bands and generates a single multiband raster as its output. The value specified for the number of principal components determines the number of principal component bands in the output multiband raster, and it must not be larger than the total number of bands in the input raster.

The input parameters are as follows:

Input raster bands: Clip_combined_composite

Number of Principal components: 12 (12 principal components were chosen because they capture more than 98% of the total variance)

The output raster will have 12 bands representing the 12 principal components. A scripted sketch of this step follows.
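
The Spatial Analyst PrincipalComponents function offers a scripted alternative. A minimal sketch, assuming the clipped composite layer is named Clip_combined_composite; the output paths are hypothetical.

import arcpy
from arcpy.sa import PrincipalComponents

arcpy.CheckOutExtension('Spatial')

# Reduce the clipped composite to 12 principal component bands
pca = PrincipalComponents('Clip_combined_composite', 12,
                          r'C:\data\pca_stats.txt')  # optional eigenvalue stats file
pca.save(r'C:\data\pca_raster.tif')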

Export training data for deep learning model

from arcgis.gis import GIS
gis = GIS('home')
ent_gis = GIS('https://pythonapi.playground.esri.com/portal', 'arcgis_python', 'amazing_arcgis_123')

The raster generated from the Hyperion imagery through the Principal Components tool will be used as the input raster for the training data.

pca_raster = ent_gis.content.get('1bb227a836094127b530cf8db3a7fa4d')
pca_raster

The following feature layer will be used as labels for training the model.

lulc_polygons = ent_gis.content.get('f69011a01c39494bade9b30708bce064')
lulc_polygons
m = gis.map('USA', 4)
m.add_layer(lulc_polygons)
m.legend=True
m

The above layer consists of 7 classes. Developed areas have three subclasses (high, medium, and low density), and forested areas have three types of forest (evergreen, mixed, and deciduous). NoData represents all other LULC classes.

m.zoom_to_layer(lulc_polygons)

A feature layer will be used as an input mask polygon when exporting the training data, to define the extent of the study area.

mask_poly = ent_gis.content.get('bfe7bb7ea8454d9384cb8005a5c0f855')
mask_poly

The Export Training Data For Deep Learning tool is used to prepare training data for training a deep learning model. The tool is available in both ArcGIS Pro and ArcGIS Enterprise; a scripted sketch is shown below.
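
In scripted form, the export can be run with arcpy.ia.ExportTrainingDataForDeepLearning. The parameters below are illustrative: the 128 px chip size, 64 px stride, and Classified_Tiles metadata format mirror the naming of the training dataset used later in this notebook, while the paths, layer names, and class value field are hypothetical.

import arcpy

arcpy.ia.ExportTrainingDataForDeepLearning(
    in_raster='pca_raster',                    # 12-band PCA raster
    out_folder=r'C:\data\training_data',
    in_class_data='lulc_polygons',             # labeled LULC polygons
    image_chip_format='TIFF',
    tile_size_x=128, tile_size_y=128,          # 128 px chips
    stride_x=64, stride_y=64,                  # 64 px stride
    output_nofeature_tiles='ONLY_TILES_WITH_FEATURES',
    metadata_format='Classified_Tiles',        # per-pixel labels
    class_value_field='ClassValue',            # hypothetical field name
    in_mask_polygons='mask_poly')              # study-area mask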

Model training

This step will utilize a Jupyter notebook; documentation on how to install and set up the environment is available here.

Necessary Imports

import os
from pathlib import Path

import arcgis
from arcgis.learn import prepare_data, UnetClassifier

Get training data

We have already exported the data, and it can be directly downloaded using the following steps:

training_data = gis.content.get('0a8ff4ce9f734bb5bb72104aea8526af')
training_data
filepath = training_data.download(file_name=training_data.name)
import zipfile
with zipfile.ZipFile(filepath, 'r') as zip_ref:
    zip_ref.extractall(Path(filepath).parent)
data_path = Path(os.path.splitext(filepath)[0])

Visualize training data

The prepare_data function takes a training data path as input and creates a fast.ai databunch with specified transformation, batch size, split percentage, etc.

data = prepare_data(data_path,  
                    batch_size=8)

To get a sense of what the training data looks like, use the show_batch() method to randomly pick a few training chips and visualize them. The chips are overlaid with masks representing the land cover class of each pixel.

data.show_batch(alpha=1)

Load model architecture

arcgis.learn provides the UnetClassifier model for per-pixel classification, which is based on a pretrained convnet, such as ResNet, that acts as the 'backbone'. More details about UnetClassifier can be found here.

model = UnetClassifier(data, pointrend=True)

Train the model

Learning rate is one of the most important hyperparameters in model training. We will use the lr_find() method to find an optimum learning rate at which we can train a robust model.

lr = model.lr_find()

We will use the suggested learning rate above to train the model. In total, the model is trained for 2,000 epochs; the first 100 epochs are run below.

model.fit(100, lr=lr)
epoch  train_loss  valid_loss  accuracy  dice      time
0      4.895883    2.304068    0.184192  0.133586  00:08
1      4.682844    2.228730    0.204355  0.134520  00:08
2      4.247415    2.247510    0.224069  0.164235  00:07
3      4.054215    2.261938    0.227853  0.208221  00:07
4      3.838150    2.207842    0.228400  0.144939  00:07
5      3.580566    2.180077    0.225253  0.128522  00:07
6      3.336041    2.150646    0.207254  0.175601  00:07
7      3.120762    2.073225    0.214026  0.152479  00:07
8      2.933001    2.045955    0.241302  0.153951  00:07
9      2.789467    2.084206    0.198224  0.173183  00:07
10     2.647619    1.934795    0.244913  0.174198  00:07
11     2.512081    1.907520    0.255222  0.159074  00:07
12     2.392307    1.823057    0.275607  0.213589  00:07
13     2.285432    1.765714    0.303430  0.194124  00:07
14     2.188699    1.735160    0.308035  0.242232  00:07
15     2.096217    1.696431    0.320136  0.239950  00:07
16     2.017443    1.629467    0.335706  0.289121  00:07
17     1.949237    1.587258    0.353876  0.290316  00:07
18     1.876262    1.538574    0.372055  0.330384  00:07
19     1.815327    1.527717    0.379276  0.334173  00:07
20     1.760807    1.603784    0.363623  0.307923  00:07
21     1.712551    1.464810    0.397946  0.327786  00:07
22     1.660718    1.420146    0.419562  0.373972  00:07
23     1.619952    1.407624    0.419846  0.358851  00:07
24     1.578108    1.363935    0.436551  0.396460  00:07
25     1.546720    1.382313    0.427362  0.350872  00:07
26     1.514228    1.343162    0.446262  0.406849  00:07
27     1.493786    1.349320    0.437524  0.355242  00:07
28     1.468180    1.318090    0.452631  0.403168  00:07
29     1.439874    1.293341    0.460748  0.407321  00:08
30     1.412643    1.312110    0.456506  0.394984  00:07
31     1.393659    1.298682    0.461176  0.413462  00:07
32     1.374351    1.285026    0.467914  0.433531  00:07
33     1.368896    1.271524    0.472473  0.411570  00:07
34     1.349091    1.257120    0.477335  0.420728  00:08
35     1.334518    1.237478    0.482465  0.434865  00:07
36     1.322255    1.229985    0.489380  0.445741  00:07
37     1.309935    1.218811    0.491339  0.449685  00:07
38     1.298364    1.241076    0.485703  0.440830  00:07
39     1.284544    1.206009    0.501767  0.467454  00:07
40     1.271894    1.195129    0.499503  0.465736  00:08
41     1.258210    1.204229    0.496341  0.447168  00:07
42     1.249108    1.188009    0.502429  0.482135  00:07
43     1.245222    1.205836    0.495398  0.459033  00:07
44     1.236919    1.166054    0.513800  0.482971  00:07
45     1.228589    1.172657    0.508676  0.458676  00:07
46     1.220988    1.183226    0.501324  0.438410  00:07
47     1.212198    1.190025    0.504303  0.455034  00:07
48     1.209525    1.182104    0.507495  0.457776  00:07
49     1.205935    1.161246    0.514066  0.464236  00:07
50     1.204263    1.152366    0.516409  0.475130  00:07
51     1.196822    1.176271    0.508255  0.481153  00:07
52     1.188036    1.146006    0.521011  0.488024  00:07
53     1.183094    1.153377    0.518954  0.489394  00:07
54     1.178807    1.135087    0.525720  0.492636  00:07
55     1.173015    1.144235    0.523340  0.496628  00:07
56     1.168650    1.138796    0.520963  0.467020  00:07
57     1.166994    1.148650    0.521436  0.477236  00:07
58     1.168930    1.126733    0.528690  0.486403  00:07
59     1.161723    1.112772    0.533466  0.504935  00:08
60     1.157561    1.119520    0.529660  0.491302  00:07
61     1.158854    1.139832    0.519275  0.486997  00:07
62     1.155192    1.113281    0.533511  0.493383  00:07
63     1.153943    1.121855    0.529218  0.483027  00:07
64     1.148564    1.149227    0.517923  0.475069  00:07
65     1.148768    1.112661    0.533261  0.487888  00:07
66     1.145626    1.116498    0.530466  0.487929  00:07
67     1.142795    1.106350    0.535092  0.491875  00:07
68     1.142005    1.109101    0.533545  0.502020  00:07
69     1.135479    1.111173    0.534888  0.491346  00:07
70     1.135544    1.104636    0.534119  0.497073  00:07
71     1.132466    1.102835    0.537323  0.496984  00:07
72     1.131976    1.097890    0.538928  0.503210  00:07
73     1.127540    1.099265    0.537723  0.500054  00:07
74     1.120606    1.093285    0.540417  0.503302  00:07
75     1.115077    1.097111    0.540237  0.504814  00:07
76     1.114371    1.092764    0.540207  0.505019  00:07
77     1.115887    1.094283    0.540009  0.504285  00:08
78     1.116468    1.099267    0.538202  0.502507  00:07
79     1.114179    1.088499    0.543567  0.514224  00:07
80     1.112337    1.090717    0.540167  0.504081  00:07
81     1.111818    1.092220    0.539713  0.498848  00:07
82     1.109563    1.086534    0.543723  0.508096  00:07
83     1.109317    1.086858    0.543741  0.508981  00:07
84     1.109809    1.090642    0.540866  0.499541  00:07
85     1.105921    1.085104    0.543176  0.507754  00:07
86     1.105818    1.085329    0.543622  0.510307  00:07
87     1.100395    1.088507    0.542453  0.508482  00:07
88     1.101971    1.083116    0.545029  0.511915  00:07
89     1.099825    1.082875    0.545221  0.512105  00:07
90     1.096118    1.083799    0.545013  0.512355  00:07
91     1.097838    1.083607    0.544434  0.510415  00:07
92     1.094332    1.082571    0.544855  0.510526  00:07
93     1.094872    1.084903    0.543506  0.506315  00:07
94     1.095193    1.083902    0.544208  0.507852  00:07
95     1.094048    1.085175    0.543402  0.505761  00:07
96     1.096176    1.084359    0.543930  0.506712  00:07
97     1.097573    1.084882    0.543655  0.508043  00:07
98     1.097415    1.085278    0.543436  0.507103  00:07
99     1.097319    1.084194    0.543799  0.506578  00:07

We have trained the model for a further 1900 epochs to improve model performance. For the sake of time, the cell below is commented out.

# model.fit(1900)

Visualize classification results in validation set

It is good practice to compare the model's results against the ground truth. The code below picks random samples and shows the ground truth and model predictions side by side, allowing us to preview the results of the model within the notebook.

model.show_results()

Evaluate model performance

As we have 7 classes in this segmentation task, we need to perform an accuracy assessment for each class. To achieve this, the ArcGIS API for Python provides the per_class_metrics function, which calculates precision, recall, and F1 scores for each class.

model.per_class_metrics()
class                        precision  recall    f1
NoData                       0.689404   0.738998  0.713340
Deciduous Forest             0.724823   0.740948  0.732797
Developed, High Intensity    0.772128   0.782928  0.777490
Developed, Low Intensity     0.637362   0.605665  0.621109
Developed, Medium Intensity  0.637322   0.645893  0.641579
Evergreen Forest             0.730950   0.384463  0.503891
Mixed Forest                 0.600000   0.415889  0.491261

Save model

We will save the model that we trained as a 'Deep Learning Package' ('.dlpk' format). A Deep Learning Package is the standard format used to deploy deep learning models on the ArcGIS platform.

We will use the save() method to save the trained model. By default, it will be saved to the 'models' sub-folder within our training data folder.

model.save('unet_2000e', publish=True, gis=gis)
Computing model metrics...
WindowsPath('C:/data/2021/hyperspectral/hyperion_l8lulc_128px_64strd_30px_mask_frst_bltp_7classes/models/unet_2000e')

Model inference

Using ArcGIS Pro, we can use the trained model on a test image/area to classify different types of built-up areas and forests in the hyperspectral satellite image.

After we have trained the UnetClassifier model and saved the weights for classifying images, we can use the Classify Pixels Using Deep Learning tool, available in both ArcGIS Pro and ArcGIS Enterprise, for inferencing at scale.

inference_raster = ent_gis.content.get('df7781fe3d904afa914ef8804b08c802')
inference_raster

with arcpy.EnvManager(processorType="GPU"):
    out_classified_raster = arcpy.ia.ClassifyPixelsUsingDeepLearning(
        "pca_raster_for_inferencing",
        r"C:\path\to\model.dlpk",
        "padding 32;batch_size 2;predict_background True; tile_size 128",
        "PROCESS_AS_MOSAICKED_IMAGE",
        None)
    out_classified_raster.save(r"C:\sample\sample.gdb\inferenced_results")

Results visualization

The classified output raster is generated using ArcGIS Pro. The output raster is published on the portal for visualization.

inferenced_results = ent_gis.content.get('29a95588665245359d3787e48fc501cc')
inferenced_results
rgb_imagery = ent_gis.content.get('94d505a8fd084e7f896ba46247b12739')
rgb_imagery

Create map widgets

Two map widgets are created to show the RGB imagery and the inferenced results.

m1 = gis.map()
m1.add_layer(rgb_imagery)
#m1.basemap = 'satellite'
m2 = gis.map()
m2.add_layer(inferenced_results)

Synchronize web maps

The maps are synchronized with each other using the MapView.sync_navigation functionality. Map view synchronization helps in comparing the inferenced results with the RGB imagery. A detailed description of advanced map widget options can be found here.

m1.sync_navigation(m2)

Set the map layout

from ipywidgets import HBox, VBox, Label, Layout

HBox and VBox are used to set the layout of the map widgets.

hbox_layout = Layout()
hbox_layout.justify_content = 'space-around'

hb1 = HBox([Label('FCC imagery'), Label('Results')])
hb1.layout = hbox_layout

Results

The resulting predictions are provided as a map for better visualization.

VBox([hb1,HBox([m1,m2])])

Inferenced results legend

m2.zoom_to_layer(inferenced_results)

In the map widgets, it can be seen that the model is able to differentiate between forests and open spaces (gardens, parks, etc.). The different types of developed areas are also classified accurately, with the bright blue-gray patches, for instance, representing high density developed areas, and the forested areas are likewise classified correctly.

Limitation

One of the major limitations of working with hyperspectral data is its lack of availability for many parts of the world. In comparison to multispectral imagery tiles, such as those from Landsat-8, one tile of Hyperion imagery covers a relatively small area. If a training dataset were prepared from multiple tiles, a more robust model could be trained and more accurate results generated.

Conclusion

In this notebook, we have covered a lot of ground. In Part 1, we covered how Hyperion data can be downloaded using USGS Earth Explorer, the pre-processing steps for Hyperion, how to create a principal component raster, and how to export training data for deep learning using ArcGIS Pro. In Part 2, we demonstrated the steps to prepare the input data, train a pixel-based classification model, visualize the results, and generate accuracy metrics. Finally, in Part 3, we demonstrated the process of inferencing results on a test raster/area. The same workflow can be used to identify minerals/rocks, plant species, oil spills, etc. in hyperspectral imagery using arcgis.learn models.
