How AutoDL Works

Introduction

This guide explains the steps for the training and evaluation of multiple network architectures supported by the arcgis.learn API. The arcgis.learn API currently supports more than 30 deep learning networks for object detection. The API provides 4 deep learning networks, along with the MMDetection class, which supports more than twenty object detection networks. Similarly, for pixel classification, the API supports 11 deep learning networks, along with the MMSegmentation class, which supports more than twenty pixel classification networks.

To train a deep learning network using the arcgis.learn API, you must follow the complete pipeline, which involves data preprocessing, network selection, hyperparameter tuning, and evaluation of the network based on its performance.

Because it can be difficult to iteratively run and compare all of the different networks, the AutoDL class automatically trains all of the supported networks with the given data within a provided time limit and provides a performance tally for all of the networks. The AutoDL class also saves all of the networks during the training process, so they can be used later for fine-tuning to enhance network performance.

AutoDL supported Networks

Object Detection

SingleShotDetector
RetinaNet
FasterRCNN
YOLOv3
ATSS
CARAFE
CascadeRCNN
CascadeRPN
DCN
Detectors
DoubleHeads
DynamicRCNN
EmpiricalAttention
FCOS
FoveaBox
FSAF
GHM
LibraRCNN
PaFPN
PISA
RegNet
RepPoints
Res2Net
SABL
VFNet

Pixel Classification

DeepLab
UnetClassifier
PSPNetClassifier
ANN
APCNet
CCNet
CGNet
HRNet
DeepLabV3Plus
DMNet
DNLNet
EMANet
FastSCNN
FCN
GCNet
MobileNetV2
NonLocalNet
OCRNet
PSANet
SemFPN
UperNet

Implementation in `arcgis.learn`

Let's see how the AutoDL class works with arcgis.learn.

Imports

from arcgis.learn import AutoDL, prepare_data, ImageryModel

Prepare data

Prepare the data for the AutoDL class using prepare_data() in arcgis.learn. The recommended value for the batch_size parameter is None, because the AutoDL class automatically determines the batch size based on the GPU capacity.

data = prepare_data("path_to_data_folder", batch_size=None)

Train networks using AutoDL

The AutoDL class accepts the following parameters:

data (Required Parameter): The data object returned from the prepare_data function in the previous step.
total_time_limit (Optional parameter): The total time limit in hours for the AutoDL to train and evaluate the networks. This parameter becomes important when time is the main constraint to the user. The AutoDL class calculates the number of chips that can be processed in the given time from the prepared databunch.
mode (Optional Parameter): Can be "basic" or "advanced".
- basic : To be used when the user wants to train all selected networks.
- advanced : To be used when the user also wants to tune the hyperparameters of the two best performing models from the basic mode.

network (Optional Parameter): The list of models that will be used in the training process. If the user does not provide a value, the AutoDL class selects all of the supported networks. To select one or more specific networks, pass the network names as strings in a list.
- Supported Object Detection models
  - SingleShotDetector, RetinaNet, FasterRCNN, YOLOv3, ATSS, CARAFE, CascadeRCNN, CascadeRPN, DCN, Detectors, DoubleHeads, DynamicRCNN, EmpiricalAttention, FCOS, FoveaBox, FSAF, GHM, LibraRCNN, PaFPN, PISA, RegNet, RepPoints, Res2Net, SABL, VFNet
- Supported Pixel Classification models
  - DeepLab, UnetClassifier, PSPNetClassifier, ANN, APCNet, CCNet, CGNet, HRNet, DeepLabV3Plus, DMNet, DNLNet, EMANet, FastSCNN, FCN, GCNet, MobileNetV2, NonLocalNet, OCRNet, PSANet, SemFPN, UperNet

verbose (Optional Parameter): Optional Boolean. Displays logs while the networks are training. The logs show progress over time and, if a failure occurs, can be used to check which network failed, when, and why.
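Because network names are passed as plain strings, a typo is easy to make. The following is a minimal sketch of a pre-check that could be run before constructing AutoDL. The helper `check_networks` is hypothetical and not part of arcgis.learn, and the set of names is simply copied from the object detection list in this guide. The same pattern applies to the pixel classification names.

```python
# Object detection network names, copied from the list in this guide.
OBJECT_DETECTION = {
    "SingleShotDetector", "RetinaNet", "FasterRCNN", "YOLOv3", "ATSS",
    "CARAFE", "CascadeRCNN", "CascadeRPN", "DCN", "Detectors",
    "DoubleHeads", "DynamicRCNN", "EmpiricalAttention", "FCOS", "FoveaBox",
    "FSAF", "GHM", "LibraRCNN", "PaFPN", "PISA", "RegNet", "RepPoints",
    "Res2Net", "SABL", "VFNet",
}


def check_networks(requested, supported):
    """Hypothetical helper: return the requested names, or raise on unknown ones."""
    unknown = [name for name in requested if name not in supported]
    if unknown:
        raise ValueError(f"Unsupported network(s): {', '.join(unknown)}")
    return list(requested)


selected = check_networks(["RetinaNet", "YOLOv3"], OBJECT_DETECTION)
print(selected)  # ['RetinaNet', 'YOLOv3']
```

The validated list can then be supplied as the network parameter described above.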

AutoDL Training modes

Basic
- In this mode, AutoDL iterates through all of the supported networks once with the default backbone, trains each one with the passed data, and calculates the network performance. At the end of each iteration, the function saves the model to disk. Based on the allotted time, the program automatically calculates the maximum number of epochs to train each network. However, training stops if the model stops improving for 5 epochs. A minimum difference of 0.001 in the validation loss is required for it to be considered an improvement.
Advanced
- To be used when the user wants to tune the hyperparameters of the two best performing networks from the basic mode. This mode divides the total time into two halves. In the first half, it works like basic mode and iterates through all of the supported networks once. In the second half, it identifies the two best performing networks and trains them with different supported backbones. At the end of each iteration, the function saves the model to disk. Based on the allotted time, the program automatically calculates the maximum number of epochs to train each network. However, training stops if the model stops improving for 5 epochs. A minimum difference of 0.001 in the validation loss is required for it to be considered an improvement.
- In this mode, Optuna is used to tune the hyperparameters of the network.
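The stopping rule described for both modes (stop after 5 epochs without an improvement of at least 0.001 in validation loss) can be sketched in plain Python. This illustrates the rule only and is not the AutoDL implementation. The function name, the choice to compare against the best loss seen so far, and the loss values are all assumptions made for the example.

```python
def epochs_before_stop(val_losses, max_epochs, patience=5, min_delta=0.001):
    """Return how many epochs run before the early-stopping rule triggers.

    val_losses stands in for real training: one validation loss per epoch.
    """
    best = float("inf")
    stale = 0          # consecutive epochs without a sufficient improvement
    epochs_run = 0
    for loss in val_losses[:max_epochs]:
        epochs_run += 1
        if best - loss >= min_delta:
            best = loss    # improved by at least min_delta
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break      # no improvement for `patience` epochs
    return epochs_run


# Made-up losses: three real improvements, then five epochs of tiny changes.
losses = [0.90, 0.80, 0.75, 0.7495, 0.7493, 0.7499, 0.7492, 0.7491, 0.60]
print(epochs_before_stop(losses, max_epochs=20))  # 8
```

In this run the final loss of 0.60 is never reached, because the fifth consecutive non-improving epoch ends training first.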

dl = AutoDL(data, total_time_limit=5, verbose=True, mode="advanced")

When the AutoDL class is initialized, it calculates the number of images that can be processed in the given time and the time required to process all of the data. The output of the cell above can then be used to adjust the total_time_limit and network parameters when initializing the class.

Here is an example of the output:

Given time to process the dataset is: 5.0 hours
Number of images that can be processed in the given time: 290
Time required to process the entire dataset of 3000 images is 52 hours

This output shows how many images can be processed to train all of the selected networks in the selected mode within the given time. It also provides an estimate of the time that AutoDL will take to train all of the selected networks with the entire dataset.
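The numbers in the sample output are consistent with a simple linear extrapolation. The sketch below reproduces that arithmetic. It assumes processing time scales linearly with the number of images, which is an assumption for illustration, and the function name is hypothetical. How AutoDL actually derives its estimate is not described in this guide.

```python
def estimate_full_dataset_hours(time_limit_hours, images_in_limit, total_images):
    """Extrapolate linearly from the throughput implied by the time limit."""
    images_per_hour = images_in_limit / time_limit_hours
    return total_images / images_per_hour


# Values taken from the sample output above.
hours = estimate_full_dataset_hours(5.0, 290, 3000)
print(round(hours, 1))  # 51.7, which rounds to the reported 52 hours
```

A user with a fixed budget could use the same ratio in reverse to decide whether to raise total_time_limit or to train on fewer networks.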

Supported methods in AutoDL

Supported Classification Models

dl.supported_classification_models()

The output of this function will be a list of pixel classification models supported by the AutoDL class.

Supported Detection Models

dl.supported_detection_models()

The output of this function will be a list of object detection models supported by the AutoDL class.

Fit

The fit method will be used to train all of the selected networks automatically within the provided time limit.

dl.fit()

Score

This method returns an evaluation report as a dataframe. Its fields include the model's accuracy with train/validation loss, dice (for pixel classification), the learning rate used to train the model, train time, and backbone.

dl.score()

Report

dl.report()

This method returns an advanced HTML report of the networks trained by AutoDL. In basic mode, it shows the leaderboard of all the networks based on their performance during the training phase, along with important parameter details and charts for the best evaluated model. In advanced mode, it additionally shows details of all the Optuna-based trials, with the tuned hyperparameter details and the feature importance chart for each model evaluated during the advanced mode.

Show Results

This method will display the results for the best performing model.

dl.show_results()

MIOU

MIOU is the mean of intersection over union. This method calculates the mean IOU on the validation set for each class. It is only supported by pixel classification models.

dl.mIOU()
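To make the metric concrete, here is a minimal sketch of per-class intersection over union on flattened label lists, written in plain Python. It illustrates the standard definition and is not the arcgis.learn implementation. The function names and the tiny label arrays are made up for the example.

```python
def per_class_iou(y_true, y_pred, num_classes):
    """Return the IoU for each class id, or None if the class never appears."""
    pairs = list(zip(y_true, y_pred))
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        ious.append(intersection / union if union else None)
    return ious


def mean_iou(ious):
    """Average the per-class IoU values, skipping classes that never appear."""
    present = [v for v in ious if v is not None]
    return sum(present) / len(present)


# Made-up ground truth and predictions for 8 pixels and 3 classes.
truth = [0, 0, 1, 1, 1, 2, 2, 2]
pred = [0, 1, 1, 1, 2, 2, 2, 0]
ious = per_class_iou(truth, pred, num_classes=3)
print(ious[1], ious[2])  # 0.5 0.5
print(round(mean_iou(ious), 4))  # 0.4444
```

Class 0 has one overlapping pixel out of three in the union, so its IoU is 1/3, and the mean over the three classes is 4/9.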

Average Precision Score

This method computes the average of the precision on the validation set for each class. This function is only supported by object detection models.

dl.average_precision_score()
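As a rough illustration of what an average precision score measures for one class, the sketch below uses one common definition: rank the detections by confidence, then average the precision at each true positive over the number of ground-truth objects. This is an assumption for illustration. The exact formula, IoU threshold, and matching rules used by arcgis.learn are not stated in this guide, and the function name and detection values are made up.

```python
def average_precision(detections, num_ground_truth):
    """Non-interpolated AP for a single class.

    detections: list of (confidence, is_true_positive) tuples, where the flag
    says whether the detection matched a ground-truth object.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this rank
    return precision_sum / num_ground_truth


# Made-up detections for one class with 4 ground-truth objects.
detections = [(0.9, True), (0.8, False), (0.7, True), (0.6, True), (0.3, False)]
ap = average_precision(detections, num_ground_truth=4)
print(round(ap, 4))  # 0.6042
```

Here the true positives fall at ranks 1, 3, and 4, giving precisions of 1, 2/3, and 3/4. One ground-truth object is never detected, which is why the score is divided by 4 rather than 3.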

Fine tune AutoDL models using ImageryModel

Once the best performing network is identified, it can be further fine-tuned using the ImageryModel class. This class supports methods that can be used to load, fine-tune, and save the model for further use.

im = ImageryModel()

Load the model

The load method is used to load a model that the AutoDL class saved to disk. It accepts the following parameters:
- path: Path to the Esri Model Definition (emd or dlpk) file
- data: Returned data object from prepare_data function

im.load("path_to_emd_file", data)

Learning rate

The lr_find method runs the Learning Rate Finder, which helps in choosing the optimum learning rate for training the model.

im.lr_find()

Train the model

The loaded model can be trained further using the fit method. This method trains the model for the specified number of epochs while using the specified learning rates.

im.fit(10)

Save the model

This method saves the model weights and creates an Esri Model Definition and a Deep Learning Package zip for deployment to Image Server or ArcGIS Pro.

im.save("path_to_save_model")

Conclusion

This guide has explained how the AutoDL class can be used to automate the training of multiple deep learning models supported by the arcgis.learn API. For every step in the workflow, we introduced a method and discussed its usage. This guide can be a starting point for developers to train and evaluate the performance of multiple models supported by arcgis.learn.

For more information about the API, refer to the API reference.