How AutoDL Works
This guide explains the steps for the training and evaluation of multiple network architectures supported by the arcgis.learn API. The arcgis.learn API currently supports more than 30 deep learning networks for object detection. The API provides 4 deep learning networks, along with the MMDetection class, which supports more than twenty object detection networks. Similarly, for pixel classification, the API supports 11 deep learning networks, along with the MMSegmentation class, which supports more than twenty pixel classification networks.
To train a deep learning network using the arcgis.learn API, you must follow the complete pipeline, which involves data preprocessing, network selection, hyperparameter tuning, and evaluation of the network based on its performance.
Because it can be difficult to run and compare all of the different networks iteratively, the AutoDL class automatically trains all of the supported networks on the given data within a provided time limit and reports a performance tally for all of them. The AutoDL class also saves every network during the training process, so each one can be fine-tuned later to improve its performance.
Let's see how the AutoDL class works. Start by importing the required classes and functions:
from arcgis.learn import AutoDL, prepare_data, ImageryModel
Prepare the data for the AutoDL class using the prepare_data function:
data = prepare_data("path_to_data_folder", batch_size=2)
The AutoDL class accepts the following parameters:
data (Required Parameter): The data object returned from the prepare_data function in the previous step.
total_time_limit (Optional Parameter): The total time limit, in hours, for AutoDL to train and evaluate the networks. This parameter matters most when time is the user's main constraint. The AutoDL class calculates the number of chips from the prepared databunch that can be processed in the given time.
mode (Optional Parameter): Can be "basic" or "advanced".
- basic: To be used when the user wants to train all selected networks.
- advanced: To be used when the user also wants to tune the hyperparameters of the two best performing models from the basic mode.
networks (Optional Parameter): The list of models that will be used in the training process. If the user does not provide a value, the AutoDL class selects all of the supported networks. The user can instead select one or more networks by passing their names in a list.
- Supported Object Detection models
- SingleShotDetector, RetinaNet, FasterRCNN, YOLOv3, MMDetection
- Supported Pixel Classification models
- DeepLab, UnetClassifier, PSPNetClassifier, MMSegmentation
verbose (Optional Parameter): Optional Boolean. Displays logs while the networks are being trained. The logs show progress over time and, if a failure occurs, can be used to check which network failed, when, and why.
- Basic mode: In this mode, AutoDL iterates through all of the supported networks once with the default backbone, trains each one with the passed data, and calculates its performance. At the end of each iteration, the model is saved to disk. Each network is trained for a maximum of 20 epochs. However, if the time remaining for a network is less than the expected time (the minimum time required to train the network), the program automatically calculates the maximum number of epochs that can be run.
- Advanced mode: To be used when the user wants to tune the hyperparameters of the two best performing networks from the basic mode. This mode divides the total time into two halves. In the first half, it works like basic mode and iterates through all of the supported networks once. In the second half, it identifies the two best performing networks and trains them with the different supported backbones. At the end of each iteration, the model is saved to disk. Each network is trained for a maximum of 20 epochs. However, if the time remaining for a network is less than the expected time (the minimum time required to train the network), the program automatically calculates the number of epochs that can be run.
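The epoch rule and the time split described for the two modes can be sketched in plain Python. This is only an illustration under stated assumptions: the function names and the `hours_per_epoch` estimate are hypothetical, and the actual calculation inside AutoDL may differ.

```python
import math

MAX_EPOCHS = 20  # per-network cap stated in this guide


def epochs_within_budget(remaining_hours, hours_per_epoch, max_epochs=MAX_EPOCHS):
    """Return how many epochs fit in the remaining time, capped at max_epochs.

    hours_per_epoch is a hypothetical estimate supplied by the caller;
    AutoDL derives its own timing internally.
    """
    if hours_per_epoch <= 0:
        raise ValueError("hours_per_epoch must be positive")
    affordable = math.floor(remaining_hours / hours_per_epoch)
    return max(0, min(max_epochs, affordable))


def split_advanced_budget(total_hours):
    """Advanced mode: one half for the basic sweep, one half for backbone tuning."""
    half = total_hours / 2
    return half, half


sweep_hours, tuning_hours = split_advanced_budget(5)
print(sweep_hours, tuning_hours)

# Plenty of time left: the 20-epoch cap applies.
print(epochs_within_budget(4.0, 0.125))

# Little time left: fewer epochs are scheduled.
print(epochs_within_budget(0.5, 0.125))
```

The cap keeps a single network from consuming the whole budget, while the floor division ensures that a network is never scheduled for more epochs than the remaining time allows.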
dl = AutoDL(data, total_time_limit=5, verbose=True, mode="advanced")
When the AutoDL class is initialized, it calculates the number of images that can be processed in the given time and the time required to process all of the data. The output of the cell above can then be used to review and update the networks parameter when initializing the class.
Here is an example of the output:
- Given time to process the dataset is: 5.0 hours
- Number of images that can be processed in the given time: 290
- Time required to process the entire dataset of 3000 images is 52 hours
This output shows how many images can be processed to train all of the selected networks in the selected mode within the given time. It also estimates how long AutoDL would take to train all of the selected networks on the entire dataset.
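The relationship between the figures in the example output can be approximated with a simple proportional estimate. The sketch below is an assumption about how such a number could be derived, not AutoDL's actual calculation. The helper name `images_within_limit` is hypothetical. With the values from the example output it yields 288, close to the reported 290, which suggests that AutoDL rounds or measures time somewhat differently.

```python
import math


def images_within_limit(total_images, hours_for_all, time_limit_hours):
    """Proportional estimate of how many images fit in the time limit."""
    if hours_for_all <= 0:
        raise ValueError("hours_for_all must be positive")
    if time_limit_hours >= hours_for_all:
        # The whole dataset fits in the budget.
        return total_images
    return math.floor(total_images * time_limit_hours / hours_for_all)


# Values taken from the example output above.
estimate = images_within_limit(total_images=3000, hours_for_all=52, time_limit_hours=5)
print(estimate)
```

If the estimate is much smaller than the dataset, either raise the time limit or pass a shorter list of networks.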
The output of this function will be a list of pixel classification models supported by the AutoDL class.
The output of this function will be a list of object detection models supported by the AutoDL class.
The fit method is used to train all of the selected networks automatically within the provided time limit.
This method returns an evaluation report as a dataframe. Its fields include the model's accuracy, the training and validation loss, the Dice score (for pixel classification), the learning rate used to train the model, the training time, and the backbone.
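A report of this kind can be ranked to find the strongest candidates for further tuning. The sketch below uses plain dictionaries instead of a dataframe so that it runs without extra packages. The column names (`model`, `accuracy`, `valid_loss`) and all numeric values are hypothetical placeholders, not real results or the actual column names returned by AutoDL.

```python
# Placeholder rows shaped like the report described above. The numbers are
# invented purely for illustration and are not benchmark results.
report_rows = [
    {"model": "PSPNetClassifier", "accuracy": 0.85, "valid_loss": 0.41},
    {"model": "DeepLab", "accuracy": 0.90, "valid_loss": 0.30},
    {"model": "UnetClassifier", "accuracy": 0.88, "valid_loss": 0.34},
]


def top_networks(rows, metric="accuracy", k=2):
    """Return the names of the k rows with the highest value of metric."""
    ranked = sorted(rows, key=lambda row: row[metric], reverse=True)
    return [row["model"] for row in ranked[:k]]


print(top_networks(report_rows))
```

With a real dataframe, the same ranking could be done by sorting on the relevant column; check the actual column names in the returned report first.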
This method will display the results for the best performing model.
MIOU is the mean intersection over union (IoU). This method calculates the mean IoU on the validation set for each class. It is supported only by pixel classification models.
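To make the metric concrete, here is a minimal pure-Python sketch of per-class IoU and its mean over flattened label sequences. It illustrates the definition only. The function name `mean_iou` is hypothetical, and the arcgis.learn implementation may differ, for example in how it treats classes that are absent from the data.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Return (per-class IoU list, mean IoU) for two equal-length label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for t, p in pairs if t == cls and p == cls)
        union = sum(1 for t, p in pairs if t == cls or p == cls)
        if union:  # skip classes that appear in neither sequence
            ious.append(intersection / union)
    mean = sum(ious) / len(ious) if ious else 0.0
    return ious, mean


# Six pixels, three classes.
truth = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
per_class, miou = mean_iou(truth, predicted, num_classes=3)
print(per_class, miou)
```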
This method computes the average precision on the validation set for each class. It is supported only by object detection models.
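As a rough illustration of what a per-class average precision measures, the sketch below computes a simple non-interpolated AP from detections that have already been ranked by confidence and matched against ground truth. Both function names are hypothetical, the matching step (typically an IoU threshold) is assumed to have been done beforehand, and arcgis.learn may use a different variant of the metric.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Non-interpolated AP for one class.

    is_true_positive: booleans for detections sorted by descending confidence.
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    hits = 0
    precision_sum = 0.0
    for rank, is_tp in enumerate(is_true_positive, start=1):
        if is_tp:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / num_ground_truth


def per_class_ap(flags_by_class, ground_truth_counts):
    """Apply average_precision to every class in flags_by_class."""
    return {
        cls: average_precision(flags, ground_truth_counts[cls])
        for cls, flags in flags_by_class.items()
    }


# Hypothetical class with three ranked detections and two ground-truth objects.
scores = per_class_ap({"building": [True, False, True]}, {"building": 2})
print(scores)
```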
Once the best performing network is identified, it can be fine-tuned further using the ImageryModel class. This class supports methods that can be used to load, fine-tune, and save the model for further use.
im = ImageryModel()
- The load method is used to load a model that was saved to disk by the AutoDL class. It accepts the following parameters:
- path: Path to the Esri Model Definition (.emd) file
- data: The data object returned from the prepare_data function
The lr_find method runs the Learning Rate Finder, which helps in choosing the optimum learning rate for training the model.
The loaded model can be trained further using the fit method. This method trains the model for the specified number of epochs while using the specified learning rates.
This method saves the model weights and creates an Esri Model Definition and a Deep Learning Package zip for deployment to Image Server or ArcGIS Pro.
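The load, lr_find, fit, and save steps described above can be combined into one helper. This is a hedged sketch, not the documented workflow: the function `fine_tune_saved_model` and its arguments are hypothetical, the positional argument order for `load`, `fit`, and `save` follows the descriptions in this guide and should be checked against the API reference, and `lr_find` is assumed to return a suggested learning rate. The arcgis import is placed inside the function so that the snippet can be loaded on a machine without arcgis installed.

```python
def fine_tune_saved_model(emd_path, data_folder, epochs, output_name):
    """Load a model saved by AutoDL, train it further, and save it again.

    All four arguments are supplied by the caller; nothing here is a default
    recommended by the library.
    """
    # Imported lazily so that defining this function does not require arcgis.
    from arcgis.learn import ImageryModel, prepare_data

    data = prepare_data(data_folder, batch_size=2)

    model = ImageryModel()
    model.load(emd_path, data)   # .emd file plus the prepared data object
    lr = model.lr_find()         # assumed to return a suggested learning rate
    model.fit(epochs, lr)        # continue training from the saved weights
    model.save(output_name)      # writes weights, model definition, and package
    return model
```

A call would look like `fine_tune_saved_model("path_to_emd_file", "path_to_data_folder", 10, "fine_tuned_model")`, where every argument value is a placeholder.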
This guide has explained how the AutoDL class can be used to automate the training of multiple deep learning models supported by the arcgis.learn API. For every step in the workflow, we defined a function and discussed its usage. This guide can be a starting point for developers who want to train multiple arcgis.learn supported models and evaluate their performance.
For more information about the API, refer to the API reference.