Training a deep learning model is a long, iterative process, so it is important to have a tool for visualizing training progress and monitoring the learning process.
TensorBoard is an open-source toolkit that helps us understand training progress and improve model performance by tuning the hyperparameters.
The TensorBoard toolkit displays a dashboard where logs can be visualized as graphs, images, histograms, embeddings, text, and more. It also helps track information such as gradients, losses, metrics, and intermediate outputs [1, 2].
The arcgis.learn module integrates the TensorBoard toolkit into the model training process, which makes it possible to monitor training as it runs. In this guide, we will learn how model training can be monitored using TensorBoard.
TensorBoard is supported in ArcGIS API for Python version 1.8.3 and later.
The specific Python libraries mentioned below need to be installed in your deep learning environment.
pip install tensorboard==2.2.1
pip install tensorboardX==2.1
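Before training, it can be useful to confirm that the environment actually has these versions installed. The following is a minimal sketch, not part of arcgis.learn: the helper name check_requirements is hypothetical, and the code assumes Python 3.8 or later, where importlib.metadata is part of the standard library.

```python
from importlib import metadata

# Versions taken from the prerequisite list above.
REQUIRED = {"tensorboard": "2.2.1", "tensorboardX": "2.1"}


def check_requirements(required):
    """Return {package: (installed_version_or_None, matches_required)}.

    Hypothetical helper for illustration; it only inspects installed
    package metadata and does not import the packages themselves.
    """
    report = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


for name, (installed, ok) in check_requirements(REQUIRED).items():
    status = "OK" if ok else "missing or different version"
    print(f"{name}: required {REQUIRED[name]}, installed {installed} ({status})")
```

A package reported as missing here corresponds to the case in which training continues but TensorBoard logging is unavailable.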
The arcgis.learn module currently supports TensorBoard for the following models:
from arcgis.learn import UnetClassifier, prepare_data
data_path = r'training_data'
data = prepare_data(data_path, batch_size=4)
# Choose the model you want to use for training from the above mentioned list
unet_model = UnetClassifier(data)
After instantiating the model object, we train the model using the model.fit() method. With the tensorboard parameter set to True, we can train the model for a specified number of epochs while also visualizing the training in TensorBoard. By default, the tensorboard parameter is set to False.
unet_model.fit(2, lr=0.0001, tensorboard=True)
Monitor training on Tensorboard using the following command: 'tensorboard --host=DELDEVAL047 --logdir="C:\Users\Karthik\Desktop\Base\Tensorboard\Kent_LULC\training_log"'
When the tensorboard parameter is enabled, the command needed to access TensorBoard is printed, as shown above. If the libraries listed in the Prerequisite are not installed, model training still continues, but a warning message prompts the user to install the required libraries.
To view TensorBoard in your default web browser, run the command printed during the training phase in an Anaconda prompt. TensorBoard then prints a message containing the URL at which it is being served.
TensorBoard can be run on a different port by passing the required port number in the command (for example, --port=8008). The default port is 6006.
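As an illustration of how the host, port, and log directory options fit together, the sketch below assembles a command string of the same shape as the one printed during training. The function build_tensorboard_command is a hypothetical helper written for this guide, not part of arcgis.learn or TensorBoard, and the log directory shown is a placeholder; only the --host, --port, and --logdir flags themselves come from the TensorBoard command line.

```python
def build_tensorboard_command(logdir, host=None, port=None):
    """Assemble a tensorboard command line as a single string.

    Hypothetical helper for illustration. The log directory is wrapped in
    double quotes so that paths containing spaces survive in a Windows
    command prompt.
    """
    parts = ["tensorboard"]
    if host:
        parts.append(f"--host={host}")
    if port is not None:
        parts.append(f"--port={int(port)}")  # TensorBoard defaults to 6006 when omitted
    parts.append(f'--logdir="{logdir}"')
    return " ".join(parts)


# Placeholder log directory; use the path printed by your own training run.
print(build_tensorboard_command("training_log", host="DELDEVAL047", port=8008))
# prints: tensorboard --host=DELDEVAL047 --port=8008 --logdir="training_log"
```

Leaving out the port argument produces a command that falls back to the default port.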
TensorBoard is now accessible from any web browser at the URL that is printed when the tensorboard command is executed.
- In the 'SCALARS' tab, graphs of the various metrics and statistics can be visualized.
- In the 'IMAGES' tab, the intermediate outputs of the model are displayed. Using this feature, we can visually compare the model outputs across different epochs and across different runs of the model.
This can be done while training is still in progress, because the graphs and images are updated at the end of each epoch rather than after the entire training process has completed.
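One way to confirm that logs are being written while training is still running is to look for event files in the log directory. The sketch below assumes that the training log folder contains standard TensorBoard event files, whose names begin with events.out.tfevents; the helper list_event_files is hypothetical and is not part of arcgis.learn.

```python
from pathlib import Path


def list_event_files(logdir):
    """Return (path, size_in_bytes) pairs for event files under logdir.

    Hypothetical helper for illustration. Searches recursively and returns
    the most recently modified files first.
    """
    files = [p for p in Path(logdir).rglob("events.out.tfevents.*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p, p.stat().st_size) for p in files]


# Placeholder directory; use the --logdir path printed by your training run.
for path, size in list_event_files("training_log"):
    print(f"{path} ({size} bytes)")
```

Running this between epochs and seeing the file size grow suggests that new scalars and images are being logged.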