Automate Road Surface Investigation Using Deep Learning

  • 🔬 Data Science
  • 🥠 Deep Learning and Object Detection

Introduction and objective

Deterioration of the road surface due to factors such as vehicle overloading, poor construction quality, ageing, natural disasters and other climatic conditions may lead to road pavement failure. This can slow traffic, cause jams and damage vehicles. It also creates problems for civic authorities, who need to accurately identify these cracks and carry out repair work. If cracks are not repaired at an early stage, the cost of repair gradually increases, placing an unnecessary burden on the exchequer.

Traditionally, inspection of the road surface is done by humans, either by visual observation or with sophisticated machines that are also expensive. The manual approach to detecting damage is not just time consuming but also unreliable, since it requires consistent help from subject matter experts who can identify and differentiate the different types of pavement failure. This is where artificial intelligence, supported by deep learning, comes to the rescue. Deep learning integrated with ArcGIS plays a crucial role by automating the process.

In this notebook, we use a labeled dataset of asphalt distress images from the 2018 IEEE Big Data Cup Challenge to train a model that detects and classifies road cracks. The training and test data consist of 9,053 photographs, collected from smartphone cameras and hand labeled with the presence or absence of 8 road damage categories [1].

The table below describes each of the 8 categories of damage type; the original dataset includes sample images for each category.

Class Name    Class Description
D00           Linear crack, longitudinal, wheel mark part
D01           Linear crack, longitudinal, construction joint part
D10           Linear crack, lateral, equal interval
D11           Linear crack, lateral, construction joint part
D20           Alligator crack
D40           Rutting, bump, pothole, separation
D43           White line blur
D44           Cross walk blur

Through this sample, we will walk you through the step-by-step process of building a robust deep learning solution to identify road pavement failures and eventually integrating it with ArcGIS as a reusable tool.

Necessary imports

Input
# Restart the kernel after installation is complete
!pip install opencv-python==4.0.1.24
Input
import pandas as pd
import os
import shutil
from pathlib import Path

from arcgis.gis import GIS
from arcgis.features import GeoAccessor
from arcgis.learn import SingleShotDetector, prepare_data

Prepare data that will be used for training

You can download the pavement cracks data from the following link: https://developers.arcgis.com/python/sample-notebooks/automate-road-surface-investigation-using-deep-learning/. Extract the downloaded file and run the code below to prepare the data in the format that deep learning models expect.

Input
# Please uncomment the following code to prepare your training data.

# input_path = Path(input("Enter the path where you extracted data: "))
# output_path = Path(input("Enter the path where you want to create training data: "))
# try:
#     # Create the 'images' and 'labels' folders if they do not already exist
#     if not os.path.exists(output_path/'images') and not os.path.exists(output_path/'labels'):
#         os.mkdir(output_path/'images')
#         os.mkdir(output_path/'labels')
# except: raise
# for fl in os.listdir(input_path):
#     if not(fl.startswith(".")):
#         for f in os.listdir(input_path/fl/'Annotations'):
#             if not(f.startswith(".")):
#                 img_name = f.split('.')[0] + '.jpg'
                
#                 shutil.copyfile(input_path/fl/'JPEGImages'/img_name, output_path/'images'/img_name)
#                 shutil.copyfile(input_path/fl/'Annotations'/f, output_path/'labels'/f)
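After the script above finishes, it can be worth verifying that every copied image has a matching label file. A minimal sanity check (kept commented out, like the script above) might look like this:

Input
# Hypothetical check: compare the number of files in the two output folders
# n_imgs = len(os.listdir(output_path/'images'))
# n_lbls = len(os.listdir(output_path/'labels'))
# print(f'{n_imgs} images, {n_lbls} label files')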

Model training

You can change the path to your own training data folder that contains the "images" and "labels" folders.

Input
gis = GIS('home')
Input
training_data = gis.content.get('9c7274bbfac343f3aef33f2dc1ff4baf')
training_data
Output
automate_road_surface_investigation_using_deep_learning
Image Collection by api_data_owner
Last Modified: August 25, 2020
0 comments, 4 views
Input
filepath = training_data.download(file_name=training_data.name)
Input
import zipfile
with zipfile.ZipFile(filepath, 'r') as zip_ref:
    zip_ref.extractall(Path(filepath).parent)
Input
data_path = Path(os.path.join(os.path.splitext(filepath)[0]))
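If you prepared your own training data with the commented script earlier, you can skip the download above and point data_path directly at your local folder containing the "images" and "labels" subfolders. The path below is a hypothetical example.

Input
# data_path = Path(r'C:\data\pavement_cracks_training')  # hypothetical local training data folder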

The prepare_data function takes the path to the training data and creates a fastai databunch with the specified transformations, batch size, validation split percentage, etc.

Input
data = prepare_data(data_path,
                    batch_size=8,
                    chip_size=500,
                    seed=42,
                    dataset_type='PASCAL_VOC_rectangles')

We can use the classes attribute of the data object to get the list of classes in the training data.

Input
data.classes 
Output
['background', 'D00', 'D01', 'D10', 'D11', 'D20', 'D30', 'D40', 'D43', 'D44']

Visualize training data

To get a sense of what the training data looks like, the arcgis.learn.show_batch() method randomly picks a few training chips and visualizes them.

Input
data.show_batch(rows=2)

Load model architecture

arcgis.learn provides the SingleShotDetector (SSD) model for object detection tasks. It is based on a pretrained convnet, such as ResNet, that acts as the 'backbone'. More details about SSD can be found here.

We will use the SingleShotDetector to train the damage detection model with resnet101 as the backbone.

Input
ssd = SingleShotDetector(data, backbone='resnet101', focal_loss=True)

Let us have a look at the results of the untrained model.

Input
ssd.show_results(thresh=0.2)

We see that the untrained model detects road cracks essentially at random. To give good results, the model needs to be trained.

Learning rate is one of the most important hyperparameters in model training. We will use the lr_find() method to find an optimum learning rate at which we can train a robust model fast enough.

Input
lr = ssd.lr_find()
lr
Output
0.0008317637711026709

Train a model

Based on the suggested learning rate above, we will train our model for 30 epochs for the sake of time.

Input
ssd.fit(30, lr=lr)
epoch train_loss valid_loss time
0 3.916756 3.534273 07:47
1 2.189290 1.960024 07:48
2 1.913575 1.737066 07:45
3 1.727438 1.574788 07:44
4 1.587650 2.134769 07:31
5 1.508131 1.415902 07:46
6 1.400807 5.037269 07:48
7 1.382145 1.719041 07:48
8 1.375488 4.048904 07:43
9 1.303755 1.848563 07:50
10 1.280773 1.222865 07:44
11 1.252260 1.214416 07:47
12 1.217753 1.236139 07:30
13 1.239035 1.161670 07:39
14 1.237716 1.127153 07:29
15 1.147980 1.103687 07:47
16 1.161228 1.105242 07:43
17 1.159945 1.075735 07:45
18 1.071214 1.058415 07:44
19 1.093338 1.065908 07:31
20 1.099237 1.042938 07:45
21 1.114819 1.041307 05:45
22 1.060352 1.031148 04:09
23 1.021770 1.024204 04:10
24 1.056092 1.101342 04:09
25 1.022077 1.014639 04:10
26 1.018347 1.020852 04:10
27 1.035899 1.017190 04:10
28 1.017030 1.005037 04:10
29 1.007083 1.005612 04:12

The graph below plots training and validation losses.

Input
ssd.learn.recorder.plot_losses()

The average_precision_score method computes the average precision on the validation set for each class.

Input
ssd.average_precision_score() 
Output
{'D00': 0.5585359730352724,
 'D01': 0.7302843487881194,
 'D10': 0.2577634234076642,
 'D11': 0.14445632230490446,
 'D20': 0.7623061137618858,
 'D30': 0.0,
 'D40': 0.16982323703158553,
 'D43': 0.813513353090408,
 'D44': 0.6490994066172426}

We can see the model accuracy for each class of our validation data. The results vary across classes. Let us dig deeper to find the reason the model performs better on some classes than on others. This will also help us understand why the D30 class has an average precision score of zero.

Input
# Count the number of instances of each class in the training data
all_classes = []
for i, bb in enumerate(data.train_ds.y):
    all_classes += bb.data[1].tolist()
    
df = pd.value_counts(all_classes, sort=False)
df.index = [data.classes[i] for i in df.index] 
df   
Output
D43     753
D00    2477
D44    3369
D01    3418
D10     677
D11     574
D20    2290
D30      22
D40     369
dtype: int64

We have only 22 training examples of class D30, which is very few. This is why the model scores poorly on this specific class.
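To see this relationship at a glance, we can optionally plot the per-class average precision next to the per-class training counts computed above. This is a small sketch that assumes the dictionary returned by average_precision_score() is stored in a variable; it is not part of the original workflow.

Input
import matplotlib.pyplot as plt

ap = ssd.average_precision_score()              # per-class average precision
counts = df.reindex(list(ap.keys())).fillna(0)  # per-class training counts computed above

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.bar(list(ap.keys()), list(ap.values()))
ax1.set_title('Average precision per class')
ax2.bar(counts.index, counts.values)
ax2.set_title('Training instances per class')
plt.show()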

Detect and visualize pavement cracks in validation set

Input
ssd.show_results(rows=10, thresh=0.2, nms_overlap=0.5)

Save the model

As we can see, with 30 epochs we are already seeing reasonable results. Further improvement can be achieved through more sophisticated hyperparameter tuning. Let's save the model for further training or inference later. By default, the model is saved into a 'models' folder under the data_path that you specified at the beginning of this notebook.

Input
ssd.save(str(data_path / 'pavement-cracks-model-resnet101'))
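The saved model can be loaded back later for inference without retraining, for example with SingleShotDetector.from_model(). The .emd path below is an assumption based on the save location used in the previous cell.

Input
# ssd = SingleShotDetector.from_model(
#     str(data_path / 'pavement-cracks-model-resnet101' / 'pavement-cracks-model-resnet101.emd'))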

Model inference

We will do model inference using two methods: predict and predict_video. Let's get the data required to predict on an image and a video.

Input
inference_data = gis.content.get('92a75cec191e4dbbb53067761287b977')
inference_data
Output
pavement_cracks_data_inference
Image Collection by api_data_owner
Last Modified: November 23, 2020
0 comments, 0 views
Input
inf_data_path = inference_data.download(file_name=inference_data.name)
Input
import zipfile
with zipfile.ZipFile(inf_data_path, 'r') as zip_ref:
    zip_ref.extractall(Path(inf_data_path).parent)
Input
img_file = os.path.join(os.path.splitext(inf_data_path)[0], 'test_img.jpg')
video_file = os.path.join(os.path.splitext(inf_data_path)[0], 'test_video.mp4')
metadata_file = os.path.join(os.path.splitext(inf_data_path)[0], 'metadata.csv')

Detecting pavement cracks on an image

Input
bbox_data = ssd.predict(img_file, threshold=0.1, visualize=True)

Detecting pavement cracks from video feed

Input
ssd.predict_video(input_video_path=video_file, 
                  metadata_file=metadata_file, 
                  visualize=True, 
                  resize=True)
100.00% [11295/11295 13:43<00:00]
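If you also want a copy of the video with the detections drawn on each frame, predict_video can multiplex the predictions into an output video. This is an optional sketch; 'test_video_output.mp4' is a hypothetical file name.

Input
# ssd.predict_video(input_video_path=video_file,
#                   metadata_file=metadata_file,
#                   multiplex=True,
#                   multiplex_file_path=os.path.join(os.path.dirname(video_file), 'test_video_output.mp4'))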

Publish results to your GIS

The predict_video function also updates the metadata file (provided in CSV format) with the detections at each frame. We will now read this CSV using pandas and publish it as a layer on our GIS.

Input
import pandas as pd
df = pd.read_csv(metadata_file)
df
Output
UNIX Time Stamp Sensor Latitude Sensor Longitude Sensor True Altitude Frame Center Latitude Frame Center Longitude Frame Center Elevation vmtilocaldataset
0 1.564889e+15 28.412995 77.162906 294.736633 28.412995 77.162906 294.736633 \n
1 1.564889e+15 28.412995 77.162904 294.729656 28.412995 77.162904 294.729656 \n
2 1.564889e+15 28.412996 77.162902 294.722680 28.412996 77.162902 294.722680 \n
3 1.564889e+15 28.412997 77.162901 294.715703 28.412997 77.162901 294.715703 \n
4 1.564889e+15 28.412997 77.162899 294.708546 28.412997 77.162899 294.708546 \n
... ... ... ... ... ... ... ... ...
1794 1.564889e+15 28.414831 77.159583 278.085715 28.414831 77.159583 278.085715 NaN
1795 1.564889e+15 28.414831 77.159581 278.076278 28.414831 77.159581 278.076278 NaN
1796 1.564889e+15 28.414830 77.159578 278.066841 28.414830 77.159578 278.066841 NaN
1797 1.564889e+15 28.414830 77.159576 278.057561 28.414830 77.159576 278.057561 NaN
1798 1.564889e+15 28.414829 77.159574 278.047585 28.414829 77.159574 278.047585 NaN

1799 rows x 8 columns

The code below removes rows from the DataFrame with no detections and also creates a new column that contains the count of detections at each frame.

Input
# Handle NaN and '\n' values in the detections column
df.vmtilocaldataset = df.vmtilocaldataset.str.strip()
# Mark frames with no detections as missing
df.loc[df.vmtilocaldataset == '', 'vmtilocaldataset'] = None
# Derive the number of detections per frame from the ';'-separated detection string
df['count'] = (df['vmtilocaldataset'].str.split(';').str.len().fillna(1) - 1)
Input
# Group frames into approximately 1-second bins (the video is 60 frames per second)
fps = 60
a = (pd.Series(df.index.values) / fps)
a = (a - .49).round().abs()
df['group'] = a
Input
# Get index of row with max detections in each group
max_detection_idxes = df[['group', 'count']].groupby('group').idxmax()['count'].values
Input
# Extract rows for the indexes
df_flt = df.iloc[max_detection_idxes]
Input
# Keep only rows that have at least one detection
df_flt = df_flt[df_flt['count'] != 0]
Input
sdf = GeoAccessor.from_xy(df_flt, 'Sensor Longitude', 'Sensor Latitude')
Input
cracks_lyr = gis.content.import_data(sdf, title='crack points')
Input
cracks_lyr
Output
crack points
Feature Layer Collection by demos_deldev
Last Modified: November 20, 2019
0 comments, 0 views
Input
m1 = gis.map('Haryana, India')
m1
Output
Input
m1.basemap = "satellite"
Input
m1.add_layer(cracks_lyr, {"renderer":"ClassedSizeRenderer",
                                      "field_name": "count_"})

Conclusion

In this notebook, we learned how civic authorities can automate road surface investigation using deep learning in order to make policy decisions. This will not only help in repairing existing cracks but may also prevent pavement failures in the future.

References

[1] Hiroya Maeda, et al. "Road Damage Detection Using Deep Neural Networks with Images Captured Through a Smartphone", arXiv:1801.09454, 2018.
