Object Tracking with arcgis.learn

Object tracking is the process of:

  • Taking an initial set of object detections (such as an input set of bounding box coordinates)
  • Creating a unique ID for each of the initial detections
  • Tracking each object as it moves across the frames of a video, while maintaining its unique ID

Multiple-object tracking can be performed using the predict_video function of the arcgis.learn module.


  • Please refer to the prerequisites section in our guide for more information. This sample demonstrates how to do object tracking using arcgis.learn.
  • Please refer to the guide to understand how object detection works.

How Object Tracking Works

Object tracking in arcgis.learn is based on the SORT (Simple Online and Realtime Tracking) algorithm, which combines Kalman filtering with the Hungarian assignment algorithm.

The Kalman filter is used to estimate the position of a tracker, while the Hungarian algorithm is used to assign trackers to new detections. The following sections briefly describe both.

Kalman Filter

Kalman filtering uses a series of measurements observed over time and produces estimates of unknown variables by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, one of the primary developers of its theory.

Our state contains 8 variables, (u, v, a, h, u’, v’, a’, h’), where (u, v) is the centre of the bounding box, a is its aspect ratio, and h is its height. The remaining four variables are the respective velocities of the first four.
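As a minimal sketch of how such a state could be initialised, the snippet below converts a corner-format box into the 8 variables. The helper name box_to_state is hypothetical, h is taken to be the bounding box height, the aspect ratio is assumed to be width divided by height, and starting the velocities at zero is an assumption for a track with no motion history.

```python
def box_to_state(x_min, y_min, x_max, y_max):
    """Convert a corner-format box into (u, v, a, h, u', v', a', h')."""
    width = float(x_max - x_min)
    height = float(y_max - y_min)
    if width <= 0 or height <= 0:
        raise ValueError("box must have positive width and height")
    u = x_min + width / 2.0   # centre x
    v = y_min + height / 2.0  # centre y
    a = width / height        # aspect ratio (width / height is an assumption)
    h = height                # box height
    # A new track has no motion history, so all four velocities start at zero.
    return [u, v, a, h, 0.0, 0.0, 0.0, 0.0]


state = box_to_state(100, 50, 180, 90)
print(state)  # centre (140, 70), aspect ratio 2, height 40, zero velocities
```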

A Kalman filter is maintained for every bounding box, so it comes into play after a box has been matched with a tracker. When the association is made, the predict and update functions are called.


The prediction step is a matrix multiplication that estimates the position of the bounding box at time t from its position at time t-1.


The update phase is a correction step. It incorporates the new measurement from the object detection model and helps improve the filter's estimate.
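The two steps can be sketched with NumPy as a constant-velocity Kalman filter over the 8-variable state. This is an illustrative sketch, not the arcgis.learn implementation: the matrices F, H, Q and R, the noise magnitudes, and the one-frame time step are all assumptions chosen for readability.

```python
import numpy as np

DIM = 4   # measured values: u, v, a, h
DT = 1.0  # one frame between consecutive steps (assumption)

# State transition F: each measured value advances by its velocity times DT.
F = np.eye(2 * DIM)
F[:DIM, DIM:] = DT * np.eye(DIM)

# Measurement matrix H: the detector observes (u, v, a, h) but not velocities.
H = np.eye(DIM, 2 * DIM)

# Noise covariances; the magnitudes are illustrative assumptions.
Q = 0.01 * np.eye(2 * DIM)  # process noise
R = 1.0 * np.eye(DIM)       # measurement noise


def predict(x, P):
    """Project the state and its covariance from frame t-1 to frame t."""
    x = F @ x
    P = F @ P @ F.T + Q
    return x, P


def update(x, P, z):
    """Correct the predicted state with a new detection z = (u, v, a, h)."""
    y = z - H @ x                   # innovation: measurement minus prediction
    S = H @ P @ H.T + R             # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
    x = x + K @ y
    P = (np.eye(2 * DIM) - K @ H) @ P
    return x, P


# A box centred at (140, 70) moving 5 pixels per frame to the right.
x = np.array([140.0, 70.0, 2.0, 40.0, 5.0, 0.0, 0.0, 0.0])
P = np.eye(2 * DIM)

x_pred, P_pred = predict(x, P)  # predicted centre moves to u = 145
x_new, P_new = update(x_pred, P_pred, np.array([146.0, 70.0, 2.0, 40.0]))
print(x_pred[:4])
print(x_new[:4])  # u is pulled from 145 towards the measured 146
```

The corrected estimate lands between the prediction and the measurement, weighted by the Kalman gain, and the covariance shrinks after the update because a measurement adds information.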

Hungarian Assignment Algorithm

The Hungarian algorithm, also known as the Kuhn-Munkres algorithm, can associate an object in one frame with an object in the next, based on a score such as Intersection over Union (IoU).
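For reference, IoU is the area of overlap between two boxes divided by the area of their union. The following is a small sketch, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples; the function name iou is chosen here for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped to zero for disjoint boxes.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10 x 10 boxes offset by 5 pixels: overlap 50, union 150.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```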

We iterate through the list of trackers and detections and assign a tracker to each detection on the basis of IoU scores.
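One way to sketch this assignment is with SciPy's linear_sum_assignment, which solves the same assignment problem as the Hungarian algorithm. The helper names iou_matrix and match, the corner box format, and the 0.3 threshold are assumptions made for this example and are not taken from arcgis.learn.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou_matrix(trackers, detections):
    """Pairwise IoU between two lists of (x_min, y_min, x_max, y_max) boxes."""
    t = np.asarray(trackers, dtype=float)[:, None, :]    # shape (T, 1, 4)
    d = np.asarray(detections, dtype=float)[None, :, :]  # shape (1, D, 4)
    inter_w = np.clip(
        np.minimum(t[..., 2], d[..., 2]) - np.maximum(t[..., 0], d[..., 0]), 0, None
    )
    inter_h = np.clip(
        np.minimum(t[..., 3], d[..., 3]) - np.maximum(t[..., 1], d[..., 1]), 0, None
    )
    inter = inter_w * inter_h
    area_t = (t[..., 2] - t[..., 0]) * (t[..., 3] - t[..., 1])
    area_d = (d[..., 2] - d[..., 0]) * (d[..., 3] - d[..., 1])
    return inter / (area_t + area_d - inter)  # assumes boxes have positive area


def match(trackers, detections, iou_threshold=0.3):
    """Return (tracker_index, detection_index) pairs that maximise total IoU."""
    if len(trackers) == 0 or len(detections) == 0:
        return []
    scores = iou_matrix(trackers, detections)
    rows, cols = linear_sum_assignment(scores, maximize=True)
    # Drop pairs whose overlap is too small to be the same object.
    return [(int(r), int(c)) for r, c in zip(rows, cols) if scores[r, c] >= iou_threshold]


trackers = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 0, 11, 10), (50, 50, 60, 60)]
print(match(trackers, detections))  # → [(0, 1), (1, 0)]
```

In this example the third detection overlaps no tracker, so it stays unmatched; a tracker would typically treat such a detection as a candidate for a new track.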

The general process is to detect objects using an object detection algorithm, match the resulting bounding boxes with the existing ones using the Hungarian algorithm, and then use Kalman filters to predict future bounding box positions or estimate the actual ones.

Track Objects Using arcgis.learn

Multiple-object tracking can be performed using the predict_video function of the arcgis.learn module. To enable tracking, set track=True when calling predict_video.
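A call with tracking enabled might look like the sketch below. The wrapper track_video is a hypothetical helper. Apart from track, the keyword names (input_video_path, metadata_file, tracker_options) are assumptions that should be checked against the API reference for your installed version, and the tracker values are illustrative settings for the options described in this section.

```python
def track_video(model, video_path, metadata_path):
    """Run detection with tracking enabled on a video file.

    `model` is assumed to be a trained arcgis.learn detector (for example a
    RetinaNet) that exposes predict_video.
    """
    # Illustrative values; tune them for your footage.
    tracker_options = {
        "assignment_iou_thrd": 0.3,
        "vanish_frames": 40,
        "detect_frames": 10,
    }
    # Keyword names other than `track` are assumptions (see the note above).
    return model.predict_video(
        input_video_path=video_path,
        metadata_file=metadata_path,
        track=True,
        tracker_options=tracker_options,
    )
```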

The following parameters of the predict_video function control the tracker:

  • vanish_frames: the number of frames an object must remain absent from the video before it is considered to have vanished.

  • detect_frames: the number of frames an object must remain present before tracking starts.

  • assignment_iou_thrd: several trackers may be detecting and tracking objects at once. This Intersection over Union (IoU) threshold sets the minimum overlap required to assign a tracker to a detection.
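To make the roles of vanish_frames and detect_frames concrete, here is a minimal sketch of a track's lifecycle. The Track class, its counters, and the exact comparisons (>= and >) are assumptions for illustration; the source does not specify how arcgis.learn counts frames internally.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    hits: int = 0            # frames in which a detection was matched
    misses: int = 0          # consecutive frames without a matched detection
    confirmed: bool = False  # True once the object has been seen long enough

    def mark_matched(self, detect_frames):
        """Record a matched frame and confirm the track once it is stable."""
        self.hits += 1
        self.misses = 0
        if self.hits >= detect_frames:
            self.confirmed = True

    def mark_missed(self):
        """Record a frame in which no detection was assigned to this track."""
        self.misses += 1

    def has_vanished(self, vanish_frames):
        """Report whether the object has been absent for too long."""
        return self.misses > vanish_frames


track = Track(track_id=1)
for _ in range(3):
    track.mark_matched(detect_frames=3)
print(track.confirmed)  # → True

for _ in range(6):
    track.mark_missed()
print(track.has_vanished(vanish_frames=5))  # → True
```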

Vehicle Tracking Example

The following video was created using the predict_video() function of a RetinaNet model from arcgis.learn.

The data is collected from a lamp post in Berlin.

In [6]:
from IPython.display import HTML
HTML("""
    <video alt="test" controls>
        <source src="data/test_predictions.mp4" type="video/mp4">
    </video>
""")