
Export Training Data For Deep Learning



The ExportTrainingDataforDeepLearning operation generates training sample image chips from input imagery, using labeled vector data or classified images. The output of this service tool is the data store string where the output image chips, labels, and metadata files will be stored.

Request parameters



The input source imagery, given as a portal item ID, image service URL, cloud raster dataset, or shared raster dataset. At least one type of input needs to be provided in the JSON object. If multiple inputs are given, the itemId takes priority.

Syntax: A JSON object that describes the input raster.


{"itemId": <portal item id>}
{"url": <image service url>}
   "serviceUrl":"https://<server name>/server/rest/services/Hosted/testrasteranalysis/ImageServer"},
   "itemProperties":{"itemId":"8cfbd3ec25584d0d8fed23b8ff7c43b", "folderId":"sdfwerfbd3ec25584d0d8f4"}



The output location for the training sample data. It can be an output folder name, a path on the file share raster data store, or a shared file system path.

Output folder name example:


File share raster store path example:


File share path example:




Labeled data, either a feature service or image service. Vector inputs should follow a training sample format as generated by the ArcGIS Pro Training Sample Manager, whereas raster inputs should follow a classified raster format as generated by the Classify Raster tool.


{"itemId": <portal item id>}
{"url": <image or feature service url>}
   "serviceUrl":"https://<server name>/server/rest/services/Hosted/testrasteranalysis/ImageServer"},
   "itemProperties":{"itemId":"8cfbd3ec25584d0d8fed23b8ff7c43b", "folderId":"sdfwerfbd3ec25584d0d8f4"}


The raster format for the image chip outputs.

Values: TIFF | PNG | JPEG | MRF (Meta Raster Format)




The size of the image chips.


{"x": 256, "y": 256}


The distance to move in the x and y directions when creating the next image chip. When the stride is equal to the tile size, there is no overlap between adjacent chips. When the stride is equal to half of the tile size, there is 50 percent overlap.


{"x": 128, "y": 128}
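The relationship between tile size and stride is simple arithmetic, sketched below for illustration only. The function names `overlap_fraction` and `chips_along_axis` are hypothetical and not part of the service. The chip count assumes chips are laid out from the origin and that partial chips at the edge are dropped, which the service may handle differently.

```python
def overlap_fraction(tile: int, stride: int) -> float:
    """Fraction of a chip shared with the next chip along one axis."""
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    # A stride larger than the tile leaves a gap, so overlap is clamped at 0.
    return max(0.0, (tile - stride) / tile)


def chips_along_axis(extent: int, tile: int, stride: int) -> int:
    """Number of full chips along one axis (assumption: no partial chips)."""
    if extent < tile:
        return 0
    return (extent - tile) // stride + 1


tile_size = {"x": 256, "y": 256}
stride_size = {"x": 128, "y": 128}

overlap_x = overlap_fraction(tile_size["x"], stride_size["x"])
overlap_y = overlap_fraction(tile_size["y"], stride_size["y"])
```

With the values above, each chip shares half of its width and height with its neighbor. Setting the stride to 256 would remove the overlap entirely.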


The format of the output metadata labels.

The five options for output metadata labels are KITTI rectangles, PASCAL VOC rectangles, Classified Tiles (a class map), RCNN Masks, and Labeled Tiles. If your input training sample data is a feature class layer, such as a building layer or a standard classification training sample file, use the KITTI or PASCAL VOC rectangles option. The output metadata is then a .txt file (KITTI) or an .xml file (PASCAL VOC) describing the training samples that fall within the minimum bounding rectangle. The name of the metadata file matches the input source image name. If your input training sample data is a class map, use the Classified Tiles option as your output metadata format.


  • KITTI_rectangles (Default): The metadata follows the same format as the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) Object Detection Evaluation dataset. The KITTI dataset is a vision benchmark suite. The label files are plain text files. All values, both numerical and strings, are separated by spaces, and each row corresponds to one object. For more information, see KITTI metadata format.
  • PASCAL_VOC_rectangles: The metadata follows the same format as the Pattern Analysis, Statistical Modeling and Computational Learning, Visual Object Classes (PASCAL_VOC) dataset. The PASCAL VOC dataset is a standardized image dataset for object class recognition. The label files are XML files and contain information about image name, class value, and bounding boxes. For more information, see PASCAL Visual Object Classes.
  • Classified_Tiles: This option will output one classified image chip per input image chip. No other metadata for each image chip is used. Only the statistics output has more information on the classes, such as class names, class values, and output statistics.
  • RCNN_Masks: This option will output image chips that have a mask on the areas where the sample exists. The model trained on these chips generates bounding boxes and segmentation masks for each instance of an object in the image. It is based on a Feature Pyramid Network (FPN) and a ResNet101 backbone in the deep learning framework model.
  • Labeled_Tiles: Each output tile will be labeled with a specific class.

PASCAL_VOC_rectangles example:

<?xml version="1.0"?>
<layout>
    <part>
        <bndbox>
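Reading bounding boxes back out of a PASCAL VOC style label file might look like the sketch below. The sample document is hypothetical: the `layout`, `part`, and `bndbox` elements come from the fragment above, while the `class`, `xmin`, `ymin`, `xmax`, and `ymax` child elements are assumptions borrowed from the standard PASCAL VOC layout and may not match the exported files exactly.

```python
import xml.etree.ElementTree as ET

# Hypothetical label file content; element names below <part> are assumptions.
SAMPLE = """<?xml version="1.0"?>
<layout>
  <part>
    <class>1</class>
    <bndbox>
      <xmin>31.85</xmin>
      <ymin>101.52</ymin>
      <xmax>256.00</xmax>
      <ymax>256.00</ymax>
    </bndbox>
  </part>
</layout>
"""


def read_boxes(xml_text: str) -> list:
    """Return (class value, (xmin, ymin, xmax, ymax)) for every labeled part."""
    root = ET.fromstring(xml_text)
    boxes = []
    for part in root.iter("part"):
        box = part.find("bndbox")
        if box is None:
            continue  # skip parts without a bounding box
        coords = tuple(
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((part.findtext("class"), coords))
    return boxes


boxes = read_boxes(SAMPLE)
```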


The field that contains the class values. If no field is specified, the system searches for a value or classvalue field. If the feature does not contain a class field, the system determines that all records belong to one class.
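The fallback described above can be pictured as a small lookup. This is a sketch only: `resolve_class_field` is a hypothetical helper, and both the case-insensitive matching and the preference for `value` over `classvalue` are assumptions rather than documented behavior.

```python
from typing import Optional, Sequence


def resolve_class_field(
    field_names: Sequence[str], requested: Optional[str] = None
) -> Optional[str]:
    """Pick the class value field, or None when every record is one class."""
    if requested:
        return requested
    # Assumption: field names are matched without regard to case.
    by_lower = {name.lower(): name for name in field_names}
    for candidate in ("value", "classvalue"):
        if candidate in by_lower:
            return by_lower[candidate]
    return None  # no class field: all records belong to a single class
```

For example, a feature layer with fields `OBJECTID` and `Classvalue` would resolve to `Classvalue`, while a layer with neither candidate field would be treated as a single class.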




The radius for a buffer around each training sample to delineate a training sample area. This allows you to create circular polygon training samples from points.




A polygon feature class that delineates the area where image chips will be created.

Only image chips that fall completely within the polygons will be created.


{"itemId": <portal item id>}
{"url": <feature service url>}


The rotation angle that will be used to generate additional image chips.

An image chip will be generated with a rotation angle of 0, which means no rotation. It will then be rotated at the specified angle to create an additional image chip. The same training samples will be captured at multiple angles in multiple image chips for data augmentation.

The default rotation angle is 0.




Contains settings that affect task execution. This task has the following settings:

  • Extent (extent): A bounding box that defines the analysis area.
  • Cell Size (cellSize): The output raster will have the resolution specified by cell size.
  • Export All Tiles (exportAllTiles): Specifies whether image chips that do not overlap labeled data are also exported.
    • True: Export all the image chips, including those that do not overlap labeled data. This is the default.
    • False: Export only the image chips that overlap the labeled data.


    {"exportAllTiles" : true}

  • Start Index (startIndex): Sets the start index for the sequence of image chips, which lets you append more image chips to an existing sequence. The default value is 0.


    {"startIndex": 0 }


The response format. The default response format is html.

Values: html | json

Additional KITTI metadata format information

The table below describes the 15 values in the KITTI metadata format. Only 5 of the possible 15 values are used in the tool: the class name (in column 1) and the minimum bounding rectangle composed of four image coordinate locations (columns 5–8). The minimum bounding rectangle encompasses the training chip used in the deep learning classifier.

  • Column 1, Class value: The class value of the object listed in the stats.txt file.
  • Columns 5–8, Bounding box: The two-dimensional bounding box of objects in the image, based on a 0-based image space coordinate index. The bounding box contains the four coordinates for the left, top, right, and bottom pixels.
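A label row in this format can be read by splitting on whitespace and keeping column 1 and columns 5–8. The sketch below is illustrative: `KittiLabel`, `parse_kitti_line`, and the sample row are hypothetical, and the remaining columns are simply ignored.

```python
from typing import NamedTuple


class KittiLabel(NamedTuple):
    class_value: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_line(line: str) -> KittiLabel:
    """Extract the class value (column 1) and bounding box (columns 5-8)."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 values, got {len(fields)}")
    # Columns are 1-based in the description, so columns 5-8 are indexes 4-7.
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return KittiLabel(fields[0], left, top, right, bottom)


# Hypothetical row with 15 space-separated values.
sample_row = "1 0.0 0 0.0 31.85 101.52 256.00 256.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
label = parse_kitti_line(sample_row)
```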




When you submit a request, the task assigns a unique job ID for the transaction.


{
  "jobId": "<unique job identifier>",
  "jobStatus": "<job status>"
}

After the initial request is submitted, you can use the jobId to periodically check the status of the job and messages as described in Checking job status. Once the job has successfully completed, you use the jobId to retrieve the results. To track the status, you can make a request of the following form:

https://<raster analysis tools url>/ExportTrainingDataforDeepLearning/jobs/<jobId>

When the status of the job request is esriJobSucceeded, you can access the results of the analysis by making a request of the following form:

https://<raster analysis tools url>/ExportTrainingDataforDeepLearning/jobs/<jobId>/results/outLocation
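The status check and result retrieval described above could be wrapped in a small polling loop. This is a minimal sketch, not the service's client library: `job_url`, `result_url`, and `wait_for_job` are hypothetical names, and `fetch_json` is a caller-supplied function that is assumed to perform the HTTP request (including `f=json` and any token) and return the decoded response as a dictionary.

```python
import time
from typing import Callable

# Statuses after which the job will not change state again.
TERMINAL_STATUSES = {
    "esriJobSucceeded",
    "esriJobFailed",
    "esriJobCancelled",
    "esriJobTimedOut",
}


def job_url(tools_url: str, job_id: str) -> str:
    """URL used to check the status of a submitted job."""
    base = tools_url.rstrip("/")
    return f"{base}/ExportTrainingDataforDeepLearning/jobs/{job_id}"


def result_url(tools_url: str, job_id: str) -> str:
    """URL used to read the outLocation result once the job has succeeded."""
    return f"{job_url(tools_url, job_id)}/results/outLocation"


def wait_for_job(
    fetch_json: Callable[[str], dict],
    tools_url: str,
    job_id: str,
    interval: float = 5.0,
    max_polls: int = 120,
) -> str:
    """Poll the job URL until a terminal status is reported, then return it."""
    url = job_url(tools_url, job_id)
    for _ in range(max_polls):
        status = fetch_json(url).get("jobStatus")
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the HTTP call in as a function keeps the sketch independent of any particular HTTP library and makes the loop easy to exercise with canned responses.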

Example usage

Below is a sample request URL for ExportTrainingDataforDeepLearning.
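As a rough illustration of how such a request could be assembled in code, the sketch below URL-encodes a set of parameters. Several names here are assumptions that do not appear on this page: the parameter names (`inputRaster`, `outputLocation`, `inputClassData`, `chipFormat`, `tileSize`, `strideSize`, `metadataFormat`, `context`, `f`), the `submitJob` endpoint, and the helper `build_submit_url`. The placeholder values are reused from the examples on this page; check the names against the service before relying on them.

```python
import json
from urllib.parse import urlencode


def build_submit_url(tools_url: str, params: dict) -> str:
    """Encode parameters into a request URL (endpoint name is an assumption)."""
    # JSON-valued parameters are serialized; plain strings are passed through.
    encoded = {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in params.items()
    }
    base = tools_url.rstrip("/")
    return f"{base}/ExportTrainingDataforDeepLearning/submitJob?{urlencode(encoded)}"


# All parameter names below are assumptions; values mirror this page's examples.
request_url = build_submit_url(
    "https://<raster analysis tools url>",
    {
        "inputRaster": {"itemId": "8cfbd3ec25584d0d8fed23b8ff7c43b"},
        "outputLocation": "/rasterStores/myrasterstore/rooftops",
        "inputClassData": {"url": "<image or feature service url>"},
        "chipFormat": "TIFF",
        "tileSize": {"x": 256, "y": 256},
        "strideSize": {"x": 128, "y": 128},
        "metadataFormat": "KITTI_rectangles",
        "context": {"exportAllTiles": True, "startIndex": 0},
        "f": "json",
    },
)
```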

JSON Response example

The response returns the outLocation parameter, which has properties for parameter name, data type, and value. The content of the value is always the output data store item's itemId or URL. The parameter provides the output location of the training data.

 "paramName": "outLocation",
 "dataType": "GPString",
 "value": {
   "uri": "/rasterStores/myrasterstore/rooftops"