ArcGIS REST API

Describe Dataset

Describe Dataset workflow diagram

The Describe Dataset task provides an overview of your big data. By default, the tool outputs a table layer containing calculated field statistics and a JSON string outlining geometry and time settings for the input layer.

Optionally, the tool can also output a feature layer representing a sample of your input features, a single polygon feature layer that represents the extent of your input features, or both. You can choose to output one, both, or neither.

For example, imagine you are tasked with completing an analysis workflow on a large volume of data. You want to try the workflow, but it could take hours or days with your full dataset. Instead of using time and resources running the full analysis, first create a sample layer to efficiently test your workflow before running it on the full dataset.

Note:

Describe Dataset was introduced in ArcGIS Enterprise 10.7.

Request URL

https://<analysis URL>/DescribeDataset/submitJob

Request parameters

Parameter | Description

inputLayer

(Required)

The table, point, line, or polygon feature layer that will be described, summarized, and sampled.

Syntax: As described in Feature input, this parameter can be one of the following:

  • A URL to a feature service layer with an optional filter to select specific features
  • A URL to a big data catalog service layer with an optional filter to select specific features
  • A feature collection

REST web example:

  • {"url" : "https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0", "filter": "Month = 'September'"}

REST scripting example:

  • "inputLayer" : {"url": "https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0", "filter": "Month = 'September'"}

sampleSize

The task will output a feature layer representing a sample of features from the inputLayer. Specify the number of sample features to return. If the input value is null, 0, or empty, then no sample layer will be created. The output will have the same schema, geometry, and time type as the input layer.

The default is null.

REST web example: 450

REST scripting example: "sampleSize": 450

extentOutput

The task will output a single rectangle feature representing the extent of the inputLayer if this value is set to true.

The default is false.

Values: true | false

REST web example: true

REST scripting example: "extentOutput" : true

outputName

This value is required when you choose to output an extent feature layer or sample feature layer. The task will create a feature service of the resulting layers. You define the name of the service.

REST web example: myOutput

REST scripting example: "outputName" : "myOutput"

context

The context parameter contains additional settings that affect task execution. For this task, there are four settings:

  • Extent (extent)—A bounding box that defines the analysis area. Only those features that intersect the bounding box will be analyzed.
  • Processing spatial reference (processSR)—The features will be projected into this coordinate system for analysis.
  • Output spatial reference (outSR)—The features will be projected into this coordinate system after the analysis to be saved. The output spatial reference for the spatiotemporal big data store is always WGS84.
  • Data store (dataStore)—Results will be saved to the specified data store. The default is the spatiotemporal big data store.
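The four context settings above can be collected into one JSON object. The following is a minimal sketch under stated assumptions: the helper name build_context is hypothetical, the envelope layout of the extent and the {"wkid": ...} form of the spatial references follow common ArcGIS REST conventions rather than anything stated on this page, and the data store value shown is an assumed identifier that should be checked against your deployment.

```python
import json


def build_context(extent=None, process_wkid=None, out_wkid=None, data_store=None):
    """Assemble a context dictionary, leaving out any setting that was not given."""
    context = {}
    if extent is not None:
        context["extent"] = extent
    if process_wkid is not None:
        context["processSR"] = {"wkid": process_wkid}
    if out_wkid is not None:
        context["outSR"] = {"wkid": out_wkid}
    if data_store is not None:
        context["dataStore"] = data_store
    return context


# Illustrative values only: the envelope layout and the data store name are assumptions.
context = build_context(
    extent={
        "xmin": -100.0, "ymin": 20.0, "xmax": -60.0, "ymax": 45.0,
        "spatialReference": {"wkid": 4326},
    },
    process_wkid=3857,
    out_wkid=4326,
    data_store="SPATIOTEMPORAL_DATA_STORE",
)

# The context is sent as a single JSON-encoded string.
context_json = json.dumps(context)
print(context_json)
```

Settings that are left out are simply absent from the object, so the service defaults described above would apply to them.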

f

The response format. The default response format is html.

Values: html | json
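The request parameters above could be assembled for a submitJob call roughly as follows. This is a minimal sketch, not a definitive client: the function name build_describe_dataset_request and the base URL https://example.com/analysis are hypothetical, and sending the parameters as form fields with JSON-encoded values is an assumption. The sketch only builds the request; it does not send it.

```python
import json
from urllib.parse import urlencode


def build_describe_dataset_request(analysis_url, input_layer, sample_size=None,
                                   extent_output=False, output_name=None,
                                   context=None, token=None):
    """Return (url, form_fields) for a DescribeDataset submitJob request."""
    # outputName is required whenever a sample or extent layer is requested.
    if (sample_size or extent_output) and not output_name:
        raise ValueError("outputName is required for a sample or extent layer")

    fields = {
        "inputLayer": json.dumps(input_layer),
        "extentOutput": "true" if extent_output else "false",
        "f": "json",
    }
    if sample_size:  # null, 0, or empty means no sample layer is created
        fields["sampleSize"] = str(sample_size)
    if output_name:
        fields["outputName"] = output_name
    if context:
        fields["context"] = json.dumps(context)
    if token:
        fields["token"] = token
    return f"{analysis_url.rstrip('/')}/DescribeDataset/submitJob", fields


# Hypothetical base URL; the input layer is the example used on this page.
url, fields = build_describe_dataset_request(
    "https://example.com/analysis",
    {
        "url": "https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0",
        "filter": "Month = 'September'",
    },
    sample_size=450,
    extent_output=True,
    output_name="myOutput",
)
body = urlencode(fields)  # form-encoded body, ready for an HTTP POST
print(url)
print(body)
```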

Response

When you submit a request, the service assigns a unique job ID for the transaction.

Syntax:
{
"jobId": "<unique job identifier>",
"jobStatus": "<job status>"
}

To check the status of the job, make a request of the following form:

https://<analysis url>/DescribeDataset/jobs/<jobId>
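Checking the job repeatedly until it finishes can be sketched as below. The helper names job_status_url and wait_for_job, the base URL, and the job ID j1 are hypothetical. Only esriJobSucceeded is named on this page; the other terminal and intermediate status strings are assumptions based on common ArcGIS geoprocessing job statuses. The status lookup is passed in as a function so the sketch runs without a server.

```python
import time

# esriJobSucceeded comes from this page; the other values are assumed.
TERMINAL_STATUSES = {
    "esriJobSucceeded", "esriJobFailed", "esriJobCancelled", "esriJobTimedOut",
}


def job_status_url(analysis_url, job_id):
    """Build the job URL used to check status."""
    return f"{analysis_url.rstrip('/')}/DescribeDataset/jobs/{job_id}"


def wait_for_job(fetch_status, interval_seconds=5.0, max_checks=120, sleep=time.sleep):
    """Call fetch_status() until the returned jobStatus is terminal."""
    for _ in range(max_checks):
        status = fetch_status()["jobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_seconds)
    raise TimeoutError("job did not reach a terminal status in time")


# Simulated responses stand in for real HTTP calls to the job URL.
responses = iter([
    {"jobId": "j1", "jobStatus": "esriJobSubmitted"},
    {"jobId": "j1", "jobStatus": "esriJobExecuting"},
    {"jobId": "j1", "jobStatus": "esriJobSucceeded"},
])
final = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
print(job_status_url("https://example.com/analysis", "j1"), final)
```

In a real client, fetch_status would request the job URL with f=json and a token, then return the parsed response.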

Accessing results

When the status of the job request is esriJobSucceeded, you can access the results of the analysis by making requests of the following form:

https://<analysis url>/DescribeDataset/jobs/<jobId>/results/outputJSON?token=<your token>&f=json
https://<analysis url>/DescribeDataset/jobs/<jobId>/results/output?token=<your token>&f=json
https://<analysis url>/DescribeDataset/jobs/<jobId>/results/extentLayer?token=<your token>&f=json
https://<analysis url>/DescribeDataset/jobs/<jobId>/results/sampleLayer?token=<your token>&f=json
https://<analysis url>/DescribeDataset/jobs/<jobId>/results/processInfo?token=<your token>&f=json
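The five result URLs above share one pattern, so they can be generated from the job ID. A minimal sketch, assuming a hypothetical helper name result_urls, base URL, job ID, and token value:

```python
from urllib.parse import urlencode

# Result names as listed in the URLs above.
RESULT_NAMES = ("outputJSON", "output", "extentLayer", "sampleLayer", "processInfo")


def result_urls(analysis_url, job_id, token):
    """Return a mapping of result name to its request URL."""
    query = urlencode({"token": token, "f": "json"})
    base = f"{analysis_url.rstrip('/')}/DescribeDataset/jobs/{job_id}/results"
    return {name: f"{base}/{name}?{query}" for name in RESULT_NAMES}


urls = result_urls("https://example.com/analysis", "j1", "abc")
for name, url in urls.items():
    print(name, url)
```

The extentLayer and sampleLayer results are only meaningful when the corresponding outputs were requested in the original job.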

Parameter | Description

outputJSON

outputJSON returns JSON that details the properties of the input layer.

The following characteristics will be defined in the output JSON:

  • datasetName—The name of the inputLayer. In the following example, the input layer name is my_bigdata_dataset.
  • datasetSource—The storage location for the input dataset. This could be one of the following: ArcGIS Data Store - Relational, ArcGIS Data Store - Spatiotemporal, Big Data File Share - <your big data file share name>, Feature Collection, or Remote Feature Service. In the following example, the dataset source is a big data file share named my_registered_file_share.
  • recordCount—The count of nonempty input records. In the following example, the input layer has 236 records.
  • geometry—A list of input layer geometry settings including geometry type, spatial reference, spatial extent, and record counts. In the following example, the input layer has point geometry, a spatial reference of 4326, and 6 records do not have a geometry.
  • time—A list of input layer time settings including time type, record counts, and temporal extent. In the following example, the input features have time of type interval, and 4 of the time values are empty.

Example:
{"url": 
"https://<analysis url>/DescribeDataset/jobs/<jobId>/results/outputJSON"}

The result has properties for parameter name, data type, and value. The value property is JSON that defines general inputLayer characteristics.

{	
    "paramName": "outputJSON",
    "dataType": "GPString",	
    "value": {
        "datasetName": "my_bigdata_dataset",	
        "datasetSource": "Big Data File Share - my_registered_file_share",
        "recordCount": 236,		
        "geometry": {
            "geometryType": "Point",			
            "sref": {"wkid": 4326},			
            "countNonEmpty": 230,			
            "countEmpty": 6,			
            "spatialExtent": {
                "xmin": 895229.0608758491,
                "ymin": 557949.5851721496,				
                "xmax": 915995.4702218114,				
                "ymax": 597425.2187718959			
            }
        },
        "time": {
            "timeType": "Interval",
            "countNonEmpty": 232,			
            "countEmpty": 4,		
           	"temporalExtent":{
                "startTime": 1420059600000,				
                "endTime": 1420070280000			
            }
        }
    }
}
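A client might read this result as sketched below. The function name summarize_output_json is hypothetical. Two things are assumptions rather than documented behavior: that startTime and endTime are milliseconds since the Unix epoch (the usual convention for ArcGIS REST date values), and that the empty and nonempty counts always add up to recordCount, which holds for the example values. Because the data type is GPString, the sketch also accepts value as a serialized JSON string.

```python
import json
from datetime import datetime, timezone


def summarize_output_json(result):
    """Extract counts and the temporal extent from an outputJSON result."""
    value = result["value"]
    if isinstance(value, str):  # tolerate a serialized JSON string
        value = json.loads(value)

    def from_millis(millis):
        # Assumes epoch milliseconds, interpreted as UTC.
        return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)

    records = value["recordCount"]
    geometry, time_info = value["geometry"], value["time"]
    return {
        "records": records,
        "geometry_counts_match":
            geometry["countNonEmpty"] + geometry["countEmpty"] == records,
        "time_counts_match":
            time_info["countNonEmpty"] + time_info["countEmpty"] == records,
        "start": from_millis(time_info["temporalExtent"]["startTime"]),
        "end": from_millis(time_info["temporalExtent"]["endTime"]),
    }


# Values copied from the example response above (spatial extent omitted).
example = {
    "paramName": "outputJSON",
    "dataType": "GPString",
    "value": {
        "datasetName": "my_bigdata_dataset",
        "recordCount": 236,
        "geometry": {"geometryType": "Point", "countNonEmpty": 230, "countEmpty": 6},
        "time": {
            "timeType": "Interval",
            "countNonEmpty": 232,
            "countEmpty": 4,
            "temporalExtent": {"startTime": 1420059600000, "endTime": 1420070280000},
        },
    },
}
summary = summarize_output_json(example)
print(summary["start"].isoformat(), summary["end"].isoformat())
```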

See Feature output for more information about how the result layer is accessed.

output

By default, output returns a table of field statistics.

For numeric fields, the following statistics will be calculated:

  • Count—Totals the number of values of all the features in the field.
  • Sum—Calculates the total value of all the features in the field.
  • Mean—Calculates the average of all the features in the field.
  • Min—Finds the smallest value of all the features in the field.
  • Max—Finds the largest value of all the features in the field.
  • Range—Finds the difference between the Min and Max values.
  • Stddev—Finds the standard deviation of all the features in the field.
  • Var—Finds the variance of all the features in the field.

For date fields, the following statistics will be calculated:

  • Count—Totals the number of values of all the features in the field.
  • Min—Finds the earliest date value of all the features in the field.
  • Max—Finds the latest date value of all the features in the field.
  • Range—Finds the difference between the Min and Max date values.

For string fields, the following statistics will be calculated:

  • Count—Totals the number of strings for all the features in the field.
  • Any—Returns a sample string from the field.
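To make the statistics above concrete, the following sketch computes the numeric and string statistics for small in-memory lists. It illustrates the definitions only and is not how the service computes them. The function names are hypothetical, and whether the service uses the sample or the population formula for Stddev and Var is not stated here; the sketch uses Python's sample versions. Date fields follow the same pattern as numeric fields, limited to Count, Min, Max, and Range.

```python
import statistics


def numeric_stats(values):
    """Compute the numeric field statistics for a list of numbers."""
    return {
        "Count": len(values),
        "Sum": sum(values),
        "Mean": statistics.fmean(values),
        "Min": min(values),
        "Max": max(values),
        "Range": max(values) - min(values),
        "Stddev": statistics.stdev(values),   # sample standard deviation (assumed)
        "Var": statistics.variance(values),   # sample variance (assumed)
    }


def string_stats(values):
    """Compute the string field statistics; Any is one example value."""
    return {"Count": len(values), "Any": values[0] if values else None}


wind_speeds = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration
stats = numeric_stats(wind_speeds)
names = string_stats(["ALPHA", "BETA", "GAMMA"])  # made-up values
print(stats)
print(names)
```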

Example:
{"url": 
"https://<analysis url>/DescribeDataset/jobs/<jobId>/results/output"}

The result has properties for parameter name, data type, and value. The contents of value depend on the outputName parameter provided in the initial request. The value contains the URL of the feature service layer.

{
"paramName":"output", 
"dataType":"GPRecordSet",
"value":{"url":"<hosted feature service layer url>"}
}

extentLayer

extentLayer returns a single polygon feature equal to the extent of the input features. This layer is only output if extentOutput is set to true in the initial request. Context settings will be used while creating this layer.

Example:
{"url": 
"https://<analysis url>/DescribeDataset/jobs/<jobId>/results/extentLayer"}

The result has properties for parameter name, data type, and value. The contents of value depend on the outputName parameter provided in the initial request. The value contains the URL of the feature service layer.

{
"paramName":"extentLayer", 
"dataType":"GPRecordSet",
"value":{"url":"<hosted feature service layer url>"}
}

See Feature output for more information about how the result layer is accessed.

sampleLayer

sampleLayer returns a subset of the input layer as a feature layer with the same geometry type, time type, and schema as the input. Context settings will be used while creating this output layer. This layer is only output if the sampleSize value is set to 1 or greater.

Example:
{"url": 
"https://<analysis url>/DescribeDataset/jobs/<jobId>/results/sampleLayer"}

The result has properties for parameter name, data type, and value. The contents of value depend on the outputName parameter provided in the initial request. The value contains the URL of the feature service layer.

{
"paramName":"sampleLayer", 
"dataType":"GPRecordSet",
"value":{"url":"<hosted feature service layer url>"}
}

See Feature output for more information about how the result layer is accessed.

processInfo

The processInfo output contains strings that summarize the Describe Dataset result. These strings are used for reporting by the Describe Dataset tool in the portal's Map Viewer. You can create your own custom reports for your application using these strings. There are four parts in the returned JSON:

  • messageCode—The serial number for each unique message.
  • message—Text that may or may not contain parameters (in ${paramsName} format) that need to be replaced by values.
  • params—A dictionary of the keys and values to be inserted into the ${paramsName} parameter in the message.
  • style—The formatting of the report produced by the Describe Dataset tool in the map viewer.

Example:
{
"messageCode" : "BD_101220",
"message" : ["Dataset name","MY_DATASET_NAME"],
"params" : {},
"style" : "<table><tr><th></th><th></th><th></th><th></th><th></th><th></th></tr>"
}
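The ${paramsName} placeholders use the same syntax as Python's string.Template, so a custom report could fill them in as sketched below. The function name render_message is hypothetical. The first entry is the example above; the second entry, including its message code and text, is invented purely to show substitution. The sketch accepts message as either a single string or a list of strings, since the example above uses a list.

```python
from string import Template


def render_message(entry):
    """Substitute params into each part of a processInfo message."""
    message = entry["message"]
    params = entry.get("params", {})
    parts = message if isinstance(message, list) else [message]
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return [Template(part).safe_substitute(params) for part in parts]


# Entry from the example above (style omitted).
documented = {
    "messageCode": "BD_101220",
    "message": ["Dataset name", "MY_DATASET_NAME"],
    "params": {},
}

# Hypothetical entry, made up to demonstrate placeholder replacement.
hypothetical = {
    "messageCode": "EXAMPLE_1",
    "message": "Record count: ${count}",
    "params": {"count": "236"},
}

print(render_message(documented))
print(render_message(hypothetical))
```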