An outlier analysis resulting in a new layer with statistically significant clusters and anomalous population values by USA county.
What is an outlier analysis?
An outlier analysis is the process of identifying both clusters and anomalous values (outliers) in spatial data. It determines whether the attribute value or point count for each feature differs significantly from that of its neighbors, with significance measured by the resulting z-score and p-value. To execute the analysis, use the spatial analysis service and the FindOutliers operation.
The analysis classifies features as being:
High-High: a high value surrounded by other high values
High-Low: a high value surrounded by low values
Low-High: a low value surrounded by high values
Low-Low: a low value surrounded by other low values
A feature is part of a cluster when its value is similar to those of its neighbors. A feature is considered an outlier when its value is dissimilar to those of its neighbors.
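The four categories above can be illustrated with a minimal sketch. It standardizes each value, averages the standardized values of the neighbors, and labels the feature by the sign of each. This is only an illustration of the classification idea: it skips the significance test (z-score and p-value) that the service applies, and the function name, feature IDs, and sample values are hypothetical.

```python
from statistics import mean, pstdev

def classify_features(values, neighbors):
    """Label each feature by comparing its own standardized value with the
    mean standardized value of its neighbors (no significance test)."""
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        raise ValueError("all values are identical; nothing to classify")
    z = {fid: (v - mu) / sigma for fid, v in values.items()}

    labels = {}
    for fid, nbrs in neighbors.items():
        if not nbrs:
            labels[fid] = "No neighbors"
            continue
        lag = mean(z[n] for n in nbrs)          # average neighbor value
        own = "High" if z[fid] > 0 else "Low"   # the feature itself
        around = "High" if lag > 0 else "Low"   # its surroundings
        labels[fid] = f"{own}-{around}"
    return labels

# Hypothetical sample data: six features with made-up values.
values = {"A": 100, "B": 95, "C": 5, "D": 4, "E": 6, "F": 90}
neighbors = {
    "A": ["B"], "B": ["A"], "C": ["A", "B"],
    "D": ["E"], "E": ["D"], "F": ["D", "E"],
}
labels = classify_features(values, neighbors)
for fid in sorted(labels):
    print(fid, labels[fid])
```

In this sample, A, B, D, and E fall into clusters (High-High or Low-Low), while C and F are the outliers (Low-High and High-Low).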
An outlier analysis helps to find spatial trends and patterns in the data that may not be visible at first glance.
Real-world examples of this analysis include the following:
Finding outliers (either high or low counts) of traffic crashes or crimes.
Determining whether there are outlier (anomalous) spending trends or real estate prices.
Finding areas of a country that have a high population despite being surrounded by areas with low populations.
How to perform an outlier analysis
The general steps for performing an outlier analysis are as follows:
Review the parameters for the FindOutliers operation.
Send a request to get the spatial analysis service URL.
Execute a job request with the following URL and parameters:
A string representing the name of the hosted feature layer to return with the results. NOTE: If you do not include this parameter, the results are returned as a feature collection (JSON).
{"serviceProperties": {"name": "<SERVICE_NAME>"}}
context
A bounding box or output spatial reference for the analysis.
"extent": {"xmin": <XMIN>, "ymin": <YMIN>, "xmax": <XMAX>, "ymax": <YMAX>}
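As a rough sketch of how the parameters above might be assembled, the following builds a form-encoded request body without sending it. Several details are assumptions and are not stated in the text above: the parameter name outputName for the hosted feature layer name, the {"url": ...} shape of the analysisLayer value, the f=json parameter, the helper function itself, and every sample value (layer URL, service name, extent coordinates).

```python
import json
from urllib.parse import urlencode

def build_find_outliers_params(analysis_layer_url, output_name=None,
                               extent=None, **extra):
    """Assemble FindOutliers job parameters as strings ready for form encoding.

    Hypothetical helper: parameter names other than `context`,
    `analysisLayer`, `shapeType`, and `analysisField` are assumptions.
    """
    params = {
        "f": "json",  # assumed response format parameter
        "analysisLayer": json.dumps({"url": analysis_layer_url}),  # assumed shape
    }
    if output_name is not None:
        # Assumed parameter name; the value follows the JSON shown above.
        params["outputName"] = json.dumps(
            {"serviceProperties": {"name": output_name}}
        )
    if extent is not None:
        params["context"] = json.dumps({"extent": extent})
    params.update(extra)  # e.g. shapeType or analysisField
    return params

params = build_find_outliers_params(
    "https://example.com/arcgis/rest/services/Crashes/FeatureServer/0",  # hypothetical
    output_name="Outlier_results",  # hypothetical service name
    extent={"xmin": -123.0, "ymin": 45.0, "xmax": -122.0, "ymax": 46.0},  # made-up box
    shapeType="fishnet",  # exact value casing is an assumption
)
body = urlencode(params)
print(body)
```

Omitting output_name leaves the parameter out of the request, which corresponds to the case noted above where results are returned as a feature collection (JSON).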
Code examples
Identify outliers in traffic crashes
This example uses the FindOutliers operation to determine where there are statistically significant outliers of Traffic crashes counted within a fishnet grid. The anomalous areas, shown as dark red and dark blue, indicate either significantly high or significantly low instances of crashes compared to neighboring clusters.
In the analysis, the analysisLayer value is the Traffic crashes hosted feature layer. The points in the layer are counted within a fishnet grid, as specified by the shapeType parameter.
Outlier analysis showing clusters and outliers of traffic crashes.
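To make the idea of counting points within a fishnet concrete, here is a minimal sketch that bins point coordinates into square grid cells. It only illustrates the concept and is not how the service performs the aggregation. The function name, the cell size, and the sample coordinates are hypothetical, and planar (projected) coordinates are assumed.

```python
from collections import Counter
from math import floor

def count_points_in_fishnet(points, cell_size, origin=(0.0, 0.0)):
    """Count points per square cell; cells are keyed by (column, row)."""
    ox, oy = origin
    counts = Counter()
    for x, y in points:
        col = floor((x - ox) / cell_size)
        row = floor((y - oy) / cell_size)
        counts[(col, row)] += 1
    return counts

# Hypothetical crash locations in planar units.
points = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (2.1, 2.9)]
counts = count_points_in_fishnet(points, cell_size=1.0)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

The per-cell counts are the kind of values an outlier analysis would then compare against neighboring cells.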
This example uses an outlier analysis to determine where there are statistically significant outliers for home values in Portland. The anomalous areas, shown as dark red and dark blue, indicate either significantly high or significantly low home values compared to neighboring clusters.
In the analysis, the analysisLayer value is the Enriched Portland hexagon bins hosted feature layer. The feature layer was created from generated hexagon bins that were enriched with data from the GeoEnrichment service. To analyze home values, set the analysisField parameter to the AVG_CY attribute.
Learn how to perform related analyses interactively with Map Viewer and programmatically with ArcGIS API for Python, ArcGIS REST JS, and ArcGIS REST API.