
GeoAnalytics Tools in Run Python Script

The Run Python Script task allows you to programmatically execute most GeoAnalytics Tools with Python using an API that is available when you run the task. A geoanalytics object is instantiated automatically and gives you access to each tool using the syntax shown in the example and table below. Each tool accepts input layers as Spark DataFrames and will return results as a Spark DataFrame or collection of Spark DataFrames. To learn more, see Reading and writing layers in pyspark. DataFrames are held in memory and can be written to a data store at any time. This allows you to chain together multiple GeoAnalytics Tools without writing out intermediate results.


The API described in this topic can only be used within the Run Python Script task and should not be confused with the ArcGIS API for Python, which uses a different syntax to execute stand-alone GeoAnalytics Tools and is intended for use outside of the Run Python Script task.

In the example below, the Detect Incidents task and Find Hot Spots task are used together and only the final DataFrame is written to a data store as a feature service layer. The input layer (represented in the example below by layers[0]) is a big data file share dataset of city bus locations recorded at 1-minute intervals for 15 days. To learn more about using layers, see Reading and writing layers in pyspark.

Chaining together GeoAnalytics Tools with DataFrames

import time

# Run Detect Incidents to find all bus locations where delay status has changed from False to True
exp = "var dly = $track.field[\"dly\"].history(-2); return dly[0]==\"False\" && dly[1]==\"True\""
delay_incidents = geoanalytics.detect_incidents(input_layer = layers[0], track_fields = ["vid"], start_condition_expression = exp, output_mode = "Incidents")

# Use the resulting DataFrame as input to the Find Hot Spots task
delay_hotspots = geoanalytics.find_hot_spots(point_layer = delay_incidents, bin_size = 0.1, bin_size_unit = "Miles", neighborhood_distance = 1, neighborhood_distance_unit = "Miles", time_step_interval = 1, time_step_interval_unit = "Days")

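# Write the Find Hot Spots result to the spatiotemporal big data store
# (the layer name is an example; time.time() keeps it unique between runs)
delay_hotspots.write.format("webgis").save("Bus_Delay_HS_{0}".format(time.time()))
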
For more examples, see Examples: Scripting custom analysis with the Run Python Script task.
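The start condition in the example above is an Arcade expression passed as a Python string, so every inner double quote has to be escaped. As a minimal sketch, the same expression can be written with a single-quoted Python string instead. The name exp_readable is ours and is not part of the API; the snippet only builds the string and does not call any tool.

```python
# Original form from the example: inner double quotes escaped with backslashes.
exp = "var dly = $track.field[\"dly\"].history(-2); return dly[0]==\"False\" && dly[1]==\"True\""

# Same expression with single quotes delimiting the Python string,
# so the Arcade double quotes need no escaping.
exp_readable = (
    'var dly = $track.field["dly"].history(-2); '
    'return dly[0]=="False" && dly[1]=="True"'
)

# Both spellings produce an identical string.
assert exp == exp_readable
```

Either variable can be passed as start_condition_expression; the choice only affects how readable the script is.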

The table below describes the method signature for GeoAnalytics Tools in Run Python Script. All tools can be called except for Copy To Data Store and Append Data. The parameter syntax is the same as that of the REST API except where noted. See the documentation for each tool for descriptions of parameter syntax and tool outputs.

In every tool method that accepts them, the time_step_repeat and time_step_repeat_unit arguments correspond to the timeStepRepeatInterval and timeStepRepeatIntervalUnit REST parameters, respectively.
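When cross-referencing a tool's REST documentation, it can help to translate an argument name into its REST spelling. The helper below is a sketch only: to_rest_name is a hypothetical function, and it assumes that every argument other than the two exceptions noted above is simply the snake_case form of the camelCase REST parameter. Check the REST documentation for each tool rather than relying on that assumption.

```python
# Argument names that do not follow the plain snake_case -> camelCase rule.
# These two overrides come from the note above; all other names are ASSUMED
# to convert mechanically.
REST_NAME_OVERRIDES = {
    "time_step_repeat": "timeStepRepeatInterval",
    "time_step_repeat_unit": "timeStepRepeatIntervalUnit",
}


def to_rest_name(python_name):
    """Return the likely REST parameter name for a Run Python Script argument."""
    if python_name in REST_NAME_OVERRIDES:
        return REST_NAME_OVERRIDES[python_name]
    head, *rest = python_name.split("_")
    return head + "".join(part.capitalize() for part in rest)


print(to_rest_name("time_step_repeat"))         # timeStepRepeatInterval
print(to_rest_name("time_step_interval_unit"))  # timeStepIntervalUnit
```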





Aggregate Points

aggregate_points(point_layer, bin_type = None, bin_size = None, bin_size_unit = None, polygon_layer = None, time_step_interval = None, time_step_interval_unit = None, time_step_repeat = None, time_step_repeat_unit = None, time_step_reference = None, summary_fields = None)


Build Multi-Variable Grid

build_multi_variable_grid(bin_type = "Square", bin_size = None, bin_size_unit = None, input_layers = None, variable_calculations = None)


input_layers should be a list of DataFrames.

Calculate Density

calculate_density(input_layer, fields = None, weight = "Uniform", bin_type = "Square", bin_size = None, bin_size_unit = None, time_step_interval = None, time_step_interval_unit = None, time_step_repeat = None, time_step_repeat_unit = None, time_step_reference = None, radius = None, radius_unit = None, area_units = "SquareKilometers")


Calculate Field

calculate_field(input_layer, field_name, data_type, expression, track_aware = None, track_fields = None, time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None)


Calculate Motion Statistics

calculate_motion_statistics(input_layer, track_fields, track_history_window = 3, motion_statistics = ["All"], idle_distance_tolerance = None, idle_distance_tolerance_unit = None, idle_time_tolerance = None, idle_time_tolerance_unit = None, time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None, distance_method = "Geodesic", distance_unit = "Meters", duration_unit = "Seconds", speed_unit = "MetersPerSecond", acceleration_unit = "MetersPerSecondSquared", elevation_unit = "Meters")


Clip Layer

clip_layer(input_layer, clip_layer)


Create Buffers

create_buffers(input_layer, distance = None, distance_unit = None, field = None, method = "Planar", dissolve_option = "None", dissolve_fields = None, summary_fields = None, multipart = False)


Create Space Time Cube

create_space_time_cube(point_layer, bin_size, bin_size_unit, time_step_interval, time_step_interval_unit, time_step_alignment = None, time_step_reference = None, summary_fields = None, output_name = None)


Returns the local path to the resulting space-time cube on an ArcGIS GeoAnalytics Server machine. The cube is written to a temporary directory and will be deleted unless you copy it to a different location.
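Because the cube is removed unless it is copied, a script will usually copy it right after the tool returns. The sketch below uses only the Python standard library. keep_cube is a hypothetical helper, and it assumes the returned value is an ordinary file path string and that the destination folder is writable from the machine running the script.

```python
import os
import shutil


def keep_cube(cube_path, dest_dir):
    """Copy the space-time cube at cube_path into dest_dir and return the new path."""
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, os.path.basename(cube_path))
    shutil.copy2(cube_path, dest_path)
    return dest_path


# Hypothetical usage inside Run Python Script (layer, sizes, and folder are placeholders):
# cube_path = geoanalytics.create_space_time_cube(
#     point_layer=layers[0], bin_size=1, bin_size_unit="Miles",
#     time_step_interval=1, time_step_interval_unit="Days")
# saved_path = keep_cube(cube_path, "/data/cubes")
```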

Describe Dataset

describe_dataset(input_layer, sample_size = None, extent_output = False)


Example result: {"output":<DataFrame>, "outputJSON":<string>, "extentLayer":<DataFrame>, "sampleLayer":<DataFrame>}

Detect Incidents

detect_incidents(input_layer, track_fields, start_condition_expression, end_condition_expression = None, output_mode = "AllFeatures", time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None)


Dissolve Boundaries

dissolve_boundaries(input_layer, dissolve_fields = None, summary_fields = None, multipart = False)


Enrich From Multi-Variable Grid

enrich_from_multi_variable_grid(input_features, grid_layer, enrich_attributes = None)


Find Dwell Locations

find_dwell_locations(input_layer, track_fields, distance_method = "Planar", distance_tolerance, distance_tolerance_unit, time_tolerance, time_tolerance_unit, summary_fields = None, output_type = "DwellMeanCenters")
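The signature above lists required arguments such as distance_tolerance after an optional one, so passing values by position is easy to get wrong. A minimal sketch of calling the tool by keyword instead is shown here. All values are illustrative placeholders, and the guard around the call assumes that geoanalytics and layers are available as global names inside the task.

```python
# Collect the arguments by name so their order in the signature does not matter.
# Every value here is an illustrative placeholder, not a recommendation.
dwell_params = {
    "track_fields": ["vid"],
    "distance_method": "Planar",
    "distance_tolerance": 100,
    "distance_tolerance_unit": "Meters",
    "time_tolerance": 10,
    "time_tolerance_unit": "Minutes",
    "output_type": "DwellMeanCenters",
}

# geoanalytics and layers exist only inside the Run Python Script task,
# so the call is skipped when this snippet runs anywhere else.
if "geoanalytics" in globals() and "layers" in globals():
    dwell_locations = geoanalytics.find_dwell_locations(
        input_layer=layers[0], **dwell_params
    )
```

The same keyword style applies to any tool in the table and keeps long calls readable.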


Find Hot Spots

find_hot_spots(point_layer, bin_size, bin_size_unit, neighborhood_distance, neighborhood_distance_unit, time_step_interval = None, time_step_interval_unit = None, time_step_alignment = None, time_step_reference = None)


Find Point Clusters

find_point_clusters(input_layer, cluster_method = "DBSCAN", time_method = None, search_duration = None, search_duration_unit = None, min_features_cluster = None, search_distance = None, search_distance_unit = None)


Find Similar Locations

find_similar_locations(input_layer, search_layer, analysis_fields, most_or_least_similar = "MostSimilar", match_method = "AttributeValues", number_of_results = 10, append_fields = None)


Example result: {"output":<DataFrame>, "processInfo":<string>}

Forest-based Classification And Regression

forest_based_classification_and_regression(prediction_type = "Train", in_features = None, features_to_predict = None, variable_predict = None, explanatory_variables = None, number_of_trees = 100, minimum_leaf_size = None, maximum_tree_depth = None, sample_size = 100, random_variables = None, percentage_for_validation = 10, create_variable_importance_table = False, explanatory_variable_matching = None)


Example result: {"outputTrained":<DataFrame>, "variableOfImportance":<DataFrame>, "outputPredicted":<DataFrame>, "processInfo":<string>}

Generalized Linear Regression

generalized_linear_regression(input_layer, features_to_predict = None, dependent_variable = None, explanatory_variables = None, regression_family = "Continuous", generate_coefficient_table = False, explanatory_variable_matching = None, dependent_mapping = None)


Example result: {"output":<DataFrame>, "coefficientTable":<DataFrame>, "outputPredicted":<DataFrame>, "processInfo":<string>}

Geocode Locations

geocode_locations(input_layer, geocode_service_url, geocode_parameters, source_country = None, category = None, include_attributes = None, locator_parameters = None)


Geographically Weighted Regression

geographically_weighted_regression(input_layer, explanatory_variables, dependent_variable, model_type = "Continuous", neighborhood_type = "NumberOfNeighbors", neighborhood_selection_method = "UserDefined", distance_band = None, distance_band_unit = None, number_of_neighbors = None, local_weighting_scheme = "Bisquare")


Join Features

join_features(target_layer, join_layer, join_operation = "JoinOneToOne", join_fields = None, summary_fields = None, spatial_relationship = None, spatial_near_distance = None, spatial_near_distance_unit = None, temporal_relationship = None, temporal_near_distance = None, temporal_near_distance_unit = None, attribute_relationship = None, join_condition = None)


Merge Layers

merge_layers(input_layer, merge_layer, merging_attributes = None)


Overlay Layers

overlay_layers(input_layer, overlay_layer, overlay_type = "Intersect", include_overlaps = True)


Reconstruct Tracks

reconstruct_tracks(input_layer, track_fields, method = "Planar", buffer_field = None, summary_fields = None, time_split = None, time_split_unit = None, distance_split = None, distance_split_unit = None, time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None)


Summarize Attributes

summarize_attributes(input_layer, fields, summary_fields = None)


Summarize Center And Dispersion

summarize_center_and_dispersion(input_layer, summary_type, ellipse_size = None, weight_field = None, group_fields = None)


Example result: {"centralFeatureLayer":<DataFrame>, "meanCenterLayer":<DataFrame>, "medianCenterLayer":<DataFrame>, "ellipseLayer":<DataFrame>}

Summarize Within

summarize_within(summary_polygons = None, bin_type = None, bin_size = None, bin_size_unit = None, summarized_layer = None, standard_summary_fields = None, weighted_summary_fields = None, sum_shape = True, shape_units = None, group_by_field = None, minority_majority = False, percent_shape = False)


Example result: {"output":<DataFrame>, "groupBySummary":<DataFrame>}

Trace Proximity Events

trace_proximity_events(input_points, entity_id_field, entities_of_interest = None, entities_of_interest_record_set = None, distance_method, spatial_search_distance, spatial_search_distance_unit, temporal_search_distance, temporal_search_distance_unit, include_tracks_layer = False, max_trace_depth = 2147483647, attribute_match_criteria = None, attribute_constraints_expresssion = None)


Example result: {"output":<DataFrame>, "tracksLayer":<DataFrame>}
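Several tools in the table return a collection of results rather than a single DataFrame, as the "Example result" entries show. The sketch below pulls one entry out of such a result. get_output is a hypothetical helper, and it assumes the collection can be indexed by key like a Python dictionary, which the example results suggest but this topic does not state outright.

```python
def get_output(result, key="output"):
    """Return one entry from a multi-output tool result, with a clear error if absent."""
    try:
        return result[key]
    except KeyError:
        available = ", ".join(sorted(result.keys()))
        raise KeyError(
            "No '{0}' in tool result; available keys: {1}".format(key, available)
        )


# Hypothetical usage inside Run Python Script (layers are placeholders):
# result = geoanalytics.summarize_within(summary_polygons=layers[0],
#                                        summarized_layer=layers[1])
# summary_df = get_output(result)
# group_df = get_output(result, "groupBySummary")
```

Each extracted DataFrame can then be passed to another tool or written to a data store like any other DataFrame.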

In addition to the tools listed above, a project tool is provided with the geoanalytics package that allows you to project the geometry of a DataFrame into the specified spatial reference.






project(input_features, output_coord_system)


input_features is the DataFrame to project and output_coord_system is the WKT or WKID of the spatial reference to use.

Example: geoanalytics.project(df, 2796)