Executing big data tools

The GeoAnalytics tools are exposed through a set of submodules within the geoanalytics module. For the list of available tools, refer to the page titled Working with big data. On this page, we will learn how to execute big data tools.

Ensuring your GIS supports GeoAnalytics

Before executing a tool, we need to ensure that the ArcGIS Enterprise GIS is set up with a licensed GeoAnalytics server. To do so, call the is_supported() method after connecting to your Enterprise portal. See the Components of ArcGIS URLs documentation for details on the URLs to enter in the GIS parameters based on your particular Enterprise configuration.

In [ ]:
# connect to Enterprise GIS
from arcgis.gis import GIS
import arcgis.geoanalytics

gis = GIS("your_gis_portal_url", "username", "password")
In [ ]:
# check if GeoAnalytics is supported
arcgis.geoanalytics.is_supported()
Out[ ]:
True

When geoanalytics methods are called without a gis parameter, they use the active GIS connection, which you can query with the arcgis.env.active_gis property. If you are working with more than one GIS object, you can pass the desired GIS object as the gis parameter of the method. For example, let us create a connection to another Enterprise deployment and check whether GeoAnalytics is supported there.

In [ ]:
ago_gis = GIS("https://geoinfo_portal.esri.com/portal", "username", "password")
arcgis.geoanalytics.is_supported(ago_gis)
Out[ ]:
False
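
Because is_supported() can return False, as it does above, it can help to stop early rather than let a tool fail later. The following is a minimal sketch, not part of the API: the helper name require_geoanalytics and its check parameter are hypothetical, and check exists only so that the guard can be tried without a live portal. By default it falls back to arcgis.geoanalytics.is_supported.

```python
def require_geoanalytics(gis=None, check=None):
    """Return `gis` if GeoAnalytics is supported, otherwise raise RuntimeError.

    `check` is a hypothetical hook for this sketch; when omitted, the real
    arcgis.geoanalytics.is_supported function is used. Passing gis=None
    means "use the active GIS connection", as described above.
    """
    if check is None:
        # Imported lazily so the helper can be defined without arcgis installed.
        from arcgis.geoanalytics import is_supported as check
    if not check(gis):
        raise RuntimeError("GeoAnalytics is not supported on this GIS")
    return gis


# Hypothetical usage with the connection created earlier on this page:
# gis = require_geoanalytics(gis)
```

Raising an exception keeps the check in one place, so later cells can assume that a GeoAnalytics server is available.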

Executing a GeoAnalytics tool

Looking for big data file share items

When you add a big data file share, a corresponding item gets created on your portal. You can search for it like any other portal Item and query its layers.

In [ ]:
search_result = gis.content.search("", item_type = "big data file share")
search_result
Out[ ]:
[<Item title:"bigDataFileShares_hdfs_test" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_qalab" type:Big Data File Share owner:sharing3>,
 <Item title:"bigDataFileShares_Sample_US_City_Crime" type:Big Data File Share owner:admin>,
 <Item title:"bigDataFileShares_FileShareFolder" type:Big Data File Share owner:admin>]
In [ ]:
data_item = search_result[3]
data_item
Out[ ]:
bigDataFileShares_FileShareFolder
Big Data File Share by admin
Last Modified: March 29, 2018
0 comments, 0 views
In [ ]:
data_item.layers
Out[ ]:
[<Layer url:"https://dev0001561.esri.com/gax/rest/services/DataStoreCatalogs/bigDataFileShares_FileShareFolder/BigDataCatalogServer/Earthquakes">,
 <Layer url:"https://dev0001561.esri.com/gax/rest/services/DataStoreCatalogs/bigDataFileShares_FileShareFolder/BigDataCatalogServer/Hurricanes">]
In [ ]:
earthquakes = data_item.layers[0]
earthquakes
Out[ ]:
<Layer url:"https://dev0001561.esri.com/gax/rest/services/DataStoreCatalogs/bigDataFileShares_FileShareFolder/BigDataCatalogServer/Earthquakes">
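
Picking a layer by position, as in data_item.layers[0], depends on the order in which the layers are listed. Since each layer URL shown above ends with the dataset name, a small lookup by name may be more robust. This is a sketch under that assumption; layer_by_name is a hypothetical helper, not an arcgis function, and it relies only on the url attribute of each layer.

```python
def layer_by_name(layers, name):
    """Return the first layer whose URL ends in `/<name>`; raise KeyError if none does."""
    for layer in layers:
        # Take the last path segment of the URL, ignoring any trailing slash.
        last_segment = layer.url.rstrip("/").rsplit("/", 1)[-1]
        if last_segment == name:
            return layer
    raise KeyError(name)


# Hypothetical usage with the item found above:
# earthquakes = layer_by_name(data_item.layers, "Earthquakes")
```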

Executing the Aggregate Points tool

Access the aggregate_points() tool through the summarize_data module. This example uses the Aggregate Points tool to aggregate the point features representing earthquakes into 1-kilometer square bins. The tool creates an output feature layer in your portal that you can access once processing is complete.

In [ ]:
from arcgis.geoanalytics.summarize_data import aggregate_points

The GeoAnalytics Tools use a process spatial reference during execution. Analyses with square or hexagon bins require a projected coordinate system. Below, we use the World Cylindrical Equal Area projection (WKID 54034), which is the default used when running tools in ArcGIS Online. All results are stored in the spatiotemporal datastore of the Enterprise in the WGS 84 spatial reference.

See the GeoAnalytics Documentation for a full explanation of analysis environment settings.

In [ ]:
arcgis.env.process_spatial_reference = 54034
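
Assigning to arcgis.env.process_spatial_reference changes the setting for every tool run later in the session. If only one analysis needs a particular value, the setting can be applied temporarily and then restored. The context manager below is a generic sketch; temporary_setting is a hypothetical helper, not part of arcgis, and it works on any object with settable attributes.

```python
from contextlib import contextmanager


@contextmanager
def temporary_setting(target, name, value):
    """Set target.<name> to `value` inside the with-block, then restore the old value."""
    # If the attribute did not exist before, it is restored to None afterwards.
    previous = getattr(target, name, None)
    setattr(target, name, value)
    try:
        yield target
    finally:
        setattr(target, name, previous)


# Hypothetical usage with the environment module from this page:
# with temporary_setting(arcgis.env, "process_spatial_reference", 54034):
#     ...  # run the tool here
```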

The ArcGIS Platform, including the ArcGIS API for Python, manages and transforms geographic data with a large suite of tools and functions collectively known as geoprocessing. The GeoAnalytics Tools in the ArcGIS API for Python are a subset of geoprocessing tools, and operate in the context of a geoprocessing environment. You can set various aspects of this environment to control how tools are executed and what messages you receive during and after the execution. See the Logging and error handling section in the API for Python Geoprocessing Guide Advanced concepts for ways to control messaging, including the arcgis.env.verbose setting.

In [ ]:
import arcgis.env

arcgis.env.verbose = True
In [ ]:
agg_result = aggregate_points(earthquakes, bin_size=1, bin_size_unit='Kilometers', output_name='EQAgg_5mi_hex')
Submitted.
Executing...
Executing (AggregatePoints): AggregatePoints "Feature Set" # 1 Kilometers # # # # # # # "{"serviceProperties": {"name": "EQAgg_5mi_hex", "serviceUrl": "http://dev0001560.esri.com/server/rest/services/Hosted/EQAgg_5mi_hex/FeatureServer"}, "itemProperties": {"itemId": "ece87b3a74094a85929a2f998407e724"}}" "{"processSR": {"wkid": 54034}}"
Start Time: Thu Mar 29 16:05:24 2018
Using URL based GPRecordSet param: https://<your_gis_portal>/<web_adaptor>/rest/services/DataStoreCatalogs/bigDataFileShares_FileShareFolder/BigDataCatalogServer/Earthquakes
{"messageCode":"BD_101033","message":"'pointLayer' will be projected into the processing spatial reference.","params":{"paramName":"pointLayer"}}
{"messageCode":"BD_101028","message":"Starting new distributed job with 2 tasks.","params":{"totalTasks":"2"}}
{"messageCode":"BD_101029","message":"0/2 distributed tasks completed.","params":{"completedTasks":"0","totalTasks":"2"}}
{"messageCode":"BD_101029","message":"1/2 distributed tasks completed.","params":{"completedTasks":"1","totalTasks":"2"}}
{"messageCode":"BD_101029","message":"2/2 distributed tasks completed.","params":{"completedTasks":"2","totalTasks":"2"}}
{"messageCode":"BD_101081","message":"Finished writing results:"}
{"messageCode":"BD_101082","message":"* Count of features = 75614","params":{"resultCount":"75614"}}
{"messageCode":"BD_101083","message":"* Spatial extent = {\"xmin\":-179.9999999999985,\"ymin\":-72.47391474883104,\"xmax\":180,\"ymax\":87.1601840635686}","params":{"extent":"{\"xmin\":-179.9999999999985,\"ymin\":-72.47391474883104,\"xmax\":180,\"ymax\":87.1601840635686}"}}
{"messageCode":"BD_101084","message":"* Temporal extent = None","params":{"extent":"None"}}
{"messageCode":"BD_0","message":"Feature service layer created: http://<your_gis_portal>/<web_adaptor>/rest/services/Hosted/EQAgg_5mi_hex/FeatureServer/0","params":{"serviceUrl":"http://<your_gis_portal>/<web_adaptor>/rest/services/Hosted/EQAgg_5mi_hex/FeatureServer/0"}}
Succeeded at Thu Mar 29 16:06:13 2018 (Elapsed Time: 48.17 seconds)
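
Most of the verbose messages above are JSON objects with a messageCode, a message, and optional params, mixed with plain-text status lines. If you have captured that output as text, a sketch like the following could pull out specific values such as the feature count. The function names are hypothetical, and the message code BD_101082 is taken from the sample output above; codes and message shapes may differ on other versions.

```python
import json


def parse_job_messages(lines):
    """Return the dicts for lines that are JSON objects carrying a messageCode."""
    messages = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # plain-text status lines such as "Submitted."
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not valid JSON; ignore it
        if isinstance(record, dict) and "messageCode" in record:
            messages.append(record)
    return messages


def result_count(messages):
    """Return the feature count reported by message BD_101082, or None if absent."""
    for record in messages:
        if record.get("messageCode") == "BD_101082":
            return int(record["params"]["resultCount"])
    return None


# Three lines copied from the output above.
sample = [
    "Executing...",
    '{"messageCode":"BD_101029","message":"2/2 distributed tasks completed.","params":{"completedTasks":"2","totalTasks":"2"}}',
    '{"messageCode":"BD_101082","message":"* Count of features = 75614","params":{"resultCount":"75614"}}',
]
print(result_count(parse_job_messages(sample)))  # 75614
```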

The Aggregate Points tool, like many other GeoAnalytics tools, returns a feature layer item that contains the processed results.

