Creating hurricane tracks using GeoAnalytics

The sample code below uses big data analytics (GeoAnalytics) to reconstruct hurricane tracks from data registered on a big data file share in the GIS. Note that this functionality is currently available with ArcGIS Enterprise 10.5 and is not yet available with ArcGIS Online.
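
Since these tools require an ArcGIS Enterprise with a federated GeoAnalytics server, it is worth verifying availability up front. A minimal check, assuming the same portal connection used throughout this sample:

import arcgis
from arcgis.gis import GIS

# Connect to an ArcGIS Enterprise (the portal used throughout this sample)
gis = GIS('https://pythonapi.playground.esri.com/portal', 'arcgis_python', 'amazing_arcgis_123')

# is_supported() returns True when a GeoAnalytics server is available to this GIS
print(arcgis.geoanalytics.is_supported(gis))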

Reconstruct tracks

Reconstruct tracks is a data aggregation tool available in the arcgis.geoanalytics module. This tool works with a layer of time-enabled point or polygon features. It first determines which points belong to a track using an identification number or string. Then, using the time at each location, it orders the points sequentially and transforms them into a line representing the path of movement.
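
To build intuition for what the tool does, here is a minimal, hypothetical pandas sketch (not the GeoAnalytics implementation) that groups time-enabled points by a track identifier and orders each group by time into a sequence of vertices, which is conceptually the line the tool produces:

import pandas as pd

# Hypothetical sample of time-enabled hurricane points
points = pd.DataFrame({
    'serial_num': ['A1', 'A1', 'A1', 'B2', 'B2'],
    'iso_time': pd.to_datetime(['2006-04-03 00:00', '2006-04-03 06:00',
                                '2006-04-03 12:00', '1995-01-01 00:00',
                                '1995-01-01 06:00']),
    'longitude': [114.4, 114.9, 115.3, 51.9, 52.4],
    'latitude': [-14.4, -14.6, -14.9, -13.8, -14.0],
})

# Sort by time, group by the track identifier, and collect the ordered
# coordinates -- conceptually, the vertices of each track line.
tracks = (points.sort_values('iso_time')
                .groupby('serial_num')[['longitude', 'latitude']]
                .apply(lambda g: list(map(tuple, g.values))))
print(tracks)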

Data used

For this sample, hurricane data spanning a period of over 50 years, totaling about 150,000 points split across 5 shapefiles, was used. The National Hurricane Center provides similar datasets that can be used for exploratory purposes.

To illustrate the nature of the data, a subset was published as a feature service and can be visualized as shown below:

In [10]:
from arcgis.gis import GIS

#Let us connect to an ArcGIS Enterprise
gis = GIS('https://pythonapi.playground.esri.com/portal', 'arcgis_python', 'amazing_arcgis_123')
hurricane_pts = gis.content.get('ebdb876ca1a74cc89a81c3f8ee481e94')
hurricane_pts
Out[10]:
Hurricane_tracks_points
Feature Layer Collection by arcgis_python
Last Modified: May 27, 2021
0 comments, 4 views
In [1]:
subset_map = gis.map("USA")
subset_map
Out[1]:
In [3]:
subset_map.add_layer(hurricane_pts)

Inspect the data attributes

Let us query the first layer in hurricane_pts and view its attribute table as a pandas DataFrame.

In [14]:
hurricane_pts.layers[0].query(as_df=True).head()
Out[14]:
serial_num season num basin sub_basin name iso_time nature latitude longitude ... center wind_wmo1 pres_wmo1 track_type size Wind INSTANT_DATETIME globalid OBJECTID SHAPE
0 1927265N10325 1927 3 NA MM NOT NAMED 9/26/1927 0:00 TS 16.8 -43.6 ... atcf 32.263 -100.000 main 35000 35000 1927-09-26 00:00:00 {2360A7E6-07CE-C8E9-B05C-1BE4CC0466AC} 1 {"x": -43.6, "y": 16.8, "spatialReference": {"...
1 1978347S20041 1979 2 SI MM 02S:ANGELE 12/24/1978 12:00 NR -22.0 36.4 ... reunion -100.000 -100.000 main 0 0 1978-12-24 12:00:00 {5263FA06-0416-E055-F3A2-7CF1F9F3CD21} 2 {"x": 36.4, "y": -22, "spatialReference": {"wk...
2 1994362S11054 1995 4 SI MM CHRISTELLE 12/31/1994 0:00 TS -13.8 51.9 ... reunion 15.938 16.458 main 25000 25000 1994-12-31 00:00:00 {357134CE-40B1-75F9-C82C-5DE32098E161} 3 {"x": 51.9, "y": -13.8, "spatialReference": {"...
3 2006093S15115 2006 14 SI WA HUBERT 4/3/2006 12:00 NR -14.4 114.4 ... bom 8.490 23.063 main 25300 25300 2006-04-03 12:00:00 {4BA29993-3C4B-831A-AC79-4E940FD4D00D} 4 {"x": 114.4, "y": -14.4, "spatialReference": {...
4 1951009S16140 1951 3 SP EA 09P 1/23/1951 23:00 NR -21.9 143.0 ... bom -100.000 4.181 main 0 0 1951-01-23 23:00:00 {4BB38CC9-7C65-EDEB-9CE8-61CA9D948743} 5 {"x": 143, "y": -21.9, "spatialReference": {"w...

5 rows × 22 columns

Create a data store

For the GeoAnalytics server to process your big data, the data needs to be registered as a data store. In our case, the data is in multiple shapefiles, and we will register the folder containing these files as a data store of type bigDataFileShare.

Let us connect to an ArcGIS Enterprise

In [4]:
gis = GIS('https://pythonapi.playground.esri.com/portal', 'arcgis_python', 'amazing_arcgis_123')

Get the GeoAnalytics datastores and search them for the registered datasets:

In [5]:
# Query the data stores available
import arcgis
datastores = arcgis.geoanalytics.get_datastores()

bigdata_fileshares = datastores.search(id='a215eebc-1bab-42d5-9aa0-45fe2549ba55')
bigdata_fileshares
Out[5]:
[<Datastore title:"/bigDataFileShares/NYC_taxi_data15" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/GA_Data" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/ServiceCallsOrleansTest" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/calls" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/all_hurricanes" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/NYCdata" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/hurricanes_1848_1900" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/ServiceCallsOrleans" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/hurricanes_dask_csv" type:"bigDataFileShare">,
 <Datastore title:"/bigDataFileShares/hurricanes_dask_shp" type:"bigDataFileShare">,
 <Datastore title:"/cloudStores/cloud_store" type:"cloudStore">]

The all_hurricanes dataset is registered as a big data file share with the GeoAnalytics datastore, so we can reference it:

In [6]:
data_item = bigdata_fileshares[4]  # the all_hurricanes big data file share

If there is no big data file share for the hurricane track data registered on the server, we can register one that points to the shared folder containing the shapefiles.

In [17]:
# data_item = datastores.add_bigdata("Hurricane_tracks", r"\\path_to_hurricane_data")
Big Data file share exists for Hurricane_tracks

Once a big data file share is registered, the GeoAnalytics server processes all the valid file types to discern the schema of the data, including information about the geometry in a dataset. If the dataset is time-enabled, as is required to use some GeoAnalytics Tools, the manifest reports the necessary metadata about how time information is stored as well.

This process can take a few minutes depending on the size of your data. Once processed, querying the manifest property returns the schema. As you can see below, the schema is similar to that of the subset we observed earlier in this sample.

In [23]:
data_item.manifest['datasets'][0]  # for brevity, only a portion is printed
Out[23]:
{'name': 'hurricanes',
 'format': {'type': 'shapefile', 'extension': 'shp'},
 'schema': {'fields': [{'name': 'serial_num', 'type': 'esriFieldTypeString'},
   {'name': 'season', 'type': 'esriFieldTypeBigInteger'},
   {'name': 'num', 'type': 'esriFieldTypeBigInteger'},
   {'name': 'basin', 'type': 'esriFieldTypeString'},
   {'name': 'sub_basin', 'type': 'esriFieldTypeString'},
   {'name': 'name', 'type': 'esriFieldTypeString'},
   {'name': 'iso_time', 'type': 'esriFieldTypeString'},
   {'name': 'nature', 'type': 'esriFieldTypeString'},
   {'name': 'latitude', 'type': 'esriFieldTypeDouble'},
   {'name': 'longitude', 'type': 'esriFieldTypeDouble'},
   {'name': 'wind_wmo_', 'type': 'esriFieldTypeDouble'},
   {'name': 'pres_wmo_', 'type': 'esriFieldTypeBigInteger'},
   {'name': 'center', 'type': 'esriFieldTypeString'},
   {'name': 'wind_wmo1', 'type': 'esriFieldTypeDouble'},
   {'name': 'pres_wmo1', 'type': 'esriFieldTypeDouble'},
   {'name': 'track_type', 'type': 'esriFieldTypeString'},
   {'name': 'size', 'type': 'esriFieldTypeString'},
   {'name': 'Wind', 'type': 'esriFieldTypeBigInteger'}]},
 'geometry': {'geometryType': 'esriGeometryPoint',
  'spatialReference': {'wkid': 102682, 'latestWkid': 3452}},
 'time': {'timeType': 'instant',
  'timeReference': {'timeZone': 'UTC'},
  'fields': [{'name': 'iso_time',
    'formats': ['yyyy-MM-dd HH:mm:ss', 'MM/dd/yyyy HH:mm']}]}}
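
As a quick sanity check before running time-dependent tools, you can iterate over the manifest (using the structure shown above) and confirm each dataset is time-enabled:

# Report the time settings of every dataset in the big data file share
for dataset in data_item.manifest['datasets']:
    time_info = dataset.get('time')
    if time_info:
        print(f"{dataset['name']}: timeType={time_info['timeType']}")
    else:
        print(f"{dataset['name']}: not time-enabled")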

Perform data aggregation using the reconstruct tracks tool

When you add a big data file share, a corresponding item gets created in your GIS. You can search for it like a regular item and query its layers.

In [7]:
search_result = gis.content.search("bigDataFileShares_all_hurricanes", item_type="big data file share")
search_result
Out[7]:
[<Item title:"bigDataFileShares_all_hurricanes" type:Big Data File Share owner:api_data_owner>]
In [8]:
data_item = search_result[0]
data_item
Out[8]:
bigDataFileShares_all_hurricanes
Big Data File Share by api_data_owner
Last Modified: May 02, 2018
0 comments, 0 views
In [9]:
years_50 = data_item.layers[0]
years_50
Out[9]:
<Layer url:"https://pythonapi.playground.esri.com/ga/rest/services/DataStoreCatalogs/bigDataFileShares_all_hurricanes/BigDataCatalogServer/hurricanes">

Reconstruct tracks tool

The reconstruct_tracks() function is available in the arcgis.geoanalytics.summarize_data module. In this example, we use the tool to aggregate the numerous points into line segments showing the tracks followed by the hurricanes. The tool creates a feature layer item as its output, which can be accessed once processing is complete.

In [12]:
from arcgis.geoanalytics.summarize_data import reconstruct_tracks
from datetime import datetime as dt
In [20]:
agg_result = reconstruct_tracks(years_50, 
                                track_fields='Serial_Num',
                                output_name='construct tracks test' + str(dt.now().microsecond))
{"messageCode":"BD_101024","message":"Using geodesic method for geographic coordinate system."}

Inspect the results

Let us create a map and load the processed result, which is a feature service:

In [2]:
processed_map = gis.map("USA")
processed_map
Out[2]:
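
To actually load the result into the map (the step described above), a minimal sketch following the same subset_map.add_layer() pattern used earlier:

# Add the reconstructed tracks item to the map
processed_map.add_layer(agg_result)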