Introduction to the Spatially Enabled DataFrame

The Spatially Enabled DataFrame (SEDF) is a simple, intuitive object for manipulating geometric and attribute data.

New at version 1.5, the Spatially Enabled DataFrame is an evolution of the SpatialDataFrame (SDF) object that you may be familiar with. While the SDF object is still available for use, the team has stopped active development of it and is promoting the use of the new Spatially Enabled DataFrame pattern. The SEDF provides better memory management and the ability to handle larger datasets, and it follows the pattern that Pandas advocates as the path forward.

The Spatially Enabled DataFrame inserts a custom namespace called spatial into the popular Pandas DataFrame structure to give it spatial abilities. This allows you to use intuitive, pandorable operations on both the attribute and spatial columns. Thus, the SEDF is based on data structures inherently suited to data analysis, with natural operations for filtering and inspecting subsets of values, which are fundamental to statistical and geographic manipulations.
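The idea of a custom namespace on a DataFrame can be illustrated with the accessor extension mechanism that pandas exposes. The sketch below is not the ArcGIS implementation. It registers a hypothetical `demo_geo` namespace, and the accessor name, the `bbox` property, and the made-up `SHAPE` values are all assumptions chosen for the example.

```python
import pandas as pd


# Hypothetical accessor: "demo_geo" is an illustrative name, not part of arcgis.
@pd.api.extensions.register_dataframe_accessor("demo_geo")
class DemoGeoAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame on which the namespace was accessed.
        if "SHAPE" not in pandas_obj.columns:
            raise AttributeError("DataFrame needs a 'SHAPE' column")
        self._df = pandas_obj

    @property
    def bbox(self):
        """Return (xmin, ymin, xmax, ymax) of the point dictionaries in SHAPE."""
        xs = [pt["x"] for pt in self._df["SHAPE"]]
        ys = [pt["y"] for pt in self._df["SHAPE"]]
        return (min(xs), min(ys), max(xs), max(ys))


# Made-up sample data for illustration only.
cities = pd.DataFrame(
    {
        "NAME": ["A", "B", "C"],
        "SHAPE": [{"x": 1.0, "y": 5.0}, {"x": -2.0, "y": 3.0}, {"x": 4.0, "y": 0.5}],
    }
)

print(type(cities))          # the object is still a regular pandas DataFrame
print(cities.demo_geo.bbox)  # bounding box computed through the namespace
```

The DataFrame class itself is unchanged; the extra behavior lives entirely behind the registered attribute, which is why ordinary pandas operations keep working on the attribute columns.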

The dataframe reads from many sources, including shapefiles, Pandas DataFrames, feature classes, GeoJSON, and Feature Layers.

This document outlines some fundamentals of using the Spatially Enabled DataFrame object for working with GIS data.

import pandas as pd
from arcgis.features import GeoAccessor, GeoSeriesAccessor

Accessing GIS data

GIS users need to work with both published layers on remote servers (web layers) and local data, but they often lack a way to manipulate these datasets without permanently copying the data. The Spatially Enabled DataFrame solves this problem because it is an in-memory object that can read, write, and manipulate geospatial data.

The SEDF integrates with Esri's ArcPy site-package as well as the open source pyshp, shapely and fiona packages. This means the SEDF in the ArcGIS API for Python can use either geometry engine, giving you options for working with geospatial data regardless of your platform. The SEDF transforms data into the formats you need so you can use Python functionality to analyze and visualize geographic information.

Data can be read and scripted to automate workflows and just as easily visualized on maps in Jupyter notebooks. The SEDF can export data as feature classes or publish it directly to servers for sharing, according to your needs. Let's explore some of the options available with the versatile Spatially Enabled DataFrame namespace:

Reading Web Layers

Feature layers hosted on ArcGIS Online or ArcGIS Enterprise can be easily read into a Spatially Enabled DataFrame using the from_layer method. Once you read it into a SEDF object, you can create reports, manipulate the data, or convert it to a form that is comfortable and makes sense for its intended purpose.

Example: Retrieving an ArcGIS Online item and using the layers property to inspect the first 5 records of the layer

from arcgis import GIS
gis = GIS()
item = gis.content.get("85d0ca4ea1ca4b9abf0c51b9bd34de2e")
flayer = item.layers[0]

# create a Spatially Enabled DataFrame object
sdf = pd.DataFrame.spatial.from_layer(flayer)
sdf.head()
AGE_10_14AGE_15_19AGE_20_24AGE_25_34AGE_35_44AGE_45_54AGE_55_64AGE_5_9AGE_65_74AGE_75_84...PLACEFIPSPOP2010POPULATIONPOP_CLASSRENTER_OCCSHAPESTSTFIPSVACANTWHITE
02144231420023531388756436353206757992850...0408220395404034666563{"x": -12751215.004681978, "y": 4180278.406256...AZ04670332367
187686757412471560212223427332157975...0424895143641484761397{"x": -12755627.731115643, "y": 4164465.572856...AZ04138912730
2100010038332311206323743631106861653776...0425030262652697761963{"x": -12734674.294574209, "y": 3850472.723091...AZ04963622995
32730285021944674524074388440249981454608...0439370525275504176765{"x": -12725332.21151233, "y": 4096532.0908223...AZ04915947335
427322965202431823512310916322497916467...0463470255052976761681{"x": -12770984.257542243, "y": 3826624.133935...AZ0457216120

5 rows × 51 columns

When you inspect the type of the object, you get back a standard pandas DataFrame object. However, this object now has an additional SHAPE column that allows you to perform geometric operations. In other words, this DataFrame is now geo-aware.

type(sdf)
pandas.core.frame.DataFrame

Further, the DataFrame has a new spatial property that provides a list of geoprocessing operations that can be performed on the object. The rest of the guides in this section go into details of how to use these functionalities. So, sit tight.

Reading Feature Layer Data

As seen above, the SEDF can consume a Feature Layer served from either ArcGIS Online or ArcGIS Enterprise orgs. Let's take a step-by-step approach to break down the notebook cell above and then extract a subset of records from the feature layer.

Example: Examining Feature Layer content

Use the from_layer method on the SEDF to instantiate a data frame from an item's layer and inspect the first 5 records.

# Retrieve an item from ArcGIS Online from a known ID value
known_item = gis.content.get("85d0ca4ea1ca4b9abf0c51b9bd34de2e")
known_item
USA Major Cities
This layer presents the locations of cities within the United States with populations of approximately 10,000 or greater, all state capitals, and the national capital.
Feature Layer Collection by esri_dm
Last Modified: August 22, 2019
0 comments, 1,839,428 views
# Obtain the first feature layer from the item
fl = known_item.layers[0]

# Use the `from_layer` static method in the 'spatial' namespace on the Pandas' DataFrame
sdf = pd.DataFrame.spatial.from_layer(fl)

# Return the first 5 records. 
sdf.head()
AGE_10_14AGE_15_19AGE_20_24AGE_25_34AGE_35_44AGE_45_54AGE_55_64AGE_5_9AGE_65_74AGE_75_84...PLACEFIPSPOP2010POPULATIONPOP_CLASSRENTER_OCCSHAPESTSTFIPSVACANTWHITE
02144231420023531388756436353206757992850...0408220395404034666563{"x": -12751215.004681978, "y": 4180278.406256...AZ04670332367
187686757412471560212223427332157975...0424895143641484761397{"x": -12755627.731115643, "y": 4164465.572856...AZ04138912730
2100010038332311206323743631106861653776...0425030262652697761963{"x": -12734674.294574209, "y": 3850472.723091...AZ04963622995
32730285021944674524074388440249981454608...0439370525275504176765{"x": -12725332.21151233, "y": 4096532.0908223...AZ04915947335
427322965202431823512310916322497916467...0463470255052976761681{"x": -12770984.257542243, "y": 3826624.133935...AZ0457216120

5 rows × 51 columns

NOTE: See Pandas DataFrame head() method documentation for details.

You can also use SQL queries to return a subset of records by leveraging the ArcGIS API for Python's FeatureLayer object itself. When you run query() on a FeatureLayer, you get back a FeatureSet object. Calling the sdf property of the FeatureSet returns a Spatially Enabled DataFrame object. We then use the data frame's head() method to return the first 5 records and a subset of columns from the DataFrame:

Example: Feature Layer Query Results to a Spatially Enabled DataFrame

We'll use the AGE_45_54 column to query the data frame and return a new DataFrame with a subset of records. We can use the built-in zip() function to print the data frame attribute field names, and then use data frame syntax to view specific attribute fields in the output:

# Filter feature layer records with a sql query. 
# See https://developers.arcgis.com/rest/services-reference/query-feature-service-layer-.htm

df = fl.query(where="AGE_45_54 < 1500").sdf
for a,b,c,d in zip(df.columns[::4], df.columns[1::4],df.columns[2::4], df.columns[3::4]):
    print("{:<30}{:<30}{:<30}{:<}".format(a,b,c,d))
AGE_10_14                     AGE_15_19                     AGE_20_24                     AGE_25_34
AGE_35_44                     AGE_45_54                     AGE_55_64                     AGE_5_9
AGE_65_74                     AGE_75_84                     AGE_85_UP                     AGE_UNDER5
AMERI_ES                      ASIAN                         AVE_FAM_SZ                    AVE_HH_SZ
BLACK                         CAPITAL                       CLASS                         FAMILIES
FEMALES                       FHH_CHILD                     FID                           HAWN_PI
HISPANIC                      HOUSEHOLDS                    HSEHLD_1_F                    HSEHLD_1_M
HSE_UNITS                     MALES                         MARHH_CHD                     MARHH_NO_C
MED_AGE                       MED_AGE_F                     MED_AGE_M                     MHH_CHILD
MULT_RACE                     NAME                          OBJECTID                      OTHER
OWNER_OCC                     PLACEFIPS                     POP2010                       POPULATION
POP_CLASS                     RENTER_OCC                    SHAPE                         ST
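Note that zipping stepped slices stops at the shortest slice, so any names beyond the last full group of four are not printed (the listing above appears to show 48 names, while the layer was reported with 51 columns). One possible alternative, sketched here with a hypothetical helper `format_in_rows` and a made-up list of names, pads the final row instead of dropping it:

```python
from itertools import zip_longest


def format_in_rows(names, per_row=4, width=30):
    """Lay out names in fixed-width rows, keeping a short final row."""
    names = list(names)
    # Same stepped slices as the loop above: names[0::4], names[1::4], ...
    slices = [names[i::per_row] for i in range(per_row)]
    lines = []
    for row in zip_longest(*slices, fillvalue=""):
        lines.append("".join(f"{name:<{width}}" for name in row).rstrip())
    return lines


# Made-up column list whose length (6) is not a multiple of 4.
sample = ["NAME", "ST", "STFIPS", "POP2010", "VACANT", "WHITE"]
for line in format_in_rows(sample):
    print(line)
```

With a data frame, the helper could be called as `format_in_rows(df.columns)`.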
# Return a subset of columns on just the first 5 records
df[['NAME', 'AGE_45_54', 'POP2010']].head()
                   NAME  AGE_45_54  POP2010
0              Somerton       1411    14287
1              Anderson       1333     9932
2  Camp Pendleton South        127    10616
3                Citrus       1443    10866
4              Commerce       1478    12823
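The column subset can also be requested at query time rather than after the data frame is built. The following is a minimal sketch that assumes the `out_fields` parameter of the FeatureLayer `query()` method; `query_subset` is a hypothetical wrapper, and the two stub classes exist only so the snippet runs without a connection.

```python
def query_subset(layer, where, fields):
    """Query a layer-like object and return the resulting data frame.

    `layer` is expected to behave like a FeatureLayer: query() returns
    an object that exposes an `sdf` property.
    """
    feature_set = layer.query(where=where, out_fields=",".join(fields))
    return feature_set.sdf


# Stand-in objects so the sketch runs offline; they are NOT part of arcgis.
class _StubFeatureSet:
    def __init__(self, description):
        self.sdf = description


class _StubLayer:
    def query(self, where, out_fields):
        return _StubFeatureSet({"where": where, "out_fields": out_fields})


result = query_subset(_StubLayer(), "AGE_45_54 < 1500", ["NAME", "AGE_45_54", "POP2010"])
print(result)
# With a real layer: df = query_subset(fl, "AGE_45_54 < 1500", ["NAME", "AGE_45_54", "POP2010"])
```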

Accessing local GIS data

The SEDF can also access local geospatial data. Depending upon what Python modules you have installed, you'll have access to a wide range of functionality:

Example: Reading a Shapefile

You must authenticate to ArcGIS Online or ArcGIS Enterprise to use the from_featureclass() method to read a shapefile with a Python interpreter that does not have access to ArcPy.

g2 = GIS("https://www.arcgis.com", "username", "password")

g2 = GIS("https://pythonapi.playground.esri.com/portal", "arcgis_python", "amazing_arcgis_123")
# Use a raw string so backslashes such as "\t" are not treated as escape sequences
sdf = pd.DataFrame.spatial.from_featureclass(r"path\to\your\data\census_example\cities.shp")
sdf.tail()
FIDNAMECLASSSTSTFIPSPLACEFIPCAPITALAREALANDAREAWATERPOP_CLASS...MARHH_NO_CMHH_CHILDFHH_CHILDFAMILIESAVE_FAM_SZHSE_UNITSVACANTOWNER_OCCRENTER_OCCSHAPE
35523552East ProvidenceCityRI442296013.4053.2086...56583061414128502.9921309779120968434{'x': -71.3608270663031, 'y': 41.8015001782688...
35533553PawtucketCityRI44546408.7360.2597...67407543242185203.073181917721333116716{'x': -71.3759815680945, 'y': 41.8755001649055...
35543554Fall RiverCityMA252300031.0227.2027...90117594247235583.004185730981352125238{'x': -71.1469910908576, 'y': 41.6981001567767...
35553555SomersetCensus Designated PlaceMA25624658.1093.8676...27719128752602.98714315657231264{'x': -71.15319106847441, 'y': 41.748500174901...
35563556New BedfordCityMA254500020.1223.9047...88139104701240833.014151133331671121467{'x': -70.93370908847608, 'y': 41.651800155406...

5 rows × 48 columns
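Windows-style paths are safest when written as raw strings or built with pathlib, because in a regular string literal a sequence such as `\t` is read as a tab character. A small sketch of the difference, using the placeholder folder names from the example:

```python
from pathlib import PureWindowsPath

# In a regular literal, "\t" becomes a tab character.
plain = "path\to"
# A raw literal keeps every backslash as written.
raw = r"path\to\your\data\census_example\cities.shp"

print("\t" in plain)  # True
print("\t" in raw)    # False

# pathlib can also assemble the path from its parts (folder names are placeholders).
built = PureWindowsPath("path", "to", "your", "data", "census_example", "cities.shp")
print(str(built) == raw)  # True
```

Converting the pathlib object with `str()` before handing it to a reader function is the conservative choice when it is unclear whether path objects are accepted.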

Example: Reading a Featureclass from FileGDB

You must have fiona installed if you use the from_featureclass() method to read a feature class from FileGDB with a Python interpreter that does not have access to ArcPy.

# Use a raw string so backslashes such as "\t" are not treated as escape sequences
sdf = pd.DataFrame.spatial.from_featureclass(r"path\to\your\data\census_example\census.gdb\cities")
sdf.head()
OBJECTIDFIDNAMECLASSSTSTFIPSPLACEFIPCAPITALAREALANDAREAWATER...MARHH_NO_CMHH_CHILDFHH_CHILDFAMILIESAVE_FAM_SZHSE_UNITSVACANTOWNER_OCCRENTER_OCCSHAPE
010CollegeCensus Designated PlaceAK021675018.6700.407...93615233926403.13450139723951709{'x': -147.82719115699996, 'y': 64.84830019400...
121FairbanksCityAK022423031.8570.815...2259395105871873.1512357128238637212{'x': -147.72638162999996, 'y': 64.83809069700...
232KalispellCityMT30400755.4580.004...143314748034942.92653239034582684{'x': -114.31606412399998, 'y': 48.19780017900...
343Post FallsCityID16648109.6560.045...185120546746703.13669732846111758{'x': -116.93792709799999, 'y': 47.71555468000...
454DishmanCensus Designated PlaceWA53179853.3780.000...109613134525642.96440825726351516{'x': -117.27780913799995, 'y': 47.65654568400...

5 rows × 49 columns

Saving Spatially Enabled DataFrames

The SEDF can export data to various data formats for use in other applications.

Export Options

Export to Feature Class

The SEDF allows for the export of whole datasets or partial datasets.

Example: Export a whole dataset to a shapefile:

sdf.spatial.to_featureclass(location=r"c:\output_examples\census.shp")
'c:\\output_examples\\census.shp'

ArcGIS API for Python installations on macOS and Linux machines, as well as on Windows machines whose Python interpreters do not have access to ArcPy, can only write to shapefile format with the to_featureclass method. Writing to file geodatabases requires the ArcPy site-package.
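A script that has to run in both kinds of environments could choose its output target based on whether ArcPy is importable. The sketch below uses only the standard library; `has_arcpy` and `pick_output_path` are hypothetical helpers, the folder and geodatabase names are placeholders, and the geodatabase is assumed to exist already.

```python
import importlib.util
from pathlib import PurePath


def has_arcpy():
    """Return True if the arcpy package can be found by this interpreter."""
    return importlib.util.find_spec("arcpy") is not None


def pick_output_path(folder, name, arcpy_available, gdb_name="census.gdb"):
    """Hypothetical helper: target a file geodatabase only when ArcPy is present."""
    if arcpy_available:
        return str(PurePath(folder) / gdb_name / name)
    # Without ArcPy, fall back to a shapefile in the same folder.
    return str(PurePath(folder) / f"{name}.shp")


target = pick_output_path("output_examples", "cities", has_arcpy())
print(target)
# The result could then be passed on, e.g. sdf.spatial.to_featureclass(location=target)
```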

Example: Export dataset with a subset of columns and top 5 records to a shapefile:

for a,b,c,d in zip(sdf.columns[::4], sdf.columns[1::4], sdf.columns[2::4], sdf.columns[3::4]):
    print("{:<30}{:<30}{:<30}{:<}".format(a,b,c,d))
PLACENS                       GEOID                         NAMELSAD                      CLASSFP
FUNCSTAT                      ALAND                         AWATER                        INTPTLAT
columns = ['NAME', 'ST', 'CAPITAL', 'STFIPS', 'POP2000', 'POP2007', 'SHAPE']
sdf[columns].head().spatial.to_featureclass(location=r"/path/to/your/data/directory/sdf_head_output.shp")
'/path/to/your/data/directory/sdf_head_output.shp'
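When exporting a subset, the requested names need to exist in the data frame, and the geometry column has to stay in the selection (the example keeps 'SHAPE' in its list). A defensive sketch with a hypothetical helper `export_columns` and a made-up column listing:

```python
def export_columns(wanted, available, geometry="SHAPE"):
    """Hypothetical helper: keep requested columns that exist, plus the geometry column."""
    available = list(available)
    if geometry not in available:
        raise ValueError(f"geometry column {geometry!r} not found")
    missing = [c for c in wanted if c not in available]
    if missing:
        print("skipping missing columns:", missing)
    kept = [c for c in wanted if c in available and c != geometry]
    # Always place the geometry column last so it is never dropped.
    return kept + [geometry]


# Made-up column listing for illustration only.
available = ["NAME", "ST", "STFIPS", "CAPITAL", "POP2010", "SHAPE"]
columns = export_columns(["NAME", "ST", "POP2000", "SHAPE"], available)
print(columns)
# e.g. sdf[columns].head().spatial.to_featureclass(location=...)
```

Calling it as `export_columns(wanted, sdf.columns)` would avoid a KeyError when a field name from another dataset is left in the list.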

Example: Export dataset to a featureclass in FileGDB:

sdf.spatial.to_featureclass(location=r"c:\output_examples\census.gdb\cities");

Publish as a Feature Layer

The SEDF allows for the publishing of datasets as feature layers.

Example: Publishing as a feature layer:

lyr = sdf.spatial.to_featurelayer('census_cities', folder='census')
lyr
census_cities
Feature Layer Collection by api_data_owner
Last Modified: September 11, 2019
0 comments, 0 views
