
ArcGIS API for Python

Introduction to the Spatially Enabled DataFrame

The Spatially Enabled DataFrame (SEDF) provides a simple, intuitive object for manipulating geometric and attribute data.

New at version 1.5, the Spatially Enabled DataFrame is an evolution of the SpatialDataFrame (SDF) object that you may be familiar with. While the SDF object is still available for use, the team has stopped active development of it and is promoting the use of the new Spatially Enabled DataFrame pattern. The SEDF provides better memory management and the ability to handle larger datasets, and it follows the pattern that Pandas advocates as the path forward.

The Spatially Enabled DataFrame inserts a custom namespace called spatial into the popular Pandas DataFrame structure to give it spatial abilities. This allows you to use intuitive, pandorable operations on both the attribute and spatial columns. Thus, the SEDF is based on data structures inherently suited to data analysis, with natural operations for filtering and inspecting subsets of values, which are fundamental to statistical and geographic manipulations.

The dataframe reads from many sources, including shapefiles, Pandas DataFrames, feature classes, GeoJSON, and Feature Layers.

In [7]:
import pandas as pd
from arcgis.features import GeoAccessor, GeoSeriesAccessor

Accessing GIS data

GIS users need to work with both published layers on remote servers (web layers) and local data, but the ability to manipulate these datasets without permanently copying the data is lacking. The Spatially Enabled DataFrame solves this problem because it is an in-memory object that can read, write and manipulate geospatial data.

The SEDF integrates with Esri's ArcPy site-package as well as the open source pyshp, shapely and fiona packages. This means the SEDF can use either ArcPy or these open source packages as its geometry engine, giving you options for working with geospatial data regardless of your platform. The SEDF transforms data into the formats you desire so you can use Python functionality to analyze and visualize geographic information.

Data can be read and scripted to automate workflows and just as easily visualized on maps in Jupyter notebooks. The SEDF can export data as feature classes or publish them directly to servers for sharing according to your needs. Let's explore some of the different options available with the versatile Spatially Enabled DataFrame namespaces:

Reading Web Layers

Feature layers hosted on ArcGIS Online or ArcGIS Enterprise can be easily read into a Spatially Enabled DataFrame using the from_layer method. Once you read it into a SEDF object, you can create reports, manipulate the data, or convert it to a form that is comfortable and makes sense for its intended purpose.

Example: Retrieving an ArcGIS Online item and using the layers property to inspect the first 5 records of the layer

In [6]:
from arcgis import GIS
gis = GIS()
item = gis.content.get("85d0ca4ea1ca4b9abf0c51b9bd34de2e")
flayer = item.layers[0]

# create a Spatially Enabled DataFrame object
sdf = pd.DataFrame.spatial.from_layer(flayer)
sdf.head()
Out[6]:
AGE_10_14 AGE_15_19 AGE_20_24 AGE_25_34 AGE_35_44 AGE_45_54 AGE_55_64 AGE_5_9 AGE_65_74 AGE_75_84 ... PLACEFIPS POP2010 POPULATION POP_CLASS RENTER_OCC SHAPE ST STFIPS VACANT WHITE
0 2144 2314 2002 3531 3887 5643 6353 2067 5799 2850 ... 0408220 39540 40346 6 6563 {"x": -12751215.004681978, "y": 4180278.406256... AZ 04 6703 32367
1 876 867 574 1247 1560 2122 2342 733 2157 975 ... 0424895 14364 14847 6 1397 {"x": -12755627.731115643, "y": 4164465.572856... AZ 04 1389 12730
2 1000 1003 833 2311 2063 2374 3631 1068 6165 3776 ... 0425030 26265 26977 6 1963 {"x": -12734674.294574209, "y": 3850472.723091... AZ 04 9636 22995
3 2730 2850 2194 4674 5240 7438 8440 2499 8145 4608 ... 0439370 52527 55041 7 6765 {"x": -12725332.21151233, "y": 4096532.0908223... AZ 04 9159 47335
4 2732 2965 2024 3182 3512 3109 1632 2497 916 467 ... 0463470 25505 29767 6 1681 {"x": -12770984.257542243, "y": 3826624.133935... AZ 04 572 16120

5 rows × 51 columns

When you inspect the type of the object, you get back a standard pandas DataFrame object. However, this object now has an additional SHAPE column that allows you to perform geometric operations. In other words, this DataFrame is now geo-aware.

In [7]:
type(sdf)
Out[7]:
pandas.core.frame.DataFrame

Further, the DataFrame has a new spatial property that provides a list of geoprocessing operations that can be performed on the object. The rest of the guides in this section go into detail on how to use this functionality.
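
The way a namespace can sit on an otherwise ordinary DataFrame can be illustrated with pandas' public accessor registration hook. The sketch below is a toy and makes no claim about how the spatial namespace is implemented internally: the accessor name toy, the class ToyPointAccessor, and its bbox method are hypothetical names invented for this illustration, and the SHAPE values are plain dictionaries rather than real geometry objects.

```python
import pandas as pd


# Hypothetical accessor name "toy", chosen so it cannot clash with "spatial".
@pd.api.extensions.register_dataframe_accessor("toy")
class ToyPointAccessor:
    """Toy namespace that reads point dictionaries from a SHAPE column."""

    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor was reached from.
        self._df = pandas_obj

    def bbox(self):
        """Return (xmin, ymin, xmax, ymax) over all points in SHAPE."""
        xs = [pt["x"] for pt in self._df["SHAPE"]]
        ys = [pt["y"] for pt in self._df["SHAPE"]]
        return (min(xs), min(ys), max(xs), max(ys))


df = pd.DataFrame({
    "NAME": ["A", "B", "C"],
    "SHAPE": [{"x": 0, "y": 5}, {"x": 3, "y": 1}, {"x": -2, "y": 4}],
})

print(type(df))       # the object is still a plain pandas DataFrame
print(df.toy.bbox())  # extra behaviour is reached through the namespace
```

The point of the pattern is that the DataFrame class itself is unchanged, which is consistent with type(sdf) reporting pandas.core.frame.DataFrame above.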

Reading Feature Layer Data

As seen above, the SEDF can consume a Feature Layer served from either ArcGIS Online or ArcGIS Enterprise orgs. Let's take a step-by-step approach to break down the notebook cell above and then extract a subset of records from the feature layer.

Example: Examining Feature Layer content

Use the from_layer method on the SEDF to instantiate a data frame from an item's layer and inspect the first 5 records.

In [17]:
# Retrieve an item from ArcGIS Online from a known ID value
known_item = gis.content.get("85d0ca4ea1ca4b9abf0c51b9bd34de2e")
known_item
Out[17]:
USA Major Cities
This layer presents the locations of cities within the United States with populations of approximately 10,000 or greater, all state capitals, and the national capital.
Feature Layer Collection by esri_dm
Last Modified: August 22, 2019
0 comments, 1,839,428 views
In [10]:
# Obtain the first feature layer from the item
fl = known_item.layers[0]

# Use the `from_layer` static method in the 'spatial' namespace on the Pandas' DataFrame
sdf = pd.DataFrame.spatial.from_layer(fl)

# Return the first 5 records. 
sdf.head()
Out[10]:
AGE_10_14 AGE_15_19 AGE_20_24 AGE_25_34 AGE_35_44 AGE_45_54 AGE_55_64 AGE_5_9 AGE_65_74 AGE_75_84 ... PLACEFIPS POP2010 POPULATION POP_CLASS RENTER_OCC SHAPE ST STFIPS VACANT WHITE
0 2144 2314 2002 3531 3887 5643 6353 2067 5799 2850 ... 0408220 39540 40346 6 6563 {"x": -12751215.004681978, "y": 4180278.406256... AZ 04 6703 32367
1 876 867 574 1247 1560 2122 2342 733 2157 975 ... 0424895 14364 14847 6 1397 {"x": -12755627.731115643, "y": 4164465.572856... AZ 04 1389 12730
2 1000 1003 833 2311 2063 2374 3631 1068 6165 3776 ... 0425030 26265 26977 6 1963 {"x": -12734674.294574209, "y": 3850472.723091... AZ 04 9636 22995
3 2730 2850 2194 4674 5240 7438 8440 2499 8145 4608 ... 0439370 52527 55041 7 6765 {"x": -12725332.21151233, "y": 4096532.0908223... AZ 04 9159 47335
4 2732 2965 2024 3182 3512 3109 1632 2497 916 467 ... 0463470 25505 29767 6 1681 {"x": -12770984.257542243, "y": 3826624.133935... AZ 04 572 16120

5 rows × 51 columns

NOTE: See Pandas DataFrame head() method documentation for details.

You can also use SQL queries to return a subset of records by leveraging the ArcGIS API for Python's Feature Layer object itself. When you run a query() on a FeatureLayer, you get back a FeatureSet object. Calling the sdf property of the FeatureSet returns a Spatially Enabled DataFrame object. We then use the DataFrame's head() method to return the first 5 records and a subset of columns from the DataFrame:

Example: Feature Layer Query Results to a Spatially Enabled DataFrame

We'll use the AGE_45_54 column to query the feature layer and return a new DataFrame with a subset of records. We can then use the built-in zip() function to print the DataFrame's attribute field names, and use DataFrame syntax to view specific attribute fields in the output:

In [5]:
# Filter feature layer records with a SQL where clause.
# See https://developers.arcgis.com/rest/services-reference/query-feature-service-layer-.htm

df = fl.query(where="AGE_45_54 < 1500").sdf
In [6]:
for a,b,c,d in zip(df.columns[::4], df.columns[1::4],df.columns[2::4], df.columns[3::4]):
    print("{:<30}{:<30}{:<30}{:<}".format(a,b,c,d))
AGE_10_14                     AGE_15_19                     AGE_20_24                     AGE_25_34
AGE_35_44                     AGE_45_54                     AGE_55_64                     AGE_5_9
AGE_65_74                     AGE_75_84                     AGE_85_UP                     AGE_UNDER5
AMERI_ES                      ASIAN                         AVE_FAM_SZ                    AVE_HH_SZ
BLACK                         CAPITAL                       CLASS                         FAMILIES
FEMALES                       FHH_CHILD                     FID                           HAWN_PI
HISPANIC                      HOUSEHOLDS                    HSEHLD_1_F                    HSEHLD_1_M
HSE_UNITS                     MALES                         MARHH_CHD                     MARHH_NO_C
MED_AGE                       MED_AGE_F                     MED_AGE_M                     MHH_CHILD
MULT_RACE                     NAME                          OBJECTID                      OTHER
OWNER_OCC                     PLACEFIPS                     POP2010                       POPULATION
POP_CLASS                     RENTER_OCC                    SHAPE                         ST
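
Note that the listing above stops at ST: zip() ends at the shortest of its inputs, so when the number of columns is not a multiple of 4 the leftover names are not printed (the head() output earlier shows STFIPS, VACANT and WHITE after ST). A minimal sketch of a variant that keeps the leftovers, using itertools.zip_longest from the standard library, follows. The function name format_columns and the sample names C0 to C9 are made up for the illustration.

```python
from itertools import zip_longest


def format_columns(names, per_row=4, width=30):
    """Lay out names in rows of `per_row`, keeping any leftover names."""
    names = list(names)
    # One strided slice per output column, as in the zip() cell above.
    slices = [names[i::per_row] for i in range(per_row)]
    lines = []
    for row in zip_longest(*slices, fillvalue=""):
        padded = "".join("{:<{w}}".format(name, w=width) for name in row)
        lines.append(padded.rstrip())
    return lines


cols = ["C{}".format(i) for i in range(10)]  # 10 names, not a multiple of 4

# Plain zip() silently drops the last two names.
plain = list(zip(cols[::4], cols[1::4], cols[2::4], cols[3::4]))
print(len(plain) * 4)

# zip_longest() pads the short slices, so every name is printed.
lines = format_columns(cols)
for line in lines:
    print(line)
```

With a DataFrame, the same helper could be called as format_columns(df.columns).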
In [7]:
# Return a subset of columns on just the first 5 records
df[['NAME', 'AGE_45_54', 'POP2010']].head()
Out[7]:
NAME AGE_45_54 POP2010
0 Somerton 1411 14287
1 Anderson 1333 9932
2 Camp Pendleton South 127 10616
3 Citrus 1443 10866
4 Commerce 1478 12823

Accessing local GIS data

The SEDF can also access local geospatial data. Depending upon what Python modules you have installed, you'll have access to a wide range of functionality:

Example: Reading a Shapefile

If your Python interpreter does not have access to ArcPy, you must authenticate to ArcGIS Online or ArcGIS Enterprise before using the from_featureclass() method to read a shapefile.

g2 = GIS("https://www.arcgis.com", "username", "password")

In [4]:
g2 = GIS("https://python.playground.esri.com/portal", "arcgis_python", "amazing_arcgis_123")
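
Typing a password directly into a notebook cell leaves it in the saved file. One common alternative, sketched here under assumptions, is to read the connection details from environment variables. The variable names ARCGIS_URL, ARCGIS_USERNAME and ARCGIS_PASSWORD and the helper portal_credentials are hypothetical choices for this example, not settings recognized by the ArcGIS API for Python.

```python
import os


def portal_credentials(env=None):
    """Return (url, username, password) from an environment-style mapping."""
    env = os.environ if env is None else env
    url = env.get("ARCGIS_URL", "https://www.arcgis.com")
    required = ("ARCGIS_USERNAME", "ARCGIS_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return url, env["ARCGIS_USERNAME"], env["ARCGIS_PASSWORD"]


# Demonstration with an in-memory mapping instead of the real environment.
demo = {"ARCGIS_USERNAME": "username", "ARCGIS_PASSWORD": "password"}
url, username, password = portal_credentials(demo)
print(url, username)

# With the arcgis package installed, the values would be passed on as:
# g2 = GIS(url, username, password)
```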
In [8]:
sdf = pd.DataFrame.spatial.from_featureclass(r"path\to\your\data\census_example\cities.shp")
sdf.tail()
Out[8]:
FID NAME CLASS ST STFIPS PLACEFIP CAPITAL AREALAND AREAWATER POP_CLASS ... MARHH_NO_C MHH_CHILD FHH_CHILD FAMILIES AVE_FAM_SZ HSE_UNITS VACANT OWNER_OCC RENTER_OCC SHAPE
3552 3552 East Providence City RI 44 22960 13.405 3.208 6 ... 5658 306 1414 12850 2.99 21309 779 12096 8434 {'x': -71.3608270663031, 'y': 41.8015001782688...
3553 3553 Pawtucket City RI 44 54640 8.736 0.259 7 ... 6740 754 3242 18520 3.07 31819 1772 13331 16716 {'x': -71.3759815680945, 'y': 41.8755001649055...
3554 3554 Fall River City MA 25 23000 31.022 7.202 7 ... 9011 759 4247 23558 3.00 41857 3098 13521 25238 {'x': -71.1469910908576, 'y': 41.6981001567767...
3555 3555 Somerset Census Designated Place MA 25 62465 8.109 3.867 6 ... 2771 91 287 5260 2.98 7143 156 5723 1264 {'x': -71.15319106847441, 'y': 41.748500174901...
3556 3556 New Bedford City MA 25 45000 20.122 3.904 7 ... 8813 910 4701 24083 3.01 41511 3333 16711 21467 {'x': -70.93370908847608, 'y': 41.651800155406...

5 rows × 48 columns
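
Windows-style paths such as the one in this example contain backslashes, which Python treats as escape characters in ordinary string literals. The short sketch below uses only the standard library to show the difference between a plain and a raw literal, and that forward slashes are an equivalent way to write the same Windows path. The variable names are illustrative.

```python
from pathlib import PureWindowsPath

# In a plain literal, "\t" is read as a single tab character.
plain = "path\to"
# A raw literal keeps the backslash and the "t" as two characters.
raw = r"path\to"

print(len(plain), len(raw))
print("\t" in plain, "\t" in raw)

# Raw strings and forward slashes describe the same Windows path.
with_backslashes = PureWindowsPath(r"path\to\your\data")
with_slashes = PureWindowsPath("path/to/your/data")
print(with_backslashes.parts)
print(with_backslashes == with_slashes)
```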

Example: Reading a Featureclass from FileGDB

If your Python interpreter does not have access to ArcPy, you must have fiona installed to read a feature class from a file geodatabase with the from_featureclass() method.

In [3]:
sdf = pd.DataFrame.spatial.from_featureclass(r"path\to\your\data\census_example\census.gdb\cities")
sdf.head()
Out[3]:
OBJECTID FID NAME CLASS ST STFIPS PLACEFIP CAPITAL AREALAND AREAWATER ... MARHH_NO_C MHH_CHILD FHH_CHILD FAMILIES AVE_FAM_SZ HSE_UNITS VACANT OWNER_OCC RENTER_OCC SHAPE
0 1 0 College Census Designated Place AK 02 16750 18.670 0.407 ... 936 152 339 2640 3.13 4501 397 2395 1709 {'x': -147.82719115699996, 'y': 64.84830019400...
1 2 1 Fairbanks City AK 02 24230 31.857 0.815 ... 2259 395 1058 7187 3.15 12357 1282 3863 7212 {'x': -147.72638162999996, 'y': 64.83809069700...
2 3 2 Kalispell City MT 30 40075 5.458 0.004 ... 1433 147 480 3494 2.92 6532 390 3458 2684 {'x': -114.31606412399998, 'y': 48.19780017900...
3 4 3 Post Falls City ID 16 64810 9.656 0.045 ... 1851 205 467 4670 3.13 6697 328 4611 1758 {'x': -116.93792709799999, 'y': 47.71555468000...
4 5 4 Dishman Census Designated Place WA 53 17985 3.378 0.000 ... 1096 131 345 2564 2.96 4408 257 2635 1516 {'x': -117.27780913799995, 'y': 47.65654568400...

5 rows × 49 columns

Saving Spatially Enabled DataFrames

The SEDF can export data to various data formats for use in other applications.

Export Options

Export to Feature Class

The SEDF allows for the export of whole datasets or partial datasets.

Example: Export a whole dataset to a shapefile:

In [18]:
sdf.spatial.to_featureclass(location=r"c:\output_examples\census.shp")
Out[18]:
'c:\\output_examples\\census.shp'

On macOS and Linux, and on Windows machines whose Python interpreter does not have access to ArcPy, the to_featureclass method can only write to shapefile format. Writing to file geodatabases requires the ArcPy site-package.
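
Because the available output formats depend on whether ArcPy is present, a script can check for it before choosing where to write. This is a minimal sketch using the standard library's importlib; the helper names has_module and writable_formats are hypothetical, and the format labels simply restate the rule described above.

```python
from importlib.util import find_spec


def has_module(name):
    """Return True if a top-level module can be found without importing it."""
    return find_spec(name) is not None


def writable_formats():
    """List the formats to_featureclass is described as supporting here."""
    if has_module("arcpy"):
        return ["shapefile", "file geodatabase"]
    return ["shapefile"]


print("arcpy available:", has_module("arcpy"))
print("fiona available:", has_module("fiona"))
print("can write:", writable_formats())
```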

Example: Export dataset with a subset of columns and top 5 records to a shapefile:

In [17]:
for a,b,c,d in zip(sdf.columns[::4], sdf.columns[1::4], sdf.columns[2::4], sdf.columns[3::4]):
    print("{:<30}{:<30}{:<30}{:<}".format(a,b,c,d))
PLACENS                       GEOID                         NAMELSAD                      CLASSFP
FUNCSTAT                      ALAND                         AWATER                        INTPTLAT
In [15]:
columns = ['NAME', 'ST', 'CAPITAL', 'STFIPS', 'POP2000', 'POP2007', 'SHAPE']
sdf[columns].head().spatial.to_featureclass(location=r"/path/to/your/data/directory/sdf_head_output.shp")
Out[15]:
'/path/to/your/data/directory/sdf_head_output.shp'
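
The column list in this example includes names such as POP2000 and POP2007 that do not appear in the names printed by the preceding cell, and selecting a column that is missing from a DataFrame raises a KeyError. A small pre-check, sketched below, can report all missing names at once and make sure the geometry column travels with the subset. The helper name export_columns is hypothetical, and it assumes the geometry column is called SHAPE as in the outputs above.

```python
def export_columns(available, wanted, geometry="SHAPE"):
    """Return `wanted` plus the geometry column, after checking they exist."""
    columns = list(wanted)
    if geometry not in columns:
        columns.append(geometry)
    known = set(available)
    missing = [name for name in columns if name not in known]
    if missing:
        raise KeyError("columns not in DataFrame: " + ", ".join(missing))
    return columns


# Illustrative column names; a real call would pass sdf.columns.
available = ["NAME", "ST", "CAPITAL", "STFIPS", "POP2010", "SHAPE"]
selected = export_columns(available, ["NAME", "ST"])
print(selected)
```

The result can then be used in place of the hand-written list, for example sdf[export_columns(sdf.columns, columns)].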

Example: Export dataset to a featureclass in FileGDB:

In [23]:
sdf.spatial.to_featureclass(location=r"c:\output_examples\census.gdb\cities");

Publish as a Feature Layer

The SEDF allows for the publishing of datasets as feature layers.

Example: Publishing as a feature layer:

In [7]:
lyr = sdf.spatial.to_featurelayer('census_cities', folder='census')
lyr
Out[7]:
census_cities
Feature Layer Collection by api_data_owner
Last Modified: September 11, 2019
0 comments, 0 views