The SpatialDataFrame is deprecated as of version 1.5. Please use the Spatially Enabled DataFrame instead. See this guide for more information.

Introduction to the Spatial DataFrame

The Spatial DataFrame (SDF) is a simple, intuitive object for manipulating geometric and attribute data. It extends the popular Pandas DataFrame structure with spatial abilities, so you can use intuitive, Pandas-style operations on both the attribute and spatial columns. The SDF is therefore built on data structures inherently suited to data analysis, with natural operations for filtering and inspecting subsets of values, which are fundamental to statistical and geographic manipulations.

The dataframe reads from many sources, including shapefiles, Pandas DataFrames, feature classes, GeoJSON, and Feature Layers.

This document outlines some fundamentals of using the SpatialDataFrame object for working with GIS data.

In [1]:
from arcgis.features import SpatialDataFrame

Accessing GIS data

GIS users need to work with both published layers on remote servers (web layers) and local data, but they often lack a way to manipulate these datasets without permanently copying the data. The SpatialDataFrame solves this problem because it is an in-memory object that can read, write, and manipulate geospatial data.

The SDF integrates with Esri's ArcPy site-package as well as the open source pyshp, shapely and fiona packages. This means the ArcGIS API for Python SDF can use either ArcPy or the open source packages as its geometry engine, giving you options for working with geospatial data regardless of your platform. The SDF transforms data into the formats you need so you can use Python functionality to analyze and visualize geographic information.

Data can be read and scripted to automate workflows and just as easily visualized on maps in Jupyter notebooks. The SDF can export data as feature classes or publish them directly to servers for sharing according to your needs.

Let's explore some of the different options available with the versatile SpatialDataFrame object:

Reading Web Layers

Feature layers hosted on ArcGIS Online or ArcGIS Enterprise can be read into a Spatial DataFrame using the from_layer method. Once the layer is in an SDF object, you can create reports, manipulate the data, or convert it to whatever form suits its intended purpose.

Example: Retrieving an ArcGIS Online item, reading its first layer through the layers property, and inspecting the first 5 records

In [2]:
from arcgis import GIS
item = GIS().content.get("85d0ca4ea1ca4b9abf0c51b9bd34de2e")
flayer = item.layers[0]
sdf = SpatialDataFrame.from_layer(flayer)
sdf.head()
Out[2]:
AGE_10_14 AGE_15_19 AGE_20_24 AGE_25_34 AGE_35_44 AGE_45_54 AGE_55_64 AGE_5_9 AGE_65_74 AGE_75_84 ... PLACEFIPS POP2010 POPULATION POP_CLASS RENTER_OCC ST STFIPS VACANT WHITE SHAPE
0 2144 2314 2002 3531 3887 5643 6353 2067 5799 2850 ... 0408220 39540 40346 6 6563 AZ 04 6703 32367 {'x': -12751215.004681978, 'y': 4180278.406256...
1 876 867 574 1247 1560 2122 2342 733 2157 975 ... 0424895 14364 14847 6 1397 AZ 04 1389 12730 {'x': -12755627.731115643, 'y': 4164465.572856...
2 1000 1003 833 2311 2063 2374 3631 1068 6165 3776 ... 0425030 26265 26977 6 1963 AZ 04 9636 22995 {'x': -12734674.294574209, 'y': 3850472.723091...
3 2730 2850 2194 4674 5240 7438 8440 2499 8145 4608 ... 0439370 52527 55041 7 6765 AZ 04 9159 47335 {'x': -12725332.21151233, 'y': 4096532.0908223...
4 2732 2965 2024 3182 3512 3109 1632 2497 916 467 ... 0463470 25505 29767 6 1681 AZ 04 572 16120 {'x': -12770984.257542243, 'y': 3826624.133935...

5 rows × 51 columns
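Each SHAPE value in the output above is a point geometry stored as a dictionary with 'x' and 'y' keys. The magnitude of the coordinates suggests Web Mercator meters rather than degrees, although the output itself does not state the spatial reference, so treat that as an assumption. Under that assumption, a minimal sketch of converting one such point to longitude and latitude could look like this (the helper name is hypothetical and the coordinates are rounded from the first row for illustration):

```python
import math

# Radius of the sphere used by Web Mercator, in meters.
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_lonlat(x, y):
    """Convert Web Mercator meters to (longitude, latitude) in degrees.

    Assumes the input really is Web Mercator; the source output does not
    state the layer's spatial reference.
    """
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(math.atan(math.sinh(y / EARTH_RADIUS_M)))
    return lon, lat


# Point shaped like the SHAPE values above; coordinates are rounded
# from the first row and are illustrative only.
shape = {"x": -12751215.0, "y": 4180278.4}
lon, lat = web_mercator_to_lonlat(shape["x"], shape["y"])
print(round(lon, 2), round(lat, 2))
```

The result falls in western Arizona, which is consistent with the ST value of AZ on that row.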

Reading Feature Layer Data

As seen above, the SDF can consume a Feature Layer service accessible on the ArcGIS Online platform. Let's take a step-by-step approach to break down the notebook cell above and then extract a subset of records from the feature layer.

Example: Examining Feature Layer content

Use the from_layer method on the SDF to instantiate a data frame from an item's layer and inspect the first 5 records.

In [3]:
# Log into ArcGIS anonymously
g = GIS()
# Retrieve an item from ArcGIS Online from a known ID value
known_item = g.content.get("85d0ca4ea1ca4b9abf0c51b9bd34de2e")
known_item
Out[3]:
USA Major Cities
This layer presents the locations of cities within the United States with populations of approximately 10,000 or greater, all state capitals, and the national capital.
Feature Layer Collection by esri_dm
Last Modified: December 21, 2017
3 comments, 264,016 views
In [4]:
# Obtain the first feature layer from the item
fl = known_item.layers[0]

# Use the `from_layer` method of the Spatial DataFrame to create a new Spatial DataFrame
sdf = SpatialDataFrame.from_layer(fl)

# Return the first 5 records. 
sdf.head()
Out[4]:
AGE_10_14 AGE_15_19 AGE_20_24 AGE_25_34 AGE_35_44 AGE_45_54 AGE_55_64 AGE_5_9 AGE_65_74 AGE_75_84 ... PLACEFIPS POP2010 POPULATION POP_CLASS RENTER_OCC ST STFIPS VACANT WHITE SHAPE
0 2144 2314 2002 3531 3887 5643 6353 2067 5799 2850 ... 0408220 39540 40346 6 6563 AZ 04 6703 32367 {'x': -12751215.004681978, 'y': 4180278.406256...
1 876 867 574 1247 1560 2122 2342 733 2157 975 ... 0424895 14364 14847 6 1397 AZ 04 1389 12730 {'x': -12755627.731115643, 'y': 4164465.572856...
2 1000 1003 833 2311 2063 2374 3631 1068 6165 3776 ... 0425030 26265 26977 6 1963 AZ 04 9636 22995 {'x': -12734674.294574209, 'y': 3850472.723091...
3 2730 2850 2194 4674 5240 7438 8440 2499 8145 4608 ... 0439370 52527 55041 7 6765 AZ 04 9159 47335 {'x': -12725332.21151233, 'y': 4096532.0908223...
4 2732 2965 2024 3182 3512 3109 1632 2497 916 467 ... 0463470 25505 29767 6 1681 AZ 04 572 16120 {'x': -12770984.257542243, 'y': 3826624.133935...

5 rows × 51 columns

NOTE: See Pandas DataFrame head() method documentation for details.

You can also use SQL queries to return a subset of records through the ArcGIS API for Python's FeatureLayer object itself. The result of the FeatureLayer.query() method exposes a df property that returns a data frame; use the data frame's head() method to return the first 5 records, and column selection to return a subset of columns:

Example: Feature Layer Query Results to a Spatial DataFrame

We'll use the AGE_45_54 column to query the feature layer and return a new DataFrame with a subset of records. We can use the built-in zip() function to print the data frame's attribute field names, and then use data frame syntax to view specific attribute fields in the output:

In [5]:
# Filter feature layer records with a SQL query.
# See https://developers.arcgis.com/rest/services-reference/query-feature-service-layer-.htm

df = fl.query(where="AGE_45_54 < 1500").df
In [6]:
for a,b,c,d in zip(df.columns[::4], df.columns[1::4],df.columns[2::4], df.columns[3::4]):
    print("{:<30}{:<30}{:<30}{:<}".format(a,b,c,d))
AGE_10_14                     AGE_15_19                     AGE_20_24                     AGE_25_34
AGE_35_44                     AGE_45_54                     AGE_55_64                     AGE_5_9
AGE_65_74                     AGE_75_84                     AGE_85_UP                     AGE_UNDER5
AMERI_ES                      ASIAN                         AVE_FAM_SZ                    AVE_HH_SZ
BLACK                         CAPITAL                       CLASS                         FAMILIES
FEMALES                       FHH_CHILD                     FID                           HAWN_PI
HISPANIC                      HOUSEHOLDS                    HSEHLD_1_F                    HSEHLD_1_M
HSE_UNITS                     MALES                         MARHH_CHD                     MARHH_NO_C
MED_AGE                       MED_AGE_F                     MED_AGE_M                     MHH_CHILD
MULT_RACE                     NAME                          OBJECTID                      OTHER
OWNER_OCC                     PLACEFIPS                     POP2010                       POPULATION
POP_CLASS                     RENTER_OCC                    ST                            STFIPS
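The listing above shows 48 names, while the layer's data frame reported 51 columns earlier. A likely reason is that zip() stops at the shortest of its inputs, so any names left over after the last complete group of four are not printed. A minimal sketch of a helper that keeps the leftover names is shown below; print_in_columns and the sample list are hypothetical stand-ins, and in practice you would pass list(df.columns):

```python
def print_in_columns(names, per_row=4, width=30):
    """Print names in fixed-width rows, keeping any leftover names."""
    rows = []
    for start in range(0, len(names), per_row):
        chunk = names[start:start + per_row]
        # Left-align each name in a fixed-width cell, then trim the row end.
        row = "".join("{:<{w}}".format(str(n), w=width) for n in chunk)
        rows.append(row.rstrip())
    print("\n".join(rows))
    return rows


# Stand-in for list(df.columns); five names give one full row
# plus one leftover name on a second row.
sample = ["NAME", "AGE_45_54", "POP2010", "ST", "SHAPE"]
rows = print_in_columns(sample)
```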
In [7]:
# Return a subset of columns on just the first 5 records
df[['NAME', 'AGE_45_54', 'POP2010']].head()
Out[7]:
NAME AGE_45_54 POP2010
0 Somerton 1411 14287
1 Anderson 1333 9932
2 Camp Pendleton South 127 10616
3 Citrus 1443 10866
4 Commerce 1478 12823
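The where argument used in the query above is a SQL expression passed as a plain string. When the filter values come from variables, it can help to assemble that string in one place. The following is a minimal sketch only: build_where and sql_literal are hypothetical helpers, not part of the ArcGIS API for Python, and the second condition on the ST field is an illustrative addition.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (strings are single-quoted)."""
    if isinstance(value, str):
        # Escape embedded single quotes by doubling them.
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def build_where(conditions):
    """Join (field, operator, value) triples into one AND-ed expression."""
    return " AND ".join(
        "{} {} {}".format(field, op, sql_literal(value))
        for field, op, value in conditions
    )


where = build_where([("AGE_45_54", "<", 1500), ("ST", "=", "AZ")])
print(where)
# The string could then be used as: fl.query(where=where).df
```

Field names and operators are inserted as given, so this sketch is only suitable when those parts come from trusted code rather than user input.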

Accessing local GIS data

The SDF can also access local geospatial data. Depending upon what Python modules you have installed, you'll have access to a wide range of functionality:

Example: Reading a Shapefile

To read a shapefile with the from_featureclass() method from a Python interpreter that does not have access to ArcPy, you must first authenticate to ArcGIS Online or ArcGIS Enterprise.

g2 = GIS("https://www.arcgis.com", "username", "password")

In [37]:
g2 = GIS("https://python.playground.esri.com/portal", "arcgis_python", "amazing_arcgis_123")
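The connection calls above pass the user name and password as string literals. In scripts you share, one common alternative is to read them from environment variables. The sketch below assumes hypothetical variable names ARCGIS_USERNAME and ARCGIS_PASSWORD and a hypothetical load_credentials helper; the GIS call is left as a comment so the block runs without the arcgis package.

```python
import os


def load_credentials(env=None):
    """Return (username, password) from a mapping of environment variables."""
    env = os.environ if env is None else env
    try:
        return env["ARCGIS_USERNAME"], env["ARCGIS_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError("Set {} before connecting".format(missing)) from None


# Demonstration with an in-memory mapping instead of the real environment.
username, password = load_credentials(
    {"ARCGIS_USERNAME": "username", "ARCGIS_PASSWORD": "password"}
)
# g2 = GIS("https://www.arcgis.com", username, password)
```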
In [12]:
sdf = SpatialDataFrame.from_featureclass(r"path\to\your\data\census_example\cities.shp")
sdf.tail()
Out[12]:
index AGE_18_21 AGE_22_29 AGE_30_39 AGE_40_49 AGE_50_64 AGE_5_17 AGE_65_UP AGE_UNDER5 AMERI_ES ... PLACEFIP POP2000 POP2007 POP_CLASS RENTER_OCC SHAPE ST STFIPS VACANT WHITE
3552 3552 2037 4633 7435 7281 7553 7921 9203 2625 225 ... 22960 48688 49054 6 8434 {'x': -71.3608270663031, 'y': 41.80150017826884} RI 44 779 42111
3553 3553 3642 8337 11883 10437 9680 13233 10828 4918 217 ... 54640 72958 74309 7 16716 {'x': -71.3759815680945, 'y': 41.87550016490559} RI 44 1772 55004
3554 3554 4629 10985 13797 11979 12797 16333 15572 5846 172 ... 23000 91938 92864 7 25238 {'x': -71.1469910908576, 'y': 41.69810015677676} MA 25 3098 83815
3555 3555 704 1253 2560 2716 3448 2927 3835 791 22 ... 62465 18234 18531 6 1264 {'x': -71.15319106847441, 'y': 41.748500174901... MA 25 156 17909
3556 3556 4903 10818 13498 12598 12976 17055 15648 6272 579 ... 45000 93768 96834 7 21467 {'x': -70.93370908847608, 'y': 41.65180015540609} MA 25 3333 73950

5 rows × 48 columns
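Windows-style paths that contain backslashes are safest written as raw strings, because Python treats sequences such as \t inside an ordinary string literal as escape characters. The short sketch below uses a shortened placeholder path to show the difference:

```python
from pathlib import PureWindowsPath

plain = "path\to"   # "\t" is interpreted as a single tab character
raw = r"path\to"    # the raw string keeps the backslash and the "t"

print(repr(plain))
print(repr(raw))

# PureWindowsPath parses Windows separators on any operating system.
parts = PureWindowsPath(raw).parts
print(parts)
```

Forward slashes are another option, since they need no escaping in string literals.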

Saving Spatial DataFrames

The SDF can export data to various data formats for use in other applications.

Export Options

Export to Feature Class

The SDF allows for the export of whole datasets or partial datasets.

Example: Export a whole dataset to a shapefile:

In [13]:
sdf.to_featureclass(out_location=r"path\to\your\data\output_example",
                   out_name="output_cities.shp")
Out[13]:
'/path/to/your/data/output_example/output_cities.shp'

On macOS and Linux machines, and on Windows machines whose Python interpreter does not have access to ArcPy, the ArcGIS API for Python can only write to shapefile format with the to_featureclass method. Writing to file geodatabases requires the ArcPy site-package.
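Because the available output formats depend on whether ArcPy is importable, a script can check for it before choosing an output name. This is a minimal sketch under stated assumptions: arcpy_available and output_name are hypothetical helpers, and the convention that a name without an extension targets a geodatabase feature class is an assumption, not something this guide confirms.

```python
import importlib.util


def arcpy_available():
    """Return True if the arcpy package can be found by this interpreter."""
    return importlib.util.find_spec("arcpy") is not None


def output_name(base_name, has_arcpy):
    """Pick an output name for to_featureclass (hypothetical convention)."""
    # Without ArcPy only shapefile output is available, so add ".shp".
    return base_name if has_arcpy else base_name + ".shp"


name = output_name("output_cities", arcpy_available())
print(name)
```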

Example: Export dataset with a subset of columns and top 5 records to a shapefile:

In [14]:
for a,b,c,d in zip(sdf.columns[::4], sdf.columns[1::4], sdf.columns[2::4], sdf.columns[3::4]):
    print("{:<30}{:<30}{:<30}{:<}".format(a,b,c,d))
index                         AGE_18_21                     AGE_22_29                     AGE_30_39
AGE_40_49                     AGE_50_64                     AGE_5_17                      AGE_65_UP
AGE_UNDER5                    AMERI_ES                      AREALAND                      AREAWATER
ASIAN                         AVE_FAM_SZ                    AVE_HH_SZ                     BLACK
CAPITAL                       CLASS                         FAMILIES                      FEMALES
FHH_CHILD                     HAWN_PI                       HISPANIC                      HOUSEHOLDS
HSEHLD_1_F                    HSEHLD_1_M                    HSE_UNITS                     MALES
MARHH_CHD                     MARHH_NO_C                    MED_AGE                       MED_AGE_F
MED_AGE_M                     MHH_CHILD                     MULT_RACE                     NAME
OTHER                         OWNER_OCC                     PLACEFIP                      POP2000
POP2007                       POP_CLASS                     RENTER_OCC                    SHAPE
ST                            STFIPS                        VACANT                        WHITE
In [15]:
columns = ['NAME', 'ST', 'CAPITAL', 'STFIPS', 'POP2000', 'POP2007', 'SHAPE']
sdf[columns].head().to_featureclass(out_location=r"/path/to/your/data/directory",
                                    out_name="sdf_head_output.shp")
Out[15]:
'/path/to/your/data/directory/sdf_head_output.shp'
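When exporting a subset of columns to a shapefile, keep in mind that the shapefile attribute table (dBASE format) limits field names to 10 characters. The column names in this example all fit, but a quick check before export can catch names that would not. The sketch below uses a hypothetical helper, long_field_names, and a plain list standing in for the data frame's columns:

```python
SHAPEFILE_FIELD_NAME_LIMIT = 10  # dBASE field-name limit used by shapefiles


def long_field_names(columns, geometry_column="SHAPE"):
    """Return attribute column names longer than the shapefile limit."""
    return [
        c for c in columns
        if c != geometry_column and len(c) > SHAPEFILE_FIELD_NAME_LIMIT
    ]


columns = ['NAME', 'ST', 'CAPITAL', 'STFIPS', 'POP2000', 'POP2007', 'SHAPE']
too_long = long_field_names(columns)
print(too_long)
```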
