ArcGIS API for Python

Part-3 Data IO with SeDF - Exporting Data

Introduction

In part-2 of this guide series, we saw how GIS data in various formats can be read into a Spatially enabled DataFrame (SeDF). In this part, we will look at how a SeDF can be used to export data to various spatial and non-spatial formats. We will also explore how local data can be overwritten using a SeDF. Let's explore some of the options available with the versatile Spatially enabled DataFrame.

The data used in this guide is provided as an item. We will start by importing some libraries, then download and extract the data needed for the analysis in this guide.

In [1]:
# Import Libraries
import pandas as pd
from arcgis.features import GeoAccessor, GeoSeriesAccessor
from arcgis.gis import GIS
from IPython.display import display
import zipfile
import os
import shutil
In [2]:
# Create a GIS connection
gis = GIS()
agol_gis = GIS("https://www.arcgis.com","arcgis_python","amazing_arcgis_123")
In [6]:
# Get the data item
data_item = gis.content.get('c7140ae3d7ae4fd0817181461019aa75')
data_item
Out[6]:
sedf_guide_data
Data for Spatially enabled DataFrame Guides
Shapefile by api_data_owner
Last Modified: November 11, 2021
0 comments, 3 views

The cell below downloads and extracts the data from the data item to your machine.

In [7]:
# Download and extract the data
def unzip_data():
    """
    This function:
    - creates a directory `sedf_data` to download the data from the item
    - downloads the item as `sedf_guide_data.zip` file in the sedf_data directory
    - unzips and extracts the data to 'sedf_data/cities'.
    """
    try:
        
        data_dir = os.path.join(os.getcwd(), 'sedf_data')    # path to downloaded data folder
        
        # remove the existing data directory, if any, then always recreate it
        if os.path.isdir(data_dir):
            shutil.rmtree(data_dir)
            print('Removed existing data directory')
        os.makedirs(data_dir)
            
        data_item.download(data_dir)    # download the data item
        zipped_file_path = os.path.join(data_dir, 'sedf_guide_data.zip')    # path to zipped file inside data folder

        # unzip the data (the context manager closes the archive automatically)
        with zipfile.ZipFile(zipped_file_path, 'r') as zip_ref:
            zip_ref.extractall(data_dir)
        
        cities_dir = os.path.join(data_dir, 'cities')    # path to new cities directory
        print(f'Dataset unzipped at: {os.path.relpath(cities_dir)}')
        
    except Exception as e:
        print(f'Error unzipping file: {e}')
        

# Extract data
unzip_data()
Dataset unzipped at: sedf_data\cities
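
The recreate-then-extract pattern used above can be sketched on its own with only the standard library. The helper name `fresh_extract` and the throwaway archive are illustrative assumptions, not part of the guide's data item. The point of the sketch is that the target folder is always recreated, so rerunning the cell starts from a clean state.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def fresh_extract(zip_path, target_dir):
    """Recreate target_dir from scratch and extract zip_path into it.

    Hypothetical helper for illustration; returns the extracted file
    paths relative to target_dir, sorted.
    """
    target = Path(target_dir)
    if target.is_dir():
        shutil.rmtree(target)       # drop stale contents from an earlier run
    target.mkdir(parents=True)      # always recreate the folder

    with zipfile.ZipFile(zip_path) as archive:   # closed automatically
        archive.extractall(target)

    return sorted(
        p.relative_to(target).as_posix() for p in target.rglob("*") if p.is_file()
    )


# Demo with a throwaway archive standing in for the downloaded zip file
work = Path(tempfile.mkdtemp())
demo_zip = work / "demo.zip"
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("cities/readme.txt", "placeholder")

extracted = fresh_extract(demo_zip, work / "sedf_data")
extracted_again = fresh_extract(demo_zip, work / "sedf_data")  # safe to rerun
print(extracted)
```

In this sketch the archive lives outside the target folder, so removing the folder never deletes the archive itself.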

Create a SeDF

Here, we will create a SeDF and then export the data to various data formats.

In [8]:
# Retrieve an item from ArcGIS Online using Item ID value
gis = GIS()
item = gis.content.get("85d0ca4ea1ca4b9abf0c51b9bd34de2e")
item
Out[8]:
USA Major Cities
This layer presents the locations of cities within the United States with populations of approximately 10,000 or greater, all state capitals, and the national capital.
Feature Layer Collection by esri_dm
Last Modified: May 19, 2020
1 comments, 33,763,272 views
In [9]:
# Obtain the first feature layer from the item
flayer = item.layers[0]

# Use the `from_layer` static method in the 'spatial' namespace on the Pandas' DataFrame
sdf = pd.DataFrame.spatial.from_layer(flayer)

# Check shape
sdf.shape
Out[9]:
(3886, 50)
In [10]:
# Check first few records
sdf.head()
Out[10]:
AGE_10_14 AGE_15_19 AGE_20_24 AGE_25_34 AGE_35_44 AGE_45_54 AGE_55_64 AGE_5_9 AGE_65_74 AGE_75_84 ... PLACEFIPS POP2010 POPULATION POP_CLASS RENTER_OCC SHAPE ST STFIPS VACANT WHITE
0 1313 1058 734 2031 1767 1446 1136 1503 665 486 ... 1601990 13816 15181 6 1271 {"x": -12462673.723706163, "y": 5384674.994080... ID 16 271 13002
1 890 817 818 1799 1235 1330 1143 1099 721 579 ... 1607840 11899 11946 6 1441 {"x": -12506251.313993266, "y": 5341537.793529... ID 16 318 9893
2 12750 13959 16966 32135 27048 29595 24177 12933 12176 7087 ... 1608830 205671 225405 8 33359 {"x": -12938676.6836459, "y": 5403597.04949123... ID 16 6996 182991
3 790 768 699 1445 1136 1134 935 959 679 464 ... 1611260 10345 10727 6 1461 {"x": -12667411.402393516, "y": 5241722.820606... ID 16 241 7984
4 3803 3779 3687 7571 5559 4744 3624 4397 2296 1222 ... 1612250 46237 53942 7 5196 {"x": -12989383.674504517, "y": 5413226.487333... ID 16 1428 35856

5 rows × 50 columns

In [11]:
# Check type of sdf
type(sdf)
Out[11]:
pandas.core.frame.DataFrame
In [12]:
# Access spatial namespace
sdf.spatial.geometry_type
Out[12]:
['point']

We can see that the dataset has 3886 records and 50 columns. The sdf object is a regular pandas DataFrame, and its spatial namespace reports point geometry, which shows that a Spatially enabled DataFrame has been created from all the data in the layer.

Writing GIS Data

The Spatially enabled DataFrame can export data to various data formats for use in other applications. Let's dive into the details of exporting GIS data to various sources.

Publish as a Feature Layer

Data in a Spatially enabled DataFrame can be exported to Feature layers hosted on ArcGIS Online or ArcGIS Enterprise using the to_featurelayer() method.

Let's export the sdf DataFrame, created above, to a feature layer stored in an ArcGIS Online organization.

In [13]:
# Export to feature layer
lyr = sdf.spatial.to_featurelayer('census_cities_export', gis=agol_gis)
lyr
Out[13]:
census_cities_export
Feature Layer Collection by arcgis_python
Last Modified: November 12, 2021
0 comments, 0 views
In [14]:
# Check type
type(lyr.layers[0])
Out[14]:
arcgis.features.layer.FeatureLayer

The census_cities_export feature layer has been created at the ArcGIS Online connection specified.

Write to JSON based formats

Data in a Spatially enabled DataFrame can be exported to JSON based formats, such as FeatureSet or FeatureCollection, using the to_featureset() and to_feature_collection() methods. Let's take a look.

Write to FeatureSet

The to_featureset() method can be used to export data from a SeDF into a FeatureSet.

In [15]:
# Write to FeatureSet
fset_exp = sdf.spatial.to_featureset()
In [16]:
# Check type
type(fset_exp)
Out[16]:
arcgis.features.feature.FeatureSet

A FeatureSet object has been created from the data in the SeDF.
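
To make "JSON based" concrete, here is a minimal sketch of the general shape of a point feature set in Esri JSON, built from plain dictionaries. The helper `to_esri_featureset`, the rounded coordinates, and the `wkid` value of 102100 (Web Mercator) are assumptions made for illustration. The object returned by to_featureset() carries more information, such as field definitions, than this sketch does.

```python
import json


def to_esri_featureset(records, x_field="x", y_field="y", wkid=102100):
    """Build an Esri-JSON-style point feature set from plain dicts (illustrative)."""
    features = []
    for record in records:
        # Everything except the coordinate fields becomes an attribute
        attributes = {k: v for k, v in record.items() if k not in (x_field, y_field)}
        features.append({
            "attributes": attributes,
            "geometry": {"x": record[x_field], "y": record[y_field]},
        })
    return {
        "geometryType": "esriGeometryPoint",
        "spatialReference": {"wkid": wkid},   # assumed spatial reference
        "features": features,
    }


# Coordinates are rounded, illustrative values, not the exact ones in the layer
rows = [
    {"NAME": "Ammon", "ST": "ID", "x": -12462673.7, "y": 5384675.0},
    {"NAME": "Blackfoot", "ST": "ID", "x": -12506251.3, "y": 5341537.8},
]
fset_dict = to_esri_featureset(rows)
fset_json = json.dumps(fset_dict, indent=2)
print(fset_json)
```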

Write to FeatureCollection

The to_feature_collection() method can be used to export data from a SeDF into a FeatureCollection.

In [17]:
# Write to FeatureCollection
fc_exp = sdf.spatial.to_feature_collection()
In [18]:
# Check type
type(fc_exp)
Out[18]:
arcgis.features.feature.FeatureCollection

A FeatureCollection object has been created from the data in the SeDF.

Write to a local file

Data in a Spatially enabled DataFrame can be exported to local spatial file formats, such as Feature classes or shapefiles, and non-spatial formats, such as csv files or tables. Let's take a look.

Write to local databases

The to_featureclass() method can be used to export spatial data from a SeDF into various local databases, such as a File geodatabase, a Mobile geodatabase (.geodatabase), or a SQLite Database.

File Geodatabase
Note: In the absence of arcpy, the Fiona package must be present in your current conda environment to perform this operation.
In [20]:
# Export to a feature class in File Geodatabase
sdf.spatial.to_featureclass(location="./sedf_data/cities/cities.gdb/major_cities_export")
Out[20]:
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.gdb\\major_cities_export'

A Feature Class has been created in a File Geodatabase from the data in the SeDF.

Mobile Geodatabase
Note: This operation can only be performed in an environment that contains arcpy.
In [21]:
# Export to a feature class in Mobile Geodatabase
sdf.spatial.to_featureclass(location="./sedf_data/cities/cities_mobile.geodatabase/major_cities_export")
Out[21]:
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities_mobile.geodatabase\\main.major_cities_export'

A Feature Class has been created in a Mobile Geodatabase from the data in the SeDF.

SQLite Database
Note: This operation can only be performed in an environment that contains arcpy.
In [22]:
# Export to a feature class in SQLite Database
sdf.spatial.to_featureclass(location="./sedf_data/cities/cities.sqlite/major_cities_export")
Out[22]:
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.sqlite\\major_cities_export'

A Feature Class has been created in a SQLite Database from the data in the SeDF.

Write to a shapefile

The to_featureclass() method can also be used to export spatial data from a SeDF into a shapefile.

Note: In the absence of arcpy, the Fiona package must be present in your current conda environment to perform this operation.
In [23]:
# Export to a shapefile
sdf.spatial.to_featureclass(location="./sedf_data/cities/major_cities_export.shp")
Out[23]:
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\major_cities_export.shp'

A Shapefile has been created from the data in the SeDF.
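
The four to_featureclass() calls above differ only in the location path. The sketch below classifies a path the same way a reader would, using the suffixes that appear in this guide. The helper `describe_target` is hypothetical and is not how the arcgis package itself chooses an output format.

```python
from pathlib import PurePosixPath

# Container suffixes seen in this guide, mapped to a descriptive label
_CONTAINERS = {
    ".gdb": "file geodatabase",
    ".geodatabase": "mobile geodatabase",
    ".sqlite": "SQLite database",
}


def describe_target(location):
    """Return (format label, dataset name) for an export path. Hypothetical helper."""
    path = PurePosixPath(location.replace("\\", "/"))   # accept either slash style
    if path.suffix.lower() == ".shp":
        return "shapefile", path.stem
    parent_suffix = path.parent.suffix.lower()
    if parent_suffix in _CONTAINERS:
        return _CONTAINERS[parent_suffix], path.name
    return "unknown", path.name


for target in (
    "./sedf_data/cities/cities.gdb/major_cities_export",
    "./sedf_data/cities/cities_mobile.geodatabase/major_cities_export",
    "./sedf_data/cities/cities.sqlite/major_cities_export",
    "./sedf_data/cities/major_cities_export.shp",
):
    print(describe_target(target))
```

A check like this can be useful before an export to confirm that the path points where you expect, since a typo in the container suffix changes what gets written.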

Write to Non-spatial formats

The to_table() method can be used to export data from a SeDF into non-spatial formats, such as csv files or tables.

Write to a csv file
In [24]:
# Export to a csv file
sdf.spatial.to_table(location="./sedf_data/cities/cities_table_export.csv")
Out[24]:
'./sedf_data/cities/cities_table_export.csv'

A csv file has been created from the data in the SeDF.

Write to a table in a File Geodatabase
Note: The operation below can only be performed in an environment that contains arcpy.
In [25]:
# Export to a table in a File Geodatabase
sdf.spatial.to_table(location="./sedf_data/cities/cities.gdb/cities_table_export")
Out[25]:
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.gdb\\cities_table_export'

A table has been created in a File Geodatabase from the data in the SeDF.

Overwriting GIS Data

The GIS data stored locally can be easily overwritten using the Spatially enabled DataFrame. Let's take a look.

Overwrite a Featureclass

Because overwrite=True is the default in the to_featureclass() method, exporting to the path of an existing feature class replaces it with the data in the SeDF.

The major_cities_export featureclass was created in a section above using sdf. We will overwrite this featureclass with a subset of the data from sdf.

In [26]:
# Subset the data
sub_df = sdf.iloc[:10,-13:].copy()
sub_df.shape
Out[26]:
(10, 13)
In [27]:
# Check head
sub_df.head(2)
Out[27]:
NAME OTHER OWNER_OCC PLACEFIPS POP2010 POPULATION POP_CLASS RENTER_OCC SHAPE ST STFIPS VACANT WHITE
0 Ammon 307 3205 1601990 13816 15181 6 1271 {"x": -12462673.723706163, "y": 5384674.994080... ID 16 271 13002
1 Blackfoot 1077 2788 1607840 11899 11946 6 1441 {"x": -12506251.313993266, "y": 5341537.793529... ID 16 318 9893
Note: In the absence of arcpy, the Fiona package must be present in your current conda environment to perform this operation.
In [28]:
# Export sub_df to the existing major_cities_export featureclass
sub_df.spatial.to_featureclass(location="./sedf_data/cities/cities.gdb/major_cities_export", overwrite=True)
Out[28]:
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.gdb\\major_cities_export'
In [29]:
# Check if the featureclass is updated
fc_new_df = pd.DataFrame.spatial.from_featureclass(location="./sedf_data/cities/cities.gdb/major_cities_export")
fc_new_df.shape
Out[29]:
(10, 14)

The featureclass has been overwritten with new data.

Overwrite a table

Because overwrite=True is the default in the to_table() method, exporting to the path of an existing non-spatial table replaces it with the data in the SeDF.

The cities_table_export table was created in a section above using sdf. We will overwrite this table with sub_df, the subset of the data defined above.

Table in a csv file

In [30]:
# Export sub_df to an existing cities_table_export.csv file
sub_df.spatial.to_table(location="./sedf_data/cities/cities_table_export.csv", overwrite=True)
Out[30]:
'./sedf_data/cities/cities_table_export.csv'
In [31]:
# Check if the csv file is updated
tbl_new_df = pd.DataFrame.spatial.from_table(filename="./sedf_data/cities/cities_table_export.csv")
tbl_new_df.shape
Out[31]:
(10, 14)

The csv file has been overwritten with new data.
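
The same before-and-after check can be done without reading the file back into a DataFrame. The following is a minimal sketch using only the standard library. The helper `csv_shape` and the demo file are assumptions for illustration, and the demo rows reuse a few values shown earlier in this guide.

```python
import csv
import tempfile
from pathlib import Path


def csv_shape(path):
    """Return (data rows, columns) of a CSV file, not counting the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return row_count, len(header)


# Demo file in a temporary folder, standing in for cities_table_export.csv
demo_csv = Path(tempfile.mkdtemp()) / "cities_demo.csv"


def write_rows(rows):
    """Write rows to the demo file, replacing whatever was there."""
    with open(demo_csv, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


write_rows([["NAME", "ST", "POP2010"], ["Ammon", "ID", 13816], ["Blackfoot", "ID", 11899]])
shape_before = csv_shape(demo_csv)

write_rows([["NAME", "ST"], ["Ammon", "ID"]])   # overwrite with a smaller subset
shape_after = csv_shape(demo_csv)
print(shape_before, shape_after)
```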

Table in a File Geodatabase

Note: The operations below can only be performed in an environment that contains arcpy.
In [32]:
# Export sub_df to an existing table in a File Geodatabase
sub_df.spatial.to_table(location="./sedf_data/cities/cities.gdb/cities_table_export")
Out[32]:
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.gdb\\cities_table_export'
In [33]:
# Check if the table file is updated
tbl_new_df2 = pd.DataFrame.spatial.from_table(filename="./sedf_data/cities/cities.gdb/cities_table_export")
tbl_new_df2.shape
Out[33]:
(10, 13)

The table file has been overwritten with new data.

Memory-based Workspace

Writing geoprocessing outputs to memory is an alternative to writing output to a geodatabase or file-based format. It is often significantly faster than writing to on-disk formats. Data written into memory is temporary and is deleted when the application is closed, so it is an ideal location to write intermediate data.

ArcGIS provides two memory-based workspaces where geoprocessing outputs can be written.

  • memory - is a new memory-based workspace developed for ArcGIS Pro that supports output feature classes, tables, and raster datasets.
  • in_memory - is the legacy memory-based workspace built for ArcMap that supports output feature classes, tables, and raster datasets.

Let's look at an example of writing to a memory workspace. Here, we will:

  • write data from SeDF to a memory workspace.
  • use the data in the memory workspace to generate buffers and export the results to another memory workspace.
  • see how results in a memory workspace can be converted to a featureclass.
  • delete memory workspaces.
Caution:
    - Memory-based workspaces do not support geodatabase elements, such as feature datasets, representations, topologies, geometric networks, or network datasets.
    - Folders cannot be created in memory-based workspaces.
    - Since memory-based workspaces are stored in your system's physical memory, or RAM, your system may run low on memory if you write large datasets into the workspace. This can negatively impact processing performance.
Note: The operations below can only be performed in an environment that contains arcpy.
In [34]:
# Import arcpy
import arcpy
In [35]:
# Check head
sub_df.head(2)
Out[35]:
NAME OTHER OWNER_OCC PLACEFIPS POP2010 POPULATION POP_CLASS RENTER_OCC SHAPE ST STFIPS VACANT WHITE
0 Ammon 307 3205 1601990 13816 15181 6 1271 {"x": -12462673.723706163, "y": 5384674.994080... ID 16 271 13002
1 Blackfoot 1077 2788 1607840 11899 11946 6 1441 {"x": -12506251.313993266, "y": 5341537.793529... ID 16 318 9893
In [36]:
# Write data from SeDF to a memory workspace.
sub_df.spatial.to_featureclass(r"memory\sub_df")
Out[36]:
'memory\\sub_df'
In [37]:
# Use data in memory to generate buffers, exporting output to memory 
arcpy.Buffer_analysis(in_features=r"memory\sub_df", 
                      out_feature_class=r"memory\subBuffers",
                      buffer_distance_or_field=1)
Out[37]:

Output

memory\subBuffers

Messages

Start Time: Friday, November 12, 2021 12:14:14 PM
Succeeded at Friday, November 12, 2021 12:14:15 PM (Elapsed Time: 0.08 seconds)
In [38]:
# Read buffer output into a SeDF
buffered_df = pd.DataFrame.spatial.from_featureclass(r"memory\subBuffers")
buffered_df.shape
Out[38]:
(10, 16)
In [39]:
# Check head
buffered_df.head(2)
Out[39]:
OBJECTID name other owner_occ placefips pop2010 population pop_class renter_occ st stfips vacant white BUFF_DIST ORIG_FID SHAPE
0 1 Ammon 307 3205 1601990 13816 15181 6 1271 ID 16 271 13002 1.0 1 {"curveRings": [[[-12462673.7237, 5384675.9940...
1 2 Blackfoot 1077 2788 1607840 11899 11946 6 1441 ID 16 318 9893 1.0 2 {"curveRings": [[[-12506251.314, 5341538.79349...
In [43]:
# Dissolve the buffers and write the result to a featureclass in the File Geodatabase
arcpy.Dissolve_management(r"memory\subBuffers", "./sedf_data/cities/cities.gdb/memBuffers2")
Out[43]:

Output

.\sedf_data\cities\cities.gdb\memBuffers2

Messages

Start Time: Friday, November 12, 2021 12:19:46 PM
Dissolving...
Succeeded at Friday, November 12, 2021 12:19:46 PM (Elapsed Time: 0.47 seconds)
In [44]:
# Delete the in-memory item
arcpy.Delete_management(r"memory\sub_df")
Out[44]:

Output

true

Messages

Start Time: Friday, November 12, 2021 12:20:45 PM
Succeeded at Friday, November 12, 2021 12:20:45 PM (Elapsed Time: 0.00 seconds)
In [45]:
# Delete the in-memory item
arcpy.Delete_management(r"memory\subBuffers")
Out[45]:

Output

true

Messages

Start Time: Friday, November 12, 2021 12:20:47 PM
Succeeded at Friday, November 12, 2021 12:20:47 PM (Elapsed Time: 0.00 seconds)
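
Because data in a memory workspace occupies RAM until it is deleted, it can help to tie the cleanup to a block of code. The sketch below wraps the delete step in a context manager so that it also runs when an intermediate step fails. The name `scratch_datasets` and the injectable `delete` callable are assumptions of this sketch. With arcpy available, `arcpy.Delete_management` would be the intended argument. The demo passes a stand-in so that it runs without arcpy.

```python
from contextlib import contextmanager


@contextmanager
def scratch_datasets(names, delete):
    """Yield dataset paths under the memory workspace and delete them on exit.

    `delete` is any callable that takes a dataset path (hypothetical design;
    arcpy.Delete_management is the intended argument when arcpy is installed).
    """
    paths = ["memory\\" + name for name in names]
    try:
        yield paths
    finally:
        for path in paths:
            delete(path)    # runs even if the body raised an exception


# Demo with a stand-in delete function that only records what it was asked to remove
deleted = []
with scratch_datasets(["sub_df", "subBuffers"], delete=deleted.append) as (points, buffers):
    print(points, buffers)  # write to and read from these paths here
print(deleted)
```

A real delete call may raise if a dataset was never created, for example when an earlier step failed, so a production version would likely check for existence first.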

Conclusion

In this guide, we explored how the Spatially enabled DataFrame (SeDF) can be used to export spatial data to various formats. We started by exporting the data to web feature layers and to in-memory JSON based formats, such as FeatureSet and FeatureCollection. Next, we explored writing the data to various local data sources, such as a File Geodatabase, a Mobile Geodatabase, a SQLite Database, and a shapefile. We also discussed exporting the data to non-spatial formats, such as a csv file or a table. We then saw how data in local formats, such as a feature class or a table in a File Geodatabase, can be overwritten using a SeDF. Towards the end, we discussed how the data from a SeDF can be exported to memory-based workspaces.

In the next part of the guide series, you will learn about the various properties of a SeDF and how they can be used to pre-process a SeDF.

Note: Given the importance and popularity of Spatially enabled DataFrame, we are revisiting our documentation for this topic. Our goal is to enhance the existing documentation to showcase the various capabilities of Spatially enabled DataFrame in detail with even more examples this time. Creating quality documentation is time-consuming and exhaustive, but we are committed to providing you with the best experience possible. With that in mind, we will be rolling out the revamped guides on this topic as different parts of a guide series (like the Data Engineering or Geometry guide series). This is "part-3" of the guide series for Spatially Enabled DataFrame. You will continue to see the existing documentation as we revamp it to add new parts. Stay tuned for more on this topic.
