Learn how to download data from ArcGIS Online using the ArcGIS API for Python.
In this tutorial you will download and import data taken from the Los Angeles GeoHub using the ArcGIS API for Python. The data sets include a Trailheads file (CSV), a Trails file (GeoJSON), and a Parks and Open Space file (Shapefile).
The data will be stored locally on your machine.
Prerequisites
The ArcGIS API for Python tutorials use Jupyter Notebooks to execute Python code. If you are new to this environment, please see the guide to install the API and use notebooks locally.
Steps
- Import the GIS class and create a connection to ArcGIS Online. You will also load Path from pathlib and ZipFile from zipfile, both in the Python standard library. Because the data is public, we can use an anonymous connection to ArcGIS Online to download the data.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile

    gis = GIS()
- Create a variable to store the ID of the public data item we want to download. Any Item hosted on ArcGIS Online has a unique item ID attribute that can be referenced. The data item for this tutorial was prepared in advance, so we retrieved the item ID from the REST endpoint of the data (visible as the id parameter in the query component of the URL in the address bar).

    public_data_item_id = 'a04933c045714492bda6886f355416f2'
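As a rough illustration of where the ID comes from, the sketch below pulls the id parameter out of a URL's query component using only the standard library. The URL shape shown is an assumption for illustration, and extract_item_id is a hypothetical helper, not part of the ArcGIS API for Python.

```python
from urllib.parse import parse_qs, urlparse


def extract_item_id(url):
    """Return the value of the `id` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("id")
    return values[0] if values else None


# Assumed URL shape, used only for illustration.
example_url = "https://www.arcgis.com/home/item.html?id=a04933c045714492bda6886f355416f2"
public_data_item_id = extract_item_id(example_url)
print(public_data_item_id)
```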
- The content property of gis is an instance of ContentManager, which is used to manage content on ArcGIS Online. get() makes an ArcGIS REST API request to retrieve an Item object.

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item
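Because get() returns None when no item matches, it can help to fail early instead of hitting an AttributeError later. A minimal sketch, assuming only that the object passed in has a get() method; require_item is a hypothetical helper, not part of the ArcGIS API for Python.

```python
def require_item(content_manager, item_id):
    """Return the item for item_id, or raise if the lookup came back empty."""
    item = content_manager.get(item_id)
    if item is None:
        # Hypothetical error handling; choose whatever exception suits the notebook.
        raise LookupError(f"No accessible item with ID {item_id!r}")
    return item
```

With the anonymous connection from this tutorial, it could be called as require_item(gis.content, public_data_item_id).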
- Download LA_Hub_datasets.zip to a data directory in the notebook server's current location.

    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()
    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)
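The existence check followed by mkdir() can also be collapsed into a single call. A minimal sketch, using a temporary directory as a stand-in for the notebook's working directory (an assumption so the example leaves no files behind):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "data"
    # parents=True creates missing parent directories;
    # exist_ok=True makes the call a no-op when the directory already exists.
    data_path.mkdir(parents=True, exist_ok=True)
    data_path.mkdir(parents=True, exist_ok=True)  # safe to repeat
    created = data_path.is_dir()

print(created)
```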
- Use ZipFile to extract the contents of the dataset.

    zip_file = ZipFile(zip_path)
    zip_file.extractall(path=data_path)
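ZipFile also works as a context manager, which closes the archive once extraction finishes. The sketch below builds a small throwaway archive in a temporary directory so it runs without the tutorial download; the file and folder names in it are made up for illustration.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile

with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp)
    zip_path = data_path / "example.zip"

    # Build a tiny archive with one top-level folder (illustrative names).
    with ZipFile(zip_path, "w") as archive:
        archive.writestr("example_datasets/Trailheads.csv", "name\nTrail A\n")

    # Inspect the archive, then extract; the file is closed on exit.
    with ZipFile(zip_path) as archive:
        members = archive.namelist()
        archive.extractall(path=data_path)

    extracted = (data_path / "example_datasets" / "Trailheads.csv").read_text()

print(members)
```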
- Call glob('*') on extract_path to list the contents of the data directory.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile

    gis = GIS()

    public_data_item_id = 'a04933c045714492bda6886f355416f2'

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item

    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()
    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)

    zip_file = ZipFile(zip_path)
    zip_file.extractall(path=data_path)

    files = [file.name for file in extract_path.glob('*')]
    files
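glob('*') does not guarantee the order in which entries are returned, so sorting the names can make the listing reproducible between runs. A small sketch, using a temporary directory with made-up file names standing in for extract_path:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    extract_path = Path(tmp)
    # Placeholder files; the real archive contents may differ.
    for name in ("b.geojson", "a.csv", "c.zip"):
        (extract_path / name).touch()

    # Sort the names so the listing does not depend on file-system order.
    files = sorted(entry.name for entry in extract_path.glob("*"))

print(files)
```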