Learn how to automate downloading data from a portal using the ArcGIS API for Python.
In this tutorial, you will download and import data from the Los Angeles GeoHub using the ArcGIS API for Python. The datasets include Trailheads (CSV), Trails (GeoJSON), and Parks and Open Space (Shapefile) files.
The data will be stored locally on your machine.
Prerequisites
The ArcGIS API for Python tutorials use Jupyter Notebooks to execute Python code. If you are new to this environment, please see the guide to install the API and use notebooks locally.
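Before starting, it can help to confirm that the notebook environment can find the packages this tutorial imports. The following is a minimal sketch that uses only the standard library; the helper name `missing_modules` is hypothetical and is not part of the ArcGIS API for Python.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


# Top-level packages imported in this tutorial.
required = ["arcgis", "pathlib", "zipfile"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If `arcgis` is reported as missing, follow the installation guide mentioned above before continuing.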
Steps
Import modules and log in
- Import the GIS class and create a connection to ArcGIS Online. You will also load Path from pathlib and ZipFile from zipfile, both part of the Python standard library. Because the data is public, you can use an anonymous connection to ArcGIS Online to download the data.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile

    gis = GIS()
Access the item by ID
- Create a variable to store the ID of the public data item.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile

    gis = GIS()

    public_data_item_id = 'a04933c045714492bda6886f355416f2'
- The content property of a GIS object is an instance of the ContentManager class. This can be used to manage content in ArcGIS Online. The ContentManager get() method makes an HTTP request to retrieve an Item object.

    gis = GIS()

    public_data_item_id = 'a04933c045714492bda6886f355416f2'

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item
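Because the lookup can return `None`, it may be worth failing early with a clear message instead of hitting an `AttributeError` in a later step. This is a minimal sketch that assumes only that the object passed in has a `get()` method like the one described above. The names `get_item_or_fail` and `_EmptyContent` are hypothetical, and the stand-in class exists only so the snippet runs without a network connection.

```python
def get_item_or_fail(content_manager, item_id):
    """Return the item for item_id, or raise if the lookup yields None."""
    item = content_manager.get(item_id)
    if item is None:
        raise LookupError(f"No item found with ID {item_id!r}")
    return item


class _EmptyContent:
    """Hypothetical stand-in that mimics a lookup which finds nothing."""

    def get(self, item_id):
        return None


# Demonstrate the failure path without contacting ArcGIS Online.
try:
    get_item_or_fail(_EmptyContent(), "not-a-real-id")
except LookupError as error:
    message = str(error)
    print(message)
```

In the notebook, the equivalent call would be `data_item = get_item_or_fail(gis.content, public_data_item_id)`.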
Download the item
- Download LA_Hub_datasets.zip to a data folder in the notebook server's current location.

    public_data_item_id = 'a04933c045714492bda6886f355416f2'

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item

    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()

    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)
- Use ZipFile to extract the contents of the dataset.

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item

    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()

    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)

    zip_file = ZipFile(zip_path)
    zip_file.extractall(path=data_path)
- Call glob('*') on the extract_path to list the contents of the data directory.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile

    gis = GIS()

    public_data_item_id = 'a04933c045714492bda6886f355416f2'

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item

    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()

    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)

    zip_file = ZipFile(zip_path)
    zip_file.extractall(path=data_path)

    files = [file.name for file in extract_path.glob('*')]
    files
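The extract-and-list steps above can also be wrapped in a small reusable function. This is a sketch that uses only pathlib and zipfile. The function name `extract_and_list` and the sample archive contents are hypothetical, and the demo builds a throwaway archive in a temporary directory so that it runs without downloading anything.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile


def extract_and_list(zip_path, target_dir):
    """Extract zip_path into target_dir and return the extracted file paths."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # The context manager closes the archive even if extraction fails.
    with ZipFile(zip_path) as archive:
        archive.extractall(path=target_dir)
    # Walk the whole tree, since archives often contain a top-level folder.
    return sorted(
        path.relative_to(target_dir).as_posix()
        for path in target_dir.rglob('*')
        if path.is_file()
    )


# Demo with a made-up archive (file names are placeholders, not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    sample_zip = Path(tmp) / 'sample.zip'
    with ZipFile(sample_zip, 'w') as archive:
        archive.writestr('sample/points.csv', 'id,name\n1,example\n')
        archive.writestr('sample/lines.geojson', '{}')
    demo_files = extract_and_list(sample_zip, Path(tmp) / 'extracted')

print(demo_files)
```

Unlike the bare `ZipFile(zip_path)` call in the tutorial code, the `with` block releases the file handle as soon as extraction finishes, which matters if the ZIP file is later moved or deleted.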