Learn how to download data from ArcGIS Online using the ArcGIS API for Python.
In this tutorial you will download and import data taken from the Los Angeles GeoHub using the ArcGIS API for Python. The datasets are Trailheads (CSV), Trails (GeoJSON), and Parks and Open Space (Shapefile).
Import the GIS class and create a connection to ArcGIS Online. Also import Path from pathlib and ZipFile from zipfile, both part of the Python standard library. Because the data is public, an anonymous connection to ArcGIS Online is enough to download it.
Create a variable to store the ID of the public data item we want to download. Any Item hosted on ArcGIS Online has a unique item ID attribute that can be referenced. The data item for this tutorial was prepared in advance, so we retrieved the item ID from the REST endpoint of the data (visible as the id parameter in the query component of the URL in the address bar).
The content property of gis is a ContentManager instance, which is used to manage content on ArcGIS Online. Its get() method makes an ArcGIS REST API request to retrieve an Item object, and returns None if no item with the given ID exists.
from arcgis.gis import GIS
from pathlib import Path
from zipfile import ZipFile

gis = GIS()

public_data_item_id = 'a04933c045714492bda6886f355416f2'
# `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
data_item = gis.content.get(public_data_item_id)
data_item
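Because get() returns None rather than raising an error when no item matches, it can help to check the result before going further. The sketch below is one way to do that. `get_item_or_raise` is a hypothetical helper name, not part of the ArcGIS API for Python, and it only assumes that the object passed in exposes a `get(item_id)` method the way `gis.content` does.

```python
def get_item_or_raise(content, item_id):
    """Return content.get(item_id), raising LookupError if the result is None.

    `content` is assumed to behave like `gis.content` (a ContentManager):
    it exposes a get() method that returns None when no item matches.
    This helper is illustrative and not part of the ArcGIS API for Python.
    """
    item = content.get(item_id)
    if item is None:
        raise LookupError(f"No item found with ID {item_id!r}")
    return item
```

In the notebook this could be called as `data_item = get_item_or_raise(gis.content, public_data_item_id)`, so that a mistyped ID fails at this step instead of at the later download call.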
Download LA_Hub_datasets.zip to a data directory under the notebook server's current location.
# configure where to save the data, and where the ZIP file is located
data_path = Path('./data')
if not data_path.exists():
    data_path.mkdir()

zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
extract_path = data_path.joinpath('LA_Hub_datasets')
data_item.download(save_path=data_path)
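The exists() check followed by mkdir() can also be written as a single call. This is a minimal sketch using only pathlib; the temporary directory is an assumption that stands in for the notebook's working directory so the snippet can run anywhere.

```python
from pathlib import Path
import tempfile

# A temporary directory stands in for the notebook's working directory.
with tempfile.TemporaryDirectory() as workdir:
    data_path = Path(workdir) / "data"

    # exist_ok=True makes the call a no-op when the directory already exists;
    # parents=True also creates any missing parent directories.
    data_path.mkdir(parents=True, exist_ok=True)
    data_path.mkdir(parents=True, exist_ok=True)  # safe to repeat

    created = data_path.is_dir()
```

This form avoids the gap between checking for the directory and creating it, which matters little in a notebook but keeps the cell safe to re-run.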
Use ZipFile to extract the contents of the dataset.
zip_file = ZipFile(zip_path)
zip_file.extractall(path=data_path)
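ZipFile also works as a context manager, which closes the archive once extraction finishes. The sketch below builds a small placeholder archive (`example.zip`, not the tutorial's download) so that it is self-contained, then lists the members before extracting them.

```python
from pathlib import Path
from zipfile import ZipFile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    data_path = Path(workdir)
    zip_path = data_path / "example.zip"  # placeholder archive, not the tutorial's ZIP

    # Build a small archive so the sketch is self-contained.
    with ZipFile(zip_path, "w") as archive:
        archive.writestr("example/trailheads.csv", "name\nSample Trailhead\n")

    # Opening the archive in a with block closes the file handle automatically.
    with ZipFile(zip_path) as archive:
        members = archive.namelist()  # inspect the contents before extracting
        archive.extractall(path=data_path)

    extracted = (data_path / "example" / "trailheads.csv").read_text()
```

Checking namelist() first is a quick way to confirm which top-level folder the archive will create before relying on a path such as extract_path.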
Call glob('*') on extract_path to list the contents of the extracted directory.
from arcgis.gis import GIS
from pathlib import Path
from zipfile import ZipFile
gis = GIS()
public_data_item_id = 'a04933c045714492bda6886f355416f2'
# `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
data_item = gis.content.get(public_data_item_id)
data_item
# configure where to save the data, and where the ZIP file is located
data_path = Path('./data')
if not data_path.exists():
    data_path.mkdir()
zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
extract_path = data_path.joinpath('LA_Hub_datasets')
data_item.download(save_path=data_path)
zip_file = ZipFile(zip_path)
zip_file.extractall(path=data_path)
files = [file.name for file in extract_path.glob('*')]
files
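glob() does not guarantee any particular order, and its pattern can be narrowed to a single file type. The following sketch shows both ideas on placeholder files in a temporary directory; the file names are invented for illustration and are not the contents of the tutorial's archive.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    extract_path = Path(workdir)

    # Placeholder files standing in for the extracted datasets.
    for name in ("b.geojson", "a.csv", "c.zip"):
        (extract_path / name).touch()

    # glob() yields entries in arbitrary order, so sort for a stable listing.
    files = sorted(entry.name for entry in extract_path.glob("*"))

    # A narrower pattern limits the listing to one file type.
    csv_files = sorted(entry.name for entry in extract_path.glob("*.csv"))
```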