Tutorial: Download data

Learn how to automate downloading data using the ArcGIS API for Python.

In this tutorial, you will use the ArcGIS API for Python to download data taken from the Los Angeles GeoHub. The datasets are Trailheads (CSV), Trails (GeoJSON), and Parks and Open Space (Shapefile).

The data will be stored locally on your machine.

Prerequisites

The ArcGIS API for Python tutorials use Jupyter Notebooks to execute Python code. If you are new to this environment, please see the guide to install the API and use notebooks locally.
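
Before starting, it can help to confirm that the modules used in this tutorial are importable in the notebook's Python environment. The following is a minimal sketch, not part of the original tutorial; `missing_modules` is a hypothetical helper name, and the check assumes the API is installed under the import name `arcgis`.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# "arcgis" is the import name of the ArcGIS API for Python;
# pathlib and zipfile ship with the Python standard library.
missing = missing_modules(["arcgis", "pathlib", "zipfile"])
if missing:
    print("Install before continuing:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If `arcgis` is reported as missing, follow the installation guide referenced above before continuing.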

Steps

Import modules and log in

  1. Import the GIS class and create a connection to ArcGIS Online. You will also import Path from pathlib and ZipFile from zipfile, both of which are part of the Python standard library. Because the data is public, you can use an anonymous connection to ArcGIS Online to download it.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile
    
    gis = GIS()
    
    

Access the item by ID

  1. Create a variable to store the ID of the public data item.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile
    
    gis = GIS()
    
    public_data_item_id = 'a04933c045714492bda6886f355416f2'
    
    
  2. The content property of a GIS object is an instance of the ContentManager class, which is used to manage content in ArcGIS Online. Its get() method makes an HTTP request to retrieve an Item object by ID.

    gis = GIS()
    
    public_data_item_id = 'a04933c045714492bda6886f355416f2'
    
    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item
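
Because `ContentManager.get` returns `None` for an unknown ID, a later call such as `data_item.download(...)` would fail with an unhelpful `AttributeError`. One way to surface the problem earlier is a small guard like the sketch below. `get_item_or_raise` and `ItemNotFoundError` are hypothetical names introduced for illustration and are not part of the ArcGIS API for Python; the helper only assumes it is given an object with a `get(item_id)` method, such as `gis.content`.

```python
class ItemNotFoundError(LookupError):
    """Raised when a content lookup returns None."""


def get_item_or_raise(content_manager, item_id):
    """Return the item for `item_id`, or raise if the lookup returns None.

    `content_manager` is any object exposing get(item_id), e.g. gis.content.
    """
    item = content_manager.get(item_id)
    if item is None:
        raise ItemNotFoundError(f"No item found with ID {item_id!r}")
    return item
```

In the notebook, this would be called as `data_item = get_item_or_raise(gis.content, public_data_item_id)`.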
    
    

Download the item

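  1. Download the item's ZIP file (referred to as LA_Hub_Datasets.zip in the code) into a data directory under the notebook's current working directory. The directory is created first if it does not exist.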

    public_data_item_id = 'a04933c045714492bda6886f355416f2'
    
    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item
    
    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()
    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)
    
    
  2. Use ZipFile to extract the contents of the dataset.

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item
    
    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()
    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)
    
    # open the archive in a context manager so the file handle is closed after extraction
    with ZipFile(zip_path) as zip_file:
        zip_file.extractall(path=data_path)
    
    
  3. Call glob('*') on extract_path to list the files that were extracted from the archive.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile
    
    gis = GIS()
    
    public_data_item_id = 'a04933c045714492bda6886f355416f2'
    
    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item
    
    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()
    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)
    
    # open the archive in a context manager so the file handle is closed after extraction
    with ZipFile(zip_path) as zip_file:
        zip_file.extractall(path=data_path)
    
    files = [file.name for file in extract_path.glob('*')]
    files
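
The extract-and-list steps above can also be wrapped in a reusable function that relies only on the standard library. This is a sketch under stated assumptions rather than the tutorial's own method: `extract_and_list` is a hypothetical helper name, it validates the archive with `zipfile.is_zipfile` before extracting, and it lists every file found under the target directory (recursively), not only the files that came from the archive.

```python
from pathlib import Path
from zipfile import ZipFile, is_zipfile


def extract_and_list(zip_path, target_dir):
    """Extract `zip_path` into `target_dir` and list the files found there.

    Returns a sorted list of POSIX-style paths relative to `target_dir`.
    """
    zip_path = Path(zip_path)
    target_dir = Path(target_dir)

    # is_zipfile returns False for missing or unreadable files as well
    if not is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a readable ZIP archive")

    # create the target directory (and any parents) if needed
    target_dir.mkdir(parents=True, exist_ok=True)

    # the context manager closes the archive once extraction finishes
    with ZipFile(zip_path) as archive:
        archive.extractall(path=target_dir)

    return sorted(
        p.relative_to(target_dir).as_posix()
        for p in target_dir.rglob('*')
        if p.is_file()
    )
```

With the variables from this tutorial, the call would look like `extract_and_list(zip_path, data_path)`. Because the ZIP file is saved inside `data_path`, the archive itself would appear in the returned list in that case.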
