Download data

Learn how to download data from ArcGIS Online using the ArcGIS API for Python.

In this tutorial, you will download and import data taken from the Los Angeles GeoHub using the ArcGIS API for Python. The download contains three datasets: Trailheads (CSV), Trails (GeoJSON), and Parks and Open Space (Shapefile).

The data will be stored locally on your machine.

Prerequisites

The ArcGIS API for Python tutorials use Jupyter Notebooks to execute Python code. If you are new to this environment, please see the guide to install the API and use notebooks locally.

Steps

  1. Import the GIS class, along with Path from pathlib and ZipFile from zipfile (both part of the Python standard library), then create a connection to ArcGIS Online. Because the data is public, an anonymous connection is sufficient to download it.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile
    
    gis = GIS()
    
    
  2. Create a variable to store the ID of the public data item you want to download. Every item hosted on ArcGIS Online has a unique item ID that can be used to reference it. The data item for this tutorial was prepared in advance, and its item ID was taken from the REST endpoint of the data, where it appears as the id parameter in the query component of the URL in the address bar.

    public_data_item_id = 'a04933c045714492bda6886f355416f2'
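
As a rough illustration of reading the id parameter from a URL's query component, the sketch below uses the standard library's urllib.parse. The helper name item_id_from_url and the shape of example_url are assumptions made for this example; only the ID value itself comes from the tutorial.

```python
from urllib.parse import parse_qs, urlparse


def item_id_from_url(url: str) -> str:
    """Return the value of the `id` query parameter in `url`."""
    query = parse_qs(urlparse(url).query)
    ids = query.get("id", [])
    if len(ids) != 1:
        raise ValueError(f"expected exactly one 'id' parameter in {url!r}")
    return ids[0]


# Assumed URL shape, for illustration only; the ID is the one used in this tutorial.
example_url = "https://www.arcgis.com/home/item.html?id=a04933c045714492bda6886f355416f2"
public_data_item_id = item_id_from_url(example_url)
print(public_data_item_id)
```

Copying the ID by hand, as the tutorial does, works just as well; a helper like this is mainly useful when you have a list of item URLs to process.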
    
    
  3. Use the content property of gis to retrieve the item. This property is a ContentManager instance that is used to manage content on ArcGIS Online. Its get() method makes an ArcGIS REST API request and returns an Item object.

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item
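
Because get() returns None when no item matches the ID, it can help to fail early rather than hit an AttributeError later. The following is a minimal sketch of such a guard. The helper get_item_or_raise and the _StubContent class are hypothetical names introduced here so the example runs without a connection; they are not part of the ArcGIS API for Python.

```python
from typing import Any


def get_item_or_raise(content: Any, item_id: str) -> Any:
    """Call content.get(item_id) and raise if the result is None."""
    item = content.get(item_id)
    if item is None:
        raise LookupError(f"No item found with ID {item_id!r}")
    return item


class _StubContent:
    """Hypothetical stand-in for gis.content, used only to run this sketch offline."""

    def __init__(self, items: dict) -> None:
        self._items = dict(items)

    def get(self, item_id: str) -> Any:
        # Mirrors the documented behaviour: None when the ID is unknown.
        return self._items.get(item_id)


stub = _StubContent({"a04933c045714492bda6886f355416f2": "placeholder item"})
print(get_item_or_raise(stub, "a04933c045714492bda6886f355416f2"))
```

In the notebook, the equivalent call would be get_item_or_raise(gis.content, public_data_item_id).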
    
    
  4. Download LA_Hub_Datasets.zip into a data directory under the notebook's current working directory, creating the directory first if it does not exist.

    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()
    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)
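
The exists() check followed by mkdir() can also be written as a single call. The sketch below shows one possible variant using Path.mkdir with parents=True and exist_ok=True, run against a temporary directory so it does not touch your notebook folder. The helper name prepare_data_dir is an assumption made for this example.

```python
import tempfile
from pathlib import Path


def prepare_data_dir(base: Path, name: str = "data") -> Path:
    """Create base/name if needed and return its path."""
    data_path = base / name
    # exist_ok=True makes repeated calls safe; parents=True creates missing parent folders.
    data_path.mkdir(parents=True, exist_ok=True)
    return data_path


with tempfile.TemporaryDirectory() as tmp:
    first = prepare_data_dir(Path(tmp))
    second = prepare_data_dir(Path(tmp))  # second call does not raise
    print(first == second, first.is_dir())
```

Re-running a notebook cell that uses this form does not fail when the directory already exists.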
    
    
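  5. Use ZipFile to extract the contents of the downloaded archive into the data directory. The code assumes the archive unpacks into a LA_Hub_datasets folder, which is the location extract_path points to.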

    # open the archive in a context manager so the file handle is closed after extraction
    with ZipFile(zip_path) as zip_file:
        zip_file.extractall(path=data_path)
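
To see what an archive contains before relying on a particular folder name, you can read its member list while extracting. The sketch below builds a small throwaway ZIP file in a temporary directory and extracts it. The helper extract_archive and the demo file names are hypothetical and stand in for the real download.

```python
import tempfile
from pathlib import Path
from typing import List
from zipfile import ZipFile


def extract_archive(zip_path: Path, target_dir: Path) -> List[str]:
    """Extract zip_path into target_dir and return the archive's member names."""
    with ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(path=target_dir)
    return names


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    demo_zip = tmp_path / "demo.zip"
    # Build a tiny archive with one file inside a top-level folder.
    with ZipFile(demo_zip, "w") as archive:
        archive.writestr("demo_datasets/Trailheads.csv", "name\nExample\n")
    members = extract_archive(demo_zip, tmp_path)
    print(members)
    print((tmp_path / "demo_datasets" / "Trailheads.csv").exists())
```

The returned member names show whether the files sit inside a top-level folder, which is what determines the right value for a path such as extract_path.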
    
    
  6. Call glob('*') on extract_path to list the extracted files.

    files = [file.name for file in extract_path.glob('*')]
    files
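
Since the download mixes several formats, it can be convenient to group the listing by file extension. The following sketch shows one way to do that with glob('*'). The helper files_by_suffix and the demo file names are assumptions for illustration and are not the actual contents of the archive.

```python
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def files_by_suffix(directory: Path) -> Dict[str, List[str]]:
    """Group the names of the files directly inside `directory` by lowercase suffix."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in sorted(directory.glob("*")):
        if entry.is_file():  # skip subdirectories
            groups[entry.suffix.lower()].append(entry.name)
    return dict(groups)


with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    # Hypothetical file names standing in for the extracted datasets.
    for name in ("Trailheads.csv", "Trails.geojson", "Parks.zip"):
        (demo_dir / name).write_text("")
    print(files_by_suffix(demo_dir))
```

In the notebook, calling files_by_suffix(extract_path) would give the same kind of grouping for the extracted data.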
