Export survey data with attachments

This notebook uses the ArcGIS API for Python. For more information, see the ArcGIS API for Python documentation and guides.

A common ArcGIS Survey123 workflow is to export survey data as a .xlsx file and work with that data in other software. A limitation of exporting data from your ArcGIS organization in Microsoft Excel format is that the attachments are not included. The only way to include attachments when exporting a dataset from the Survey123 website, ArcGIS Online, or ArcGIS Enterprise is to export it as a file geodatabase, but this method still requires additional work to extract the attachments.

Fortunately, you can use the ArcGIS API for Python to export your survey data in XLSX format and save any attachments to your computer.

This notebook will export your survey results as a .xlsx file, download all attachments associated with the data into folders unique to the parent layer or table, and create additional CSV files that map which parent object ID corresponds to which attachment file path.

import arcgis
from arcgis.gis import GIS
import os, re, csv

The first step is to define the variables needed to complete this workflow. The required variables are as follows:

portalURL - The URL for your ArcGIS organization (e.g. www.arcgis.com)
username - Your ArcGIS organization username (e.g. gisadmin)
password - You ArcGIS organization password (e.g. gisadmin1)
survey_item_id - The item ID for the ArcGIS Survey123 form item in your ArcGIS organization (e.g. 89bc8c7844e548e09baa3aad4695e78b)
save_path - The directory where you would like to save the survey results and attachments (e.g. C:\temp)
keep_org_item - By default, an exported item is added to your content in your ArcGIS organization. This Boolean value allows you to choose if you would like to keep the exported item in your content (True), or remove it (False).
store_csv_w_attachments - Boolean value that allows you to choose if the .csv file that maps each attachment to its parent object ID should be stored in the root folder (with the exported Excel workbook) (False), or in each individual layer folder (True).

portalURL = "https://www.arcgis.com"
username = "username"
password = "password"
survey_item_id = "c0da843348f24012885615cb6d611d8d"
save_path = r"C:\temp\download_with_attachments"
keep_org_item = False
store_csv_w_attachments = False

A connection is made to your ArcGIS organization and to the survey specified above.

gis = GIS(portalURL, username, password)
survey_by_id = gis.content.get(survey_item_id)

Next, the data in the survey's feature service is downloaded in XLSX format.

rel_fs = survey_by_id.related_items('Survey2Service','forward')[0]
item_excel = rel_fs.export(title=survey_by_id.title, export_format='Excel')
item_excel.download(save_path=save_path)
if not bool(keep_org_item):
    item_excel.delete(force=True)

Now that the data is downloaded, it's time to work with the attachments. Each layer and table in the feature service is looped through, and if the layer or table has attachments a new folder is created using the layer name with _attachments appended to it. A new .csv file is created, containing the parent object ID and the relative folder path for each attachment in that layer. The .csv file is saved to either the directory specified in the save_path variable, or in the layers folder, depending on the value of store_csv_w_attachments.

The layer is then queried to return the object IDs. Each object ID is queried to see if it has attachments. If it does have attachments, these are downloaded to the associated folder and a new entry is written to the .csv file. If the object ID does not have attachments, the record is skipped.

When the notebook finishes, all exported data and attachments will be contained in the save_path folder.

layers = rel_fs.layers + rel_fs.tables
for i in layers:
    if i.properties.hasAttachments == True:
        feature_layer_folder = os.path.join(save_path, '{}_attachments'.format(re.sub(r'[^A-Za-z0-9]+', '', i.properties.name)))
        os.mkdir(feature_layer_folder)
        if bool(store_csv_w_attachments):
            path = os.path.join(feature_layer_folder, "{}_attachments.csv".format(i.properties.name))
        elif not bool(store_csv_w_attachments):
            path = os.path.join(save_path, "{}_attachments.csv".format(i.properties.name))
        csv_fields = ['Parent objectId', 'Attachment path']
        with open(path, 'w', newline='') as csvfile:
            csvwriter = csv.writer(csvfile)
            csvwriter.writerow(csv_fields)
            
            feature_object_ids = i.query(where="1=1", return_ids_only=True, order_by_fields='objectid ASC')
            for j in range(len(feature_object_ids['objectIds'])):
                current_oid = feature_object_ids['objectIds'][j]
                current_oid_attachments = i.attachments.get_list(oid=current_oid)
            
                if len(current_oid_attachments) > 0:
                    for k in range(len(current_oid_attachments)):
                        attachment_id = current_oid_attachments[k]['id']
                        current_attachment_path = i.attachments.download(oid=current_oid, attachment_id=attachment_id, save_path=feature_layer_folder)
                        csvwriter.writerow([current_oid, os.path.join('{}_attachments'.format(re.sub(r'[^A-Za-z0-9]+', '', i.properties.name)), os.path.split(current_attachment_path[0])[1])])