Overwriting feature layers

As content publishers, you may be required to keep certain web layers up to date. As new data arrives, you may have to append new features, update existing features etc. There are a couple of different options to accomplish this:

  • Method 1: editing individual features as updated datasets are available
  • Method 2: overwriting feature layers altogether with updated datasets

Depending on the number of features that are updated, your workflow requirements, you may adopt either or both kinds of update mechanisms.

In the sample Updating features in a feature layer we explore method 1. In this sample, we explore method 2.

# Import libraries
from arcgis.gis import GIS
from arcgis import features
import pandas as pd
# Connect to the GIS
# you could use your_online_profile or your_enterprise_profile to connect to GIS
gis = GIS(profile="your_enterprise_profile")

Introduction

Let us consider a scenario where we need to update a feature layer containing the capital cities of the US. We have 2 csv datasets simulating an update workflow as described below:

  1. capitals_1.csv -- contains the initial, incomplete dataset which is published as a feature layer
  2. capitals_2.csv -- contains additional points and updates to existing points, building on top of capitals_1.csv

Our goal is to update the features in the feature layer with the latest information contained in both the spreadsheets. We will accomplish this through the following steps

  1. Add capitals_1.csv as an item.
  2. Publish the csv as a feature layer. This simulates a typical scenario where a feature layer is published with initial set of data that is available.
  3. After updated information is available in capitals_2.csv, we will merge both spread sheets.
  4. Overwrite the feature layer using the new spread sheet file.

When you overwrite a feature layer, only the features get updated. All other information such as the feature layer's item id, comments, summary, description etc. remain the same. This way, any web maps or scenes that have this layer remains valid. Overwriting a feature layer also updates the related data item from which it was published. In this case, it will also update the csv data item with the updated spreadsheet file.

Note: Overwrite capability was introduced in ArcGIS Enterprise 10.5 and in ArcGIS Online. This capability is currently only available for feature layers. Further, ArcGIS sets some limits when overwriting feature layers:

  1. The name of the file that used to update in step 4 above should match the original file name of the item.
  2. The schema -- number of layers (applicable when your original file is a file geodatabase / shape file / service definition), and the name and number of attribute columns should remain the same as before.

The method 2 explained in this sample is much simpler compared to method 1 explained in Updating features in a feature layer. However, we cannot make use of the third spreadsheet which has the additional columns for our capitals. To do that, we would first update the features through overwriting, then edit the definition of the feature layer to add new columns and then edit each feature and add the appropriate column values, similar to that explained in method 1.

Publish the cities feature layer using the initial dataset

# if you've downloaded the samples or are working with your own organization
# or Enterprise, you could just read the 
# csv data directly instead of copyting the file:
#
# my_csv = os.path.join(data_path, csv_file)



import datetime as dt
import os
import shutil

# assign path variables for data
data_path = os.path.join('data', 'updating_gis_content')
csv_file = 'capitals_1.csv'

# assign variable to current timestamp to make unique file to add to portal
now = int(dt.datetime.now().timestamp())

csv_file_name = csv_file.split('.')[0] + '_' + str(now) + '.csv'

# copy original file for adding to portal
my_csv = shutil.copy2(os.path.join(data_path, csv_file),
                     os.path.join(data_path, csv_file_name))
my_csv
'data\\updating_gis_content\\capitals_1_1699945181.csv'
csv_file_name
'capitals_1_1699945181.csv'
# read the initial csv
cities_df_1 = pd.read_csv(my_csv)
cities_df_1.head()
city_idnamestatecapitalpop2000pop2007longitudelatitude
01HonoluluHIState371657378587-157.82343621.305782
12JuneauAKState3071131592-134.51158258.351418
23Boise CityIDState185787203529-116.23765543.613736
34OlympiaWAState2751445523-122.89307347.042418
45SalemORState136924152039-123.02915544.931109
# print the number of records in this csv
cities_df_1.shape
(19, 8)
# add the csv as an item
item_prop = {'title':'USA Capitals spreadsheet_' + str(now)}
csv_item = gis.content.add(item_properties=item_prop, data=my_csv)
csv_item
USA Capitals spreadsheet_1699945181
CSV by arcgis_python
Last Modified: November 13, 2023
0 comments, 0 views
csv_item.name
'capitals_1_1699945181.csv'
# publish the csv item into a feature layer
cities_item = csv_item.publish()
cities_item
USA Capitals spreadsheet_1699945181
Feature Layer Collection by arcgis_python
Last Modified: November 13, 2023
0 comments, 0 views
# update the item metadata
item_prop = {'title':'USA Capitals 2'}
cities_item.update(item_properties = item_prop, 
                   thumbnail=os.path.join('data','updating_gis_content','capital_cities.png'))
cities_item
USA Capitals 2
Feature Layer Collection by arcgis_python
Last Modified: November 13, 2023
0 comments, 0 views
map1 = gis.map('USA')
map1

overwrite-feature-layer-1.png

map1.zoom = 4
map1.center = [39, -98]
map1.add_layer(cities_item)

Merge updates from spreadsheet 2 with 1

The next set of updates have arrived and are stored in capitals_2.csv. We are told it contains corrections for the original set of features and also has new features.

Instead of applying the updates one at a time, we will merge both the spreadsheets into a new one.

# read the second csv set
csv2 = os.path.join('data', 'updating_gis_content', 'capitals_2.csv')
cities_df_2 = pd.read_csv(csv2)
cities_df_2.head(5)
city_idnamestatecapitalpop2000pop2007longitudelatitude
020Baton RougeLAState227818228810-91.14022730.458091
121HelenaMTState2578026007-112.02702746.595809
222BismarckNDState5553259344-100.77900046.813346
323PierreSDState1387614169-100.33638244.367964
424St. PaulMNState287151291643-93.11411844.954364
# get the dimensions of this csv
cities_df_2.shape
(36, 8)

Let us concatenate the spreadsheets 1 and 2 and store it in a DataFrame called updated_df. Note, this step introduces duplicate rows that were updated in spreadsheet 2.

dataframes = [cities_df_1, cities_df_2]

updated_df = pd.concat(dataframes)
updated_df.reset_index(drop=True)
updated_df.shape
(55, 8)

Next, we must drop the duplicate rows. Note, in this sample, the city_id column has unique values and is present in all spreadsheets. Thus, we are able to determine duplicate rows using this column and drop them.

updated_df.drop_duplicates(subset='city_id', keep='last', inplace=True)
# we specify argument keep = 'last' to retain edits from second spreadsheet
updated_df.shape
(51, 8)

Thus we have dropped 4 rows from spreadsheet 1 and retained the same 4 rows with updated values from spreadsheet 2. Let us see how the DataFrame looks so far:

updated_df.head(5)
city_idnamestatecapitalpop2000pop2007longitudelatitude
01HonoluluHIState371657378587-157.82343621.305782
12JuneauAKState3071131592-134.51158258.351418
23Boise CityIDState185787203529-116.23765543.613736
45SalemORState136924152039-123.02915544.931109
56CarsonNVState5245756641-119.75387339.160946

Write the updates to disk

Let us create a new folder called updated_capitals_csv and write the updated features to a csv with the same name as our first csv file.

import os
if not os.path.exists(os.path.join('data', 'updating_gis_content','updated_capitals_csv')):
    os.mkdir(os.path.join('data', 'updating_gis_content','updated_capitals_csv'))
updated_df.to_csv(os.path.join('data', 'updating_gis_content',
                               'updated_capitals_csv', csv_file_name))

Overwrite the feature layer

Let us overwrite the feature layer using the new csv file we just created. To overwrite, we will use the overwrite() method.

from arcgis.features import FeatureLayerCollection
cities_flayer_collection = FeatureLayerCollection.fromitem(cities_item)
cities_flayer_collection.properties.layers[0].name
'capitals_1_1699945181'

csv_file_name is the original file name of the item csv_item, hence we'll use the path to csv_file_name to overwrite the feature layer which was created using csv_item

#call the overwrite() method which can be accessed using the manager property
cities_flayer_collection.manager.overwrite(os.path.join('data', 'updating_gis_content',
                               'updated_capitals_csv', csv_file_name))
{'success': True}

Access the overwritten feature layer

Let us query the feature layer and verify the number of features has increased to 51.

cities_flayer = cities_item.layers[0] #there is only 1 layer
cities_flayer.query(return_count_only=True) #get the total number of features
51

Let us draw this new layer in map

map2 = gis.map("USA")
map2

overwrite-feature-layer-2.png

map2.zoom = 4
map2.center = [39, -98]
map2.add_layer(cities_item)

As seen from the map, the number of features has increased while the attribute columns remain the same as original.

Conclusion

Thus, in this sample, we observed how update a feature layer by overwriting it with new content. This method is a lot simpler than method 1 explained in Updating features in a feature layer sample. However, with this simplicity, we compromise on our ability to add new columns or change the schema of the feature layer during the update. Further, if your feature layer was updated after it was published, then those updates get overwritten when you perform the overwrite operation. To retain those edits, extract the data from the feature layer, merge your updates with this extract, then overwrite the feature layer.

Your browser is no longer supported. Please upgrade your browser for the best experience. See our browser deprecation post for more details.