Overwriting feature layers

As content publishers, you may be required to keep certain web layers up to date. As new data arrives, you may have to append new features, update existing features etc. There are a couple of different options to accomplish this:

Method 1: editing individual features as updated datasets are available
Method 2: overwriting feature layers altogether with updated datasets

Depending on the number of features that are updated, your workflow requirements, you may adopt either or both kinds of update mechanisms.

In the sample Updating features in a feature layer we explore method 1. In this sample, we explore method 2.

# Import libraries
from arcgis.gis import GIS
from arcgis import features
import pandas as pd

# Connect to the GIS
# you could use your_online_profile or your_enterprise_profile to connect to GIS
gis = GIS(profile="your_enterprise_profile")

Introduction

Let us consider a scenario where we need to update a feature layer containing the capital cities of the US. We have 2 csv datasets simulating an update workflow as described below:

capitals_1.csv -- contains the initial, incomplete dataset which is published as a feature layer
capitals_2.csv -- contains additional points and updates to existing points, building on top of capitals_1.csv

Our goal is to update the features in the feature layer with the latest information contained in both the spreadsheets. We will accomplish this through the following steps

Add capitals_1.csv as an item.
Publish the csv as a feature layer. This simulates a typical scenario where a feature layer is published with initial set of data that is available.
After updated information is available in capitals_2.csv, we will merge both spread sheets.
Overwrite the feature layer using the new spread sheet file.

When you overwrite a feature layer, only the features get updated. All other information such as the feature layer's item id, comments, summary, description etc. remain the same. This way, any web maps or scenes that have this layer remains valid. Overwriting a feature layer also updates the related data item from which it was published. In this case, it will also update the csv data item with the updated spreadsheet file.

Note: Overwrite capability was introduced in ArcGIS Enterprise 10.5 and in ArcGIS Online. This capability is currently only available for feature layers. Further, ArcGIS sets some limits when overwriting feature layers:

The name of the file that used to update in step 4 above should match the original file name of the item.
The schema -- number of layers (applicable when your original file is a file geodatabase / shape file / service definition), and the name and number of attribute columns should remain the same as before.

The method 2 explained in this sample is much simpler compared to method 1 explained in Updating features in a feature layer. However, we cannot make use of the third spreadsheet which has the additional columns for our capitals. To do that, we would first update the features through overwriting, then edit the definition of the feature layer to add new columns and then edit each feature and add the appropriate column values, similar to that explained in method 1.

Publish the cities feature layer using the initial dataset

# if you've downloaded the samples or are working with your own organization
# or Enterprise, you could just read the 
# csv data directly instead of copyting the file:
#
# my_csv = os.path.join(data_path, csv_file)



import datetime as dt
import os
import shutil

# assign path variables for data
data_path = os.path.join('data', 'updating_gis_content')
csv_file = 'capitals_1.csv'

# assign variable to current timestamp to make unique file to add to portal
now = int(dt.datetime.now().timestamp())

csv_file_name = csv_file.split('.')[0] + '_' + str(now) + '.csv'

# copy original file for adding to portal
my_csv = shutil.copy2(os.path.join(data_path, csv_file),
                     os.path.join(data_path, csv_file_name))

my_csv

'data\\updating_gis_content\\capitals_1_1699945181.csv'

csv_file_name

'capitals_1_1699945181.csv'

# read the initial csv
cities_df_1 = pd.read_csv(my_csv)
cities_df_1.head()

	city_id	name	state	capital	pop2000	pop2007	longitude	latitude
0	1	Honolulu	HI	State	371657	378587	-157.823436	21.305782
1	2	Juneau	AK	State	30711	31592	-134.511582	58.351418
2	3	Boise City	ID	State	185787	203529	-116.237655	43.613736
3	4	Olympia	WA	State	27514	45523	-122.893073	47.042418
4	5	Salem	OR	State	136924	152039	-123.029155	44.931109

# print the number of records in this csv
cities_df_1.shape

(19, 8)

# add the csv as an item
item_prop = {'title':'USA Capitals spreadsheet_' + str(now)}
csv_item = gis.content.add(item_properties=item_prop, data=my_csv)
csv_item

USA Capitals spreadsheet_1699945181

CSV by arcgis_python
Last Modified: November 13, 2023
0 comments, 0 views

csv_item.name

'capitals_1_1699945181.csv'

# publish the csv item into a feature layer
cities_item = csv_item.publish()
cities_item

USA Capitals spreadsheet_1699945181

Feature Layer Collection by arcgis_python
Last Modified: November 13, 2023
0 comments, 0 views

# update the item metadata
item_prop = {'title':'USA Capitals 2'}
cities_item.update(item_properties = item_prop, 
                   thumbnail=os.path.join('data','updating_gis_content','capital_cities.png'))
cities_item

USA Capitals 2

Feature Layer Collection by arcgis_python
Last Modified: November 13, 2023
0 comments, 0 views

map1 = gis.map('USA')
map1

map1.zoom = 4
map1.center = [39, -98]

map1.add_layer(cities_item)

Merge updates from spreadsheet 2 with 1

The next set of updates have arrived and are stored in capitals_2.csv. We are told it contains corrections for the original set of features and also has new features.

Instead of applying the updates one at a time, we will merge both the spreadsheets into a new one.

# read the second csv set
csv2 = os.path.join('data', 'updating_gis_content', 'capitals_2.csv')
cities_df_2 = pd.read_csv(csv2)
cities_df_2.head(5)

	city_id	name	state	capital	pop2000	pop2007	longitude	latitude
0	20	Baton Rouge	LA	State	227818	228810	-91.140227	30.458091
1	21	Helena	MT	State	25780	26007	-112.027027	46.595809
2	22	Bismarck	ND	State	55532	59344	-100.779000	46.813346
3	23	Pierre	SD	State	13876	14169	-100.336382	44.367964
4	24	St. Paul	MN	State	287151	291643	-93.114118	44.954364

# get the dimensions of this csv
cities_df_2.shape

(36, 8)

Let us concatenate the spreadsheets 1 and 2 and store it in a DataFrame called updated_df. Note, this step introduces duplicate rows that were updated in spreadsheet 2.

dataframes = [cities_df_1, cities_df_2]

updated_df = pd.concat(dataframes)
updated_df.reset_index(drop=True)
updated_df.shape

(55, 8)

Next, we must drop the duplicate rows. Note, in this sample, the city_id column has unique values and is present in all spreadsheets. Thus, we are able to determine duplicate rows using this column and drop them.

updated_df.drop_duplicates(subset='city_id', keep='last', inplace=True)
# we specify argument keep = 'last' to retain edits from second spreadsheet
updated_df.shape

(51, 8)

Thus we have dropped 4 rows from spreadsheet 1 and retained the same 4 rows with updated values from spreadsheet 2. Let us see how the DataFrame looks so far:

updated_df.head(5)

	city_id	name	state	capital	pop2000	pop2007	longitude	latitude
0	1	Honolulu	HI	State	371657	378587	-157.823436	21.305782
1	2	Juneau	AK	State	30711	31592	-134.511582	58.351418
2	3	Boise City	ID	State	185787	203529	-116.237655	43.613736
4	5	Salem	OR	State	136924	152039	-123.029155	44.931109
5	6	Carson	NV	State	52457	56641	-119.753873	39.160946

Write the updates to disk

Let us create a new folder called updated_capitals_csv and write the updated features to a csv with the same name as our first csv file.

import os
if not os.path.exists(os.path.join('data', 'updating_gis_content','updated_capitals_csv')):
    os.mkdir(os.path.join('data', 'updating_gis_content','updated_capitals_csv'))

updated_df.to_csv(os.path.join('data', 'updating_gis_content',
                               'updated_capitals_csv', csv_file_name))

Overwrite the feature layer

Let us overwrite the feature layer using the new csv file we just created. To overwrite, we will use the overwrite() method.

from arcgis.features import FeatureLayerCollection
cities_flayer_collection = FeatureLayerCollection.fromitem(cities_item)

cities_flayer_collection.properties.layers[0].name

'capitals_1_1699945181'

csv_file_name is the original file name of the item csv_item, hence we'll use the path to csv_file_name to overwrite the feature layer which was created using csv_item

#call the overwrite() method which can be accessed using the manager property
cities_flayer_collection.manager.overwrite(os.path.join('data', 'updating_gis_content',
                               'updated_capitals_csv', csv_file_name))

{'success': True}

Access the overwritten feature layer

Let us query the feature layer and verify the number of features has increased to 51.

cities_flayer = cities_item.layers[0] #there is only 1 layer
cities_flayer.query(return_count_only=True) #get the total number of features

Let us draw this new layer in map

map2 = gis.map("USA")
map2

map2.zoom = 4
map2.center = [39, -98]

map2.add_layer(cities_item)

As seen from the map, the number of features has increased while the attribute columns remain the same as original.

Conclusion

Thus, in this sample, we observed how update a feature layer by overwriting it with new content. This method is a lot simpler than method 1 explained in Updating features in a feature layer sample. However, with this simplicity, we compromise on our ability to add new columns or change the schema of the feature layer during the update. Further, if your feature layer was updated after it was published, then those updates get overwritten when you perform the overwrite operation. To retain those edits, extract the data from the feature layer, merge your updates with this extract, then overwrite the feature layer.