ArcGIS API for Python

Part 2 - Where to enrich? (what are study areas?)

Enriching Study Areas

GeoEnrichment uses the concept of a study area to define the location of the point or area that you want to enrich with additional information or create reports about. The accepted forms of study areas are:

  1. Street address locations
    • a. Single line input
    • b. Multiple field input
  2. Point, line and polygon geometries
  3. Buffered study areas
  4. Named statistical areas

Before we look at examples of study areas, let's understand the concepts of data collections and analysis variables. We will look at data collections in detail in a later section.

Data collections and analysis variables

GeoEnrichment uses the concept of a data collection to define the data attributes (analysis variables) returned by the enrichment service. A data collection is a preassembled list of attributes that will be used to enrich the input features. Collection attributes can describe various types of information, such as demographic characteristics and the geographic context of the locations or areas submitted as input features. This section only introduces data collections; the next guide covers them in detail.

The Country class can be used to discover the data collections, sub-geographies and available reports for a country. When working with a particular country, you will find it convenient to get a reference to it using the Country.get() method.

The data_collections property of a Country object returns a Pandas DataFrame that lists the available data collections together with the analysis variables in each collection.

Once we know which data collection we would like to use, we can look at all the unique analysisVariable values available in that data collection.

In [13]:
# Import Libraries
from arcgis.gis import GIS
from arcgis.geoenrichment import Country, enrich
In [15]:
# Create a GIS Connection
gis = GIS(profile='your_online_profile')
In [16]:
# Get US as a country
usa = Country.get('US')
type(usa)
Out[16]:
arcgis.geoenrichment.enrichment.Country
In [17]:
df = usa.data_collections

# print a few rows of the DataFrame
df.head()
Out[17]:
dataCollectionID  analysisVariable         alias                   fieldCategory                       vintage
1yearincrements   1yearincrements.AGE0_CY  2020 Population Age <1  2020 Age: 1 Year Increments (Esri)  2020
1yearincrements   1yearincrements.AGE1_CY  2020 Population Age 1   2020 Age: 1 Year Increments (Esri)  2020
1yearincrements   1yearincrements.AGE2_CY  2020 Population Age 2   2020 Age: 1 Year Increments (Esri)  2020
1yearincrements   1yearincrements.AGE3_CY  2020 Population Age 3   2020 Age: 1 Year Increments (Esri)  2020
1yearincrements   1yearincrements.AGE4_CY  2020 Population Age 4   2020 Age: 1 Year Increments (Esri)  2020
In [18]:
# call the shape property to get the total number of rows and columns
df.shape
Out[18]:
(17608, 4)

Each data collection can have multiple analysis variables, as seen in the table above. Every analysis variable has a unique ID, found in the analysisVariable column. When calling the enrich() method, data collection IDs can be passed in the data_collections parameter and analysis variable IDs in the analysis_variables parameter.

You can filter the data_collections DataFrame and query a collection's analysis variables using Pandas expressions.

In [24]:
# get all the unique data collections available for the current country
df.index.unique()
Out[24]:
Index(['1yearincrements', '5yearincrements', 'ACS_Housing_Summary_rep',
       'ACS_Population_Summary_rep', 'Age', 'AgeDependency',
       'Age_50_Profile_rep', 'Age_by_Sex_Profile_rep',
       'Age_by_Sex_by_Race_Profile_rep', 'AtRisk',
       ...
       'transportation', 'travelMPI', 'unitsinstructure',
       'urbanizationgroupsNEW', 'vacant', 'vehiclesavailable', 'veterans',
       'women', 'yearbuilt', 'yearmovedin'],
      dtype='object', name='dataCollectionID', length=150)

The snippet below shows how you can query the Age data collection and get all the unique analysisVariables under that collection.

In [25]:
df.loc['Age']['analysisVariable'].unique()
Out[25]:
array(['Age.MALE0', 'Age.MALE5', 'Age.MALE10', 'Age.MALE15', 'Age.MALE20',
       'Age.MALE25', 'Age.MALE30', 'Age.MALE35', 'Age.MALE40',
       'Age.MALE45', 'Age.MALE50', 'Age.MALE55', 'Age.MALE60',
       'Age.MALE65', 'Age.MALE70', 'Age.MALE75', 'Age.MALE80',
       'Age.MALE85', 'Age.FEM0', 'Age.FEM5', 'Age.FEM10', 'Age.FEM15',
       'Age.FEM20', 'Age.FEM25', 'Age.FEM30', 'Age.FEM35', 'Age.FEM40',
       'Age.FEM45', 'Age.FEM50', 'Age.FEM55', 'Age.FEM60', 'Age.FEM65',
       'Age.FEM70', 'Age.FEM75', 'Age.FEM80', 'Age.FEM85'], dtype=object)
In [26]:
# View a sample of the `Age` data collection
df.loc['Age'].head()
Out[26]:
dataCollectionID  analysisVariable  alias                 fieldCategory                       vintage
Age               Age.MALE0         2020 Males Age 0-4    2020 Age: 5 Year Increments (Esri)  2020
Age               Age.MALE5         2020 Males Age 5-9    2020 Age: 5 Year Increments (Esri)  2020
Age               Age.MALE10        2020 Males Age 10-14  2020 Age: 5 Year Increments (Esri)  2020
Age               Age.MALE15        2020 Males Age 15-19  2020 Age: 5 Year Increments (Esri)  2020
Age               Age.MALE20        2020 Males Age 20-24  2020 Age: 5 Year Increments (Esri)  2020

Now, let's look at some examples of enriching each of the study areas.

Enriching street address

Street address locations can be passed as strings of input street addresses, points of interest or place names. A street address can be passed as a single line or as a multiple field input. If a point (e.g. a street address) is used as a study area, the service will create a 1 mile ring buffer around the point to collect and append enrichment data.

The example below uses a street address as a study area and enriches it with the Age data collection.

Single line address

In [27]:
# Enriching a single address as single line input
single_address = enrich(study_areas=["380 New York St Redlands CA 92373"], 
                       data_collections=['Age'])
In [28]:
single_address
Out[28]:
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US -117.194872 34.057237 RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups ... 376 398 374 340 310 262 153 98 129 {"rings": [[[-117.19487199429183, 34.071745616...

1 rows × 50 columns

Visualize results on a map

The returned spatial dataframe can be visualized on a map as shown below:

In [15]:
# Plot on a map
address_map = gis.map('Redlands, CA',13)
address_map

A buffer of 1 mile is created by default, as seen on this map, for any address.

In [14]:
single_address.spatial.plot(address_map)
Out[14]:
True
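
As a rough sanity check on the 1 mile default, the distance between the geocoded point (the X and Y columns of the enriched row) and the first vertex of the returned ring can be computed by hand. The sketch below uses the haversine formula on a spherical Earth, which is an assumption made for illustration; how the service constructs its buffer is not described here, so only approximate agreement should be expected. The vertex latitude is the truncated value printed in the SHAPE column, and haversine_miles is a hypothetical helper, not part of the arcgis package.

```python
import math

EARTH_RADIUS_M = 6_371_008.8   # mean Earth radius; the spherical model is an assumption
METERS_PER_MILE = 1609.344


def haversine_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) / METERS_PER_MILE


# X, Y columns of the enriched row in Out[28]
center = (-117.194872, 34.057237)
# First vertex of the SHAPE ring in Out[28] (latitude is truncated in the printed output)
ring_vertex = (-117.19487199429183, 34.071745616)

distance = haversine_miles(*center, *ring_vertex)
print(f"{distance:.3f} miles")
```

With these inputs the distance comes out within about one percent of 1 mile, which is consistent with bufferRadii = 1 and bufferUnits = esriMiles in the output.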

Multiple addresses as single line input

In [29]:
# Enriching multiple addresses as single line input
enrich(study_areas=[{"address":{"text":"12 Concorde Place Toronto ON M3C 3R8","sourceCountry":"Canada"}},
                    {"address":{"text":"380 New York St Redlands CA 92373","sourceCountry":"US"}}], 
       data_collections=['Age'])
Out[29]:
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod ... ECYPFA4549 ECYPFA5054 ECYPFA5559 ECYPFA6064 ECYPFA6569 ECYPFA7074 ECYPFA7579 ECYPFA8084 ECYPFA85P SHAPE
0 0 1 CA -79.328740 43.729720 RingBuffer esriMiles Miles 1 BlockApportionment:CAN.DA ... 1351.0 1264.0 1323.0 1138.0 1156.0 973.0 784.0 576.0 970.0 {"rings": [[[-79.3287400246266, 43.74420464321...
1 1 2 US -117.194872 34.057237 RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups ... NaN NaN NaN NaN NaN NaN NaN NaN NaN {"rings": [[[-117.19487199429183, 34.071745616...

2 rows × 50 columns

Multiple field input

In [30]:
enrich(study_areas=[{"address":{"Address":"380 New York Street", 
                                "City":"Redlands", "Region":"CA", "Postal":92373}}], 
       data_collections=['Age'])
Out[30]:
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US -117.194872 34.057237 RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups ... 376 398 374 340 310 262 153 98 129 {"rings": [[[-117.19487199429183, 34.071745616...

1 rows × 50 columns
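
The address inputs used above come in two shapes: a single-line form with a text key (optionally with sourceCountry) and a multiple-field form with Address, City, Region and Postal keys. A minimal sketch of building both as plain Python dictionaries is shown here. The helper names single_line_address and multi_field_address are hypothetical and not part of the arcgis package; only the dictionary keys are taken from the examples in this section.

```python
from typing import Dict, Optional


def single_line_address(text: str, source_country: Optional[str] = None) -> Dict[str, Dict[str, object]]:
    """Wrap a one-line address in the {"address": {"text": ...}} form."""
    address: Dict[str, object] = {"text": text}
    if source_country is not None:
        address["sourceCountry"] = source_country
    return {"address": address}


def multi_field_address(street: str, city: str, region: str, postal: object) -> Dict[str, Dict[str, object]]:
    """Wrap separate address fields in the {"address": {"Address": ...}} form."""
    return {"address": {"Address": street, "City": city, "Region": region, "Postal": postal}}


# The same inputs as in the cells above, built with the hypothetical helpers.
study_areas = [
    single_line_address("12 Concorde Place Toronto ON M3C 3R8", "Canada"),
    single_line_address("380 New York St Redlands CA 92373", "US"),
    multi_field_address("380 New York Street", "Redlands", "CA", 92373),
]

for area in study_areas:
    print(area)
```

A list built this way could then be passed as the study_areas argument of enrich(), in the same way as the literal dictionaries in the cells above.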

The example below enriches the same address with specific analysis variables for age: FEM45, FEM55 and FEM65.

In [31]:
enrich(study_areas=["380 New York St Redlands CA 92373"], 
       analysis_variables=["Age.FEM45","Age.FEM55","Age.FEM65"])
Out[31]:
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData FEM45 FEM55 FEM65 SHAPE
0 0 1 US -117.194872 34.057237 RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 1 376 374 310 {"rings": [[[-117.19487199429183, 34.071745616...
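
In the outputs shown here, each analysis variable ID has the form collection.variable (for example Age.FEM45), while the enriched DataFrame columns carry only the variable part (FEM45). The sketch below derives the expected column names from the requested IDs, which can be handy for selecting the enriched columns afterwards. The helper name output_columns is hypothetical, and the naming rule is inferred from Out[31] rather than from a documented guarantee.

```python
def output_columns(analysis_variables):
    """Return the column names expected in the enriched DataFrame.

    Assumes each ID looks like '<collection>.<variable>' and that the
    enriched output keeps only the '<variable>' part, as seen in Out[31].
    IDs without a dot are returned unchanged.
    """
    return [var_id.split(".", 1)[-1] for var_id in analysis_variables]


requested = ["Age.FEM45", "Age.FEM55", "Age.FEM65"]
columns = output_columns(requested)
print(columns)  # ['FEM45', 'FEM55', 'FEM65']
```

Under that assumption, indexing the DataFrame returned by enrich() with this list would select just the three requested variables.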

Enriching point, line and polygon geometries

Point geometries can be passed as x and y coordinates to the study_areas parameter. When points are specified as study areas, the service analyzes the map areas surrounding or associated with the input point locations. Unless otherwise specified, the service analyzes a 1 mile ring around a point; the same default buffer applies to a line. Locations can also be given as polygon geometries.

Single Point described as map coordinates

In [32]:
from arcgis.geometry import Point
In [33]:
pt = Point({"x" : -117.1956, "y" : 34.0572, "spatialReference" : {"wkid" : 4326}})
enrich(study_areas=[pt], data_collections=['Age'])
Out[33]:
ID OBJECTID sourceCountry areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod populationToPolygonSizeRating apportionmentConfidence ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 ... 364 388 361 329 300 253 147 92 122 {"rings": [[[-117.19559999999998, 34.071708616...

1 rows × 48 columns

Multiple points described as map coordinates

In [34]:
pt1 = Point({"x" : -122.435, "y" : 37.785, "spatialReference" : {"wkid" : 4326}})
pt2 = Point({"x" : -122.433, "y" : 37.734, "spatialReference" : {"wkid" : 4326}})

enrich(study_areas=[pt1, pt2], data_collections=['Age'])
Out[34]:
ID OBJECTID sourceCountry areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod populationToPolygonSizeRating apportionmentConfidence ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 ... 2994 2581 2615 2773 2602 2394 1926 1564 2351 {"rings": [[[-122.43499999999999, 37.799499596...
1 1 2 US RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 ... 2444 2373 2378 2164 2004 1660 1097 798 1040 {"rings": [[[-122.43299999999999, 37.748499722...

2 rows × 48 columns

Line feature described as geometry

In [35]:
from arcgis.geometry import Polyline
In [36]:
line = Polyline({"paths":[[[-13048580,4036370],[-13046151,4036366]]],
                 "spatialReference":{"wkid":102100}})
enriched_line_df = enrich(study_areas=[line], data_collections=['Age'])
In [37]:
enriched_line_df
Out[37]:
ID OBJECTID sourceCountry areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod populationToPolygonSizeRating apportionmentConfidence ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 ... 585 585 528 498 443 389 227 151 228 {"rings": [[[-117.21736177272676, 34.070851408...

1 rows × 48 columns
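
The line above is defined in Web Mercator (wkid 102100), while the returned SHAPE is printed in longitude and latitude. To see where the line's endpoints fall, the coordinates can be converted with the standard spherical Web Mercator inverse formulas. This is an illustrative sketch only; the function name web_mercator_to_lonlat is hypothetical and not part of the arcgis package.

```python
import math

WEB_MERCATOR_RADIUS_M = 6378137.0  # sphere radius used by the Web Mercator projection


def web_mercator_to_lonlat(x, y):
    """Convert Web Mercator coordinates in metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x / WEB_MERCATOR_RADIUS_M)
    lat = math.degrees(math.atan(math.sinh(y / WEB_MERCATOR_RADIUS_M)))
    return lon, lat


# The two vertices of the line defined in In [36]
path = [[-13048580, 4036370], [-13046151, 4036366]]

for x, y in path:
    lon, lat = web_mercator_to_lonlat(x, y)
    print(f"{lon:.4f}, {lat:.4f}")
```

Both endpoints land near longitude -117.2 and latitude 34.06, which agrees with the ring coordinates printed in the SHAPE column.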

Visualize results on a map

The returned spatial dataframe can be visualized on a map as shown below:

In [27]:
# Plot on a map
line_map = gis.map('Redlands, CA',13)
line_map