Part 2 - Where to enrich? (what are study areas?)
Enriching Study Areas
GeoEnrichment uses the concept of a study area to define the location of the point or area that you want to enrich with additional information or create reports about. The accepted forms of study areas are:
- Street address locations
  - a. Single line input
  - b. Multiple field input
- Point, line and polygon geometries
- Buffered study areas
- Named statistical areas
Before we look at the examples of study areas, let's understand the concept of data collections and analysis variables. We will look at data collections in detail in a later section.
Data collections and analysis variables
GeoEnrichment uses the concept of a data collection to define the data attributes (analysis variables) returned by the enrichment service. A data collection is a preassembled list of attributes that will be used to enrich the input features. Collection attributes can describe various types of information, such as demographic characteristics and geographic context of the locations or areas submitted as input features. We will introduce the concept of data collections here and look at the details in the next guide.
The Country class can be used to discover the data collections, sub-geographies and available reports for a country. When working with a particular country, you will find it convenient to get a reference to it using the Country.get() method.
The data_collections property of a Country object lists the available data collections, together with the analysis variables in each data collection, as a Pandas DataFrame.
Once we know the data collection we would like to use, we can look at all the unique analysisVariable values available in that data collection.
# Import Libraries
from arcgis.gis import GIS
from arcgis.geoenrichment import Country, enrich, BufferStudyArea
# Create a GIS Connection
gis = GIS(profile='your_online_profile')
# Get US as a country
usa = Country.get('US')
type(usa)
df = usa.data_collections
# print a few rows of the DataFrame
df.head()
# call the shape property to get the total number of rows and columns
df.shape
Each data collection can have multiple analysis variables, as seen in the table above. Every such analysis variable has a unique ID, found in the analysisVariable column. When calling the enrich() method, these analysis variables can be passed in the data_collections and analysis_variables parameters.
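As a small illustration of how these IDs are structured, the sketch below groups a list of variable IDs by their data collection. It assumes the IDs follow the `Collection.Variable` pattern seen in the `Age.FEM45` style IDs used later in this guide; `group_by_collection` is a hypothetical helper, not part of the arcgis package.

```python
from collections import defaultdict


def group_by_collection(variable_ids):
    """Group 'Collection.Variable' IDs by their collection prefix.

    Assumption: every ID has the form '<collection>.<variable>'.
    """
    grouped = defaultdict(list)
    for var_id in variable_ids:
        collection, sep, name = var_id.partition(".")
        if not sep or not collection or not name:
            raise ValueError(f"expected 'Collection.Variable', got {var_id!r}")
        grouped[collection].append(name)
    return dict(grouped)


ids = ["Age.FEM45", "Age.FEM55", "Age.FEM65"]
print(group_by_collection(ids))
```

Grouping the IDs this way makes it easy to check which data collections a list of analysis variables draws from before passing it to enrich().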
You can filter the data_collections DataFrame and query a collection's analysis variables using Pandas expressions.
# get all the unique data collections available for the current country
df.index.unique()
The snippet below shows how you can query the Age data collection and get all the unique analysisVariable values under that collection.
df.loc['Age']['analysisVariable'].unique()
# View a sample of the `Age` data collection
df.loc['Age'].head()
Now, let's look at some examples of enriching each of the study areas.
Enriching street address
Street address locations can be passed as strings of input street addresses, points of interest or place names. A street address can be passed as a single line or as a multiple field input. If a point (e.g. a street address) is used as a study area, the service will create a 1 mile ring buffer around the point to collect and append enrichment data.
The example below uses a street address as a study area for enrichment using the Age data collection.
Single line address
# Enriching a single address as single line input
single_address = enrich(study_areas=["380 New York St Redlands CA 92373"],
                        data_collections=['Age'])
single_address
Visualize results on a map
The returned spatial dataframe can be visualized on a map as shown below:
# Plot on a map
address_map = gis.map('Redlands, CA',13)
address_map
As the map shows, a 1 mile buffer is created by default around any address.
single_address.spatial.plot(address_map)
Multiple addresses as single line input
# Enriching multiple addresses as single line input
enrich(study_areas=[{"address":{"text":"12 Concorde Place Toronto ON M3C 3R8","sourceCountry":"Canada"}},
                    {"address":{"text":"380 New York St Redlands CA 92373","sourceCountry":"US"}}],
       data_collections=['Age'])
Multiple field input
enrich(study_areas=[{"address":{"Address":"380 New York Street",
                                "City":"Redlands", "Region":"CA", "Postal":92373}}],
       data_collections=['Age'])
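The two address forms above differ only in the shape of the dictionary. A minimal sketch of building them programmatically is shown here; `single_line_address` and `multi_field_address` are hypothetical helpers written for this guide, and they only reproduce the dictionary shapes used in the examples above.

```python
def single_line_address(text, source_country=None):
    """Build a single line address study area dictionary."""
    address = {"text": text}
    if source_country is not None:
        # Optional hint naming the country the address is in
        address["sourceCountry"] = source_country
    return {"address": address}


def multi_field_address(address, city, region, postal):
    """Build a multiple field address study area dictionary."""
    return {"address": {"Address": address, "City": city,
                        "Region": region, "Postal": postal}}


single = single_line_address("12 Concorde Place Toronto ON M3C 3R8", "Canada")
multi = multi_field_address("380 New York Street", "Redlands", "CA", 92373)
print(single)
print(multi)
```

Either dictionary can then be placed in the list passed to the study_areas parameter, as in the calls above.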
Enriching with specific analysis variables for age, such as FEM45, FEM55 and FEM65.
enrich(study_areas=["380 New York St Redlands CA 92373"],
       analysis_variables=["Age.FEM45","Age.FEM55","Age.FEM65"])
Enriching point, line and polygon geometries
Point geometries can be passed as x and y coordinates to the study_areas parameter. When points are specified as study areas, the service will analyze map areas surrounding or associated with the input point locations. Unless otherwise specified, the service will analyze a one mile ring around a point. This is also true for a line. Locations can also be given as polygon geometries.
Single Point described as map coordinates
from arcgis.geometry import Point
pt = Point({"x" : -117.1956, "y" : 34.0572, "spatialReference" : {"wkid" : 4326}})
enrich(study_areas=[pt], data_collections=['Age'])
Multiple points with attributes described as map coordinates
pt1 = Point({"x" : -122.435, "y" : 37.785, "spatialReference" : {"wkid" : 4326}})
pt2 = Point({"x" : -122.433, "y" : 37.734, "spatialReference" : {"wkid" : 4326}})
enrich(study_areas=[pt1, pt2], data_collections=['Age'])
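When many points come from a list of coordinates, the point dictionaries can be generated in a loop. The sketch below assumes WGS84 longitude and latitude (wkid 4326), as in the examples above; `wgs84_point_dicts` is a hypothetical helper, and its range check is only there to catch swapped x and y values.

```python
def wgs84_point_dicts(lon_lat_pairs):
    """Turn (longitude, latitude) pairs into point dictionaries (wkid 4326)."""
    points = []
    for lon, lat in lon_lat_pairs:
        # x is longitude and y is latitude; reject values outside valid ranges
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError(f"invalid longitude/latitude: ({lon}, {lat})")
        points.append({"x": lon, "y": lat,
                       "spatialReference": {"wkid": 4326}})
    return points


dicts = wgs84_point_dicts([(-122.435, 37.785), (-122.433, 37.734)])
print(dicts)
```

Each resulting dictionary has the same shape as the ones passed to Point() above, so it can be wrapped in a Point before being used as a study area.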
Line feature described as geometry
from arcgis.geometry import Polyline
line = Polyline({"paths":[[[-13048580,4036370],[-13046151,4036366]]],
                 "spatialReference":{"wkid":102100}})
enriched_line_df = enrich(study_areas=[line], data_collections=['Age'])
enriched_line_df
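Note that the line above is given in Web Mercator coordinates in meters (wkid 102100), while the point examples use longitude and latitude (wkid 4326). As a rough sketch, assuming the standard spherical Web Mercator formulas, longitude and latitude can be converted to that coordinate system as follows; `lonlat_to_web_mercator` is an illustrative helper and not part of the arcgis package.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator, in meters


def lonlat_to_web_mercator(lon, lat):
    """Convert WGS84 degrees to spherical Web Mercator meters."""
    if abs(lat) >= 90:
        raise ValueError("latitude must be strictly between -90 and 90")
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# The Redlands point used earlier in this guide
x, y = lonlat_to_web_mercator(-117.1956, 34.0572)
print(round(x), round(y))
```

The converted values fall close to the line's vertices, which is consistent with the line lying in the Redlands area shown on the map.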
Visualize results on a map
The returned spatial dataframe can be visualized on a map as shown below:
# Plot on a map
line_map = gis.map('Redlands, CA',13)
line_map