Part 1 - Introduction to GeoEnrichment
Introduction
GeoEnrichment adds location intelligence to the data by providing facts about a location or an area. Using GeoEnrichment, you can get information about the people and places in a specific area or within a certain distance or drive time from a location. It enables you to query and use information from a large collection of data sets including population, income, housing, consumer behavior, and the natural environment. GeoEnrichment enables you to answer questions about locations that you can't answer with maps alone. For example: What kind of people live here? What do people like to do in this area? What are their habits and lifestyles?
GeoEnrichment makes your analysis more powerful by adding global demographic, spending, lifestyle or business features at different geographical levels such as city, county, region, state and country. Demographic features (Population, Age, Education etc.) and Socio-economic features (Income, Education, Wealth etc.) can be easily added to your location data, making it more intelligent. Feature Engineering is one of the key aspects of any Data Science project as it involves adding new features to the data to increase the predictive power of a learning algorithm. With GeoEnrichment, you can quickly add more features to your location data, helping your algorithms make better predictions.
To understand how GeoEnrichment adds value, imagine a retail giant evaluating potential sites for new stores. The evaluation criteria include competition, traffic, economic feasibility and the market potential of different geographic areas. With GeoEnrichment, they can dig into the average shopper's lifestyle, income, spending, education and other socio-demographic factors for different neighborhoods to understand their potential customers and make an educated decision when choosing new sites.
Getting Started
A user must be logged on to a GIS in order to use GeoEnrichment. GeoEnrichment functionality is available in the arcgis.geoenrichment module.
To enable GeoEnrichment, you need either an ArcGIS Online subscription or an ArcGIS Enterprise deployment configured with a GeoEnrichment utility service. GeoEnrichment operations consume credits. Credits are the currency used across ArcGIS and are consumed for specific transactions. See the ArcGIS documentation on credit consumption for GeoEnrichment for details.
Object Model Diagram
The picture below illustrates how the geoenrichment module is organized.
Ways to enrich your data
You can enrich your data in 2 ways:
- enrich() method from the arcgis.geoenrichment module.
- enrich_layer() method from the arcgis.features module.
The enrich() method returns a Spatially Enabled DataFrame. This data frame can be saved as a new feature layer Item in your GIS and used for analysis or visualization on a map. However, if you would like to enrich an existing FeatureLayer, use the enrich_layer() method from the arcgis.features module. The result will be a new layer of input features that includes enriched data.
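Since the quick example below only exercises enrich(), here is a minimal sketch of the second route. It assumes enrich_layer can be imported from arcgis.features.enrich_data and accepts the analysis_variables and output_name keyword arguments; the wrapper name enrich_existing_layer and its parameters are hypothetical and only serve the illustration. Check the signature against the API reference for your installed version before relying on it.

```python
def enrich_existing_layer(layer, variables, output_name):
    """Enrich an existing feature layer and return whatever enrich_layer returns.

    Assumptions: `layer` is a FeatureLayer (or feature layer Item) from the GIS
    you are logged on to, and `variables` is a list of variable names.
    Running this against a real GIS consumes credits.
    """
    # Imported inside the function so the sketch can be loaded without
    # an active GIS session.
    from arcgis.features.enrich_data import enrich_layer

    return enrich_layer(
        layer,
        analysis_variables=variables,
        output_name=output_name,
    )
```

Depending on the version and service, variable names passed to enrich_layer may need to be qualified with their data collection, so treat the exact identifiers as something to verify rather than copy.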
Quick Example
Let's look at a simple example of GeoEnrichment in action. Suppose a company wants to open a healthcare facility somewhere in Los Angeles, CA. They have a sample dataset of existing healthcare providers with their address details for the target areas (represented by their zip codes). The company wants to understand the demographics of each zip code to make the right decision.
Let's import this data and make it richer with GeoEnrichment.
# Import Libraries
import pandas as pd
from arcgis.gis import GIS
from arcgis.geoenrichment import Country
# Create a GIS Connection
gis = GIS(profile='your_online_profile')
# Read the data
df = pd.read_csv('../data/health.csv')
df
This dataset shows 5 providers with their address details. The providers are located in Zip Codes 90018, 90023 and 90035.
Let's enrich this dataset with socio-demographic factors such as Total Population, Median Age, Median Household Income, Diversity Index, and Education for each zip code to better understand these areas.
# Define Analysis variables
analysis_variables = [
'TOTPOP_CY', # Population: Total Population (Esri)
'DIVINDX_CY', # Diversity Index (Esri)
'AVGHHSZ_CY', # Average Household Size (Esri)
'MEDAGE_CY', # Age: Median Age (Esri)
'MEDHINC_CY', # Income: Median Household Income (Esri)
'BACHDEG_CY', # Education: Bachelor's Degree (Esri)
]
# Get enriched data for each zip code
from arcgis.geoenrichment import Country, enrich
usa = Country.get('US')
zip1 = usa.subgeographies.states['California'].zip5['90018']
zip2 = usa.subgeographies.states['California'].zip5['90023']
zip3 = usa.subgeographies.states['California'].zip5['90035']
enrich_df = enrich(study_areas=[zip1, zip2, zip3], analysis_variables=analysis_variables)
enrich_df
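# Merge provider data with GeoEnrichment data
# Cast zip codes to strings so they can be matched against StdGeographyID
df['Zip Code'] = df['Zip Code'].astype(str)
merged = pd.merge(enrich_df, df, left_on='StdGeographyID', right_on='Zip Code')
# Show the last 10 columns of the merged table
merged.iloc[:, -10:]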
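The string cast before the merge matters because the join keys on both sides must have the same type. The sketch below reproduces the step with two small stand-in tables. All values are made up for illustration, and it assumes (as the cast above implies) that StdGeographyID holds strings while the CSV zip codes were parsed as integers.

```python
import pandas as pd

# Hypothetical stand-in for the provider table; values are made up.
providers = pd.DataFrame({
    "Provider": ["A", "B", "C"],
    "Zip Code": [90018, 90023, 90018],  # integers, as read_csv would parse them
})

# Hypothetical stand-in for the enriched table; values are made up.
enriched = pd.DataFrame({
    "StdGeographyID": ["90018", "90023", "90035"],  # geography IDs as strings
    "TOTPOP_CY": [100, 200, 300],
})

# Align the key types first; integer keys would not match string keys.
providers["Zip Code"] = providers["Zip Code"].astype(str)

# Inner join: keeps only zip codes present in both tables.
merged = pd.merge(
    enriched,
    providers,
    left_on="StdGeographyID",
    right_on="Zip Code",
    how="inner",
)
print(merged[["Provider", "Zip Code", "TOTPOP_CY"]])
```

With the default inner join, a zip code that has no providers (90035 in this sketch) drops out of the result; passing how="left" would keep it with missing provider fields. For zip codes that start with a zero, reading the column as text with pd.read_csv(..., dtype={'Zip Code': str}) avoids losing the leading digit.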
Visualize on a map
Let's visualize the 3 zip codes on a map.
map1 = gis.map('Los Angeles, CA',12)
map1