
ArcGIS API for Python

Part 1 - Introduction to GeoEnrichment

Introduction

GeoEnrichment adds location intelligence to the data by providing facts about a location or an area. Using GeoEnrichment, you can get information about the people and places in a specific area or within a certain distance or drive time from a location. It enables you to query and use information from a large collection of data sets including population, income, housing, consumer behavior, and the natural environment. GeoEnrichment enables you to answer questions about locations that you can't answer with maps alone. For example: What kind of people live here? What do people like to do in this area? What are their habits and lifestyles?

GeoEnrichment makes your analysis more powerful by adding global demographic, spending, lifestyle, or business features at different geographic levels such as city, county, region, state, and country. Demographic features (population, age, education, etc.) and socio-economic features (income, wealth, etc.) can be added to your location data with little effort. Feature engineering is a key part of any data science project: it adds new features to the data to increase the predictive power of a learning algorithm. With GeoEnrichment, you can quickly add more features to your location data, helping your algorithms make better predictions.

To understand how GeoEnrichment adds value, imagine that a retail giant is evaluating potential sites for new stores, and the evaluation criteria include competition, traffic, economic feasibility, and the market potential of different geographic areas. With GeoEnrichment, they can dig into the average shopper's lifestyle, income, spending, education, and other socio-demographic factors for different neighborhoods to understand their potential customers and make an educated decision when choosing new sites.

Getting Started

You must be logged in to a GIS in order to use GeoEnrichment. GeoEnrichment functionality is available in the arcgis.geoenrichment module.

To use GeoEnrichment, you need either an ArcGIS Online subscription or an ArcGIS Enterprise deployment configured with a GeoEnrichment utility service. GeoEnrichment operations consume credits. Credits are the currency used across ArcGIS and are consumed for specific transactions. See the ArcGIS credit documentation for details on credit consumption for GeoEnrichment.
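
Before spending credits, it can help to confirm that the connected GIS actually exposes a GeoEnrichment service. The sketch below assumes the service is listed under gis.properties.helperServices with the key 'geoenrichment', as is the case for ArcGIS Online. The helper name geoenrichment_url is hypothetical and not part of the API.

```python
def geoenrichment_url(gis):
    """Return the GeoEnrichment service URL of a GIS, or None if none is listed.

    Assumption: the service is registered in helperServices under the key
    'geoenrichment' and carries a 'url' entry.
    """
    try:
        return gis.properties.helperServices["geoenrichment"]["url"]
    except (AttributeError, KeyError, TypeError):
        # No helper services, no GeoEnrichment entry, or an unexpected layout
        return None
```

If geoenrichment_url(gis) returns None for your connection, the utility service has probably not been configured.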

Object Model Diagram

The picture below illustrates how the geoenrichment module is organized.

Ways to enrich your data

You can enrich your data in two ways:

  1. The enrich() method from the arcgis.geoenrichment module.
  2. The enrich_layer() method from the arcgis.features module.

The enrich() method returns a Spatially Enabled DataFrame. This DataFrame can be saved as a new feature layer item in your GIS and used for analysis or visualization on a map. To enrich an existing FeatureLayer instead, use the enrich_layer() method from the arcgis.features module. The result is a new layer containing the input features along with the enriched data.

Quick Example

Let's look at a simple example of GeoEnrichment in action. Suppose a company wants to open a healthcare facility somewhere in Los Angeles, CA. They have a sample dataset of existing healthcare providers with their address details for the target areas (represented by their zip codes). The company wants to understand the demographics of each zip code to make the right decision.

Let's import this data and make it richer with GeoEnrichment.

In [1]:
# Import Libraries
import pandas as pd
from arcgis.gis import GIS
from arcgis.geoenrichment import Country
In [3]:
# Create a GIS Connection
gis = GIS(profile='your_online_profile')
In [5]:
# Read the data
df = pd.read_csv('../data/health.csv')
df
Out[5]:
   Number of Beds  Name        Address                      City         State  Zip Code
0  156             Facility 1  2468 SOUTH ST ANDREWS PLACE  LOS ANGELES  CA     90018
1  59              Facility 2  2300 W. WASHINGTON BLVD.     LOS ANGELES  CA     90018
2  25              Facility 3  4060 E. WHITTIER BLVD.       LOS ANGELES  CA     90023
3  49              Facility 4  6070 W. PICO BOULEVARD       LOS ANGELES  CA     90035
4  55              Facility 5  1480 S. LA CIENEGA BL        LOS ANGELES  CA     90035

This dataset shows 5 providers with their address details. The providers are located in Zip Codes 90018, 90023 and 90035.

Let's enrich this dataset with socio-demographic factors for each zip code, such as Total Population, Diversity Index, Average Household Size, Median Age, Median Household Income, and Education, to better understand these areas.

In [6]:
# Define Analysis variables
analysis_variables = [
    'TOTPOP_CY',  # Population: Total Population (Esri)
    'DIVINDX_CY', # Diversity Index (Esri)
    'AVGHHSZ_CY', # Average Household Size (Esri)
    'MEDAGE_CY',  # Age: Median Age (Esri)
    'MEDHINC_CY', # Income: Median Household Income (Esri)
    'BACHDEG_CY', # Education: Bachelor's Degree (Esri)
]
In [7]:
# Get enriched data for each zip code
from arcgis.geoenrichment import enrich  # Country is already imported above

usa = Country.get('US')
zip1 = usa.subgeographies.states['California'].zip5['90018']
zip2 = usa.subgeographies.states['California'].zip5['90023']
zip3 = usa.subgeographies.states['California'].zip5['90035']

enrich_df = enrich(study_areas=[zip1, zip2, zip3], analysis_variables=analysis_variables)

enrich_df
Out[7]:
ID OBJECTID StdGeographyLevel StdGeographyName StdGeographyID sourceCountry aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData TOTPOP_CY DIVINDX_CY AVGHHSZ_CY MEDAGE_CY MEDHINC_CY BACHDEG_CY SHAPE
0 0 1 US.ZIP5 Los Angeles 90018 US Query:US.ZIP5 2.191 2.576 1 52420 91.6 3.16 34.0 42741 4996 {"rings": [[[-118.30899000030098, 34.039920000...
1 1 2 US.ZIP5 Los Angeles 90023 US Query:US.ZIP5 2.191 2.576 1 48673 76.9 4.29 29.1 43056 1367 {"rings": [[[-118.2062542231383, 34.0348220536...
2 2 3 US.ZIP5 Los Angeles 90035 US Query:US.ZIP5 2.191 2.576 1 30187 59.0 2.25 38.8 88405 7848 {"rings": [[[-118.37619999961632, 34.059440000...
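
The three zip code lookups above follow the same path, so they can be generated from a list instead of being written out one by one. The helper below is a minimal sketch (the name zip_study_areas is hypothetical). It relies only on the subgeographies.states[...].zip5[...] access pattern used in the cell above.

```python
def zip_study_areas(country, state, zip_codes):
    """Look up the geography object for each zip code within one state."""
    zip5 = country.subgeographies.states[state].zip5
    # Zip codes read from a CSV are often integers, so normalize them to text
    return [zip5[str(code)] for code in zip_codes]
```

For the provider table, zip_study_areas(usa, 'California', sorted(df['Zip Code'].astype(str).unique())) should give the same three study areas as [zip1, zip2, zip3], ready to pass to enrich().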
In [9]:
# Merge provider data with GeoEnrichment data
# Cast the zip codes to str so that both join keys have the same type
df['Zip Code'] = df['Zip Code'].astype(str)
merged = pd.merge(enrich_df, df, left_on='StdGeographyID', right_on='Zip Code')
In [13]:
merged.iloc[:,-10:]
Out[13]:
MEDAGE_CY MEDHINC_CY BACHDEG_CY SHAPE Number of Beds Name Address City State Zip Code
0 34.0 42741 4996 {'rings': [[[-118.30899000030098, 34.039920000... 156 Facility 1 2468 SOUTH ST ANDREWS PLACE LOS ANGELES CA 90018
1 34.0 42741 4996 {'rings': [[[-118.30899000030098, 34.039920000... 59 Facility 2 2300 W. WASHINGTON BLVD. LOS ANGELES CA 90018
2 29.1 43056 1367 {'rings': [[[-118.2062542231383, 34.0348220536... 25 Facility 3 4060 E. WHITTIER BLVD. LOS ANGELES CA 90023
3 38.8 88405 7848 {'rings': [[[-118.37619999961632, 34.059440000... 49 Facility 4 6070 W. PICO BOULEVARD LOS ANGELES CA 90035
4 38.8 88405 7848 {'rings': [[[-118.37619999961632, 34.059440000... 55 Facility 5 1480 S. LA CIENEGA BL LOS ANGELES CA 90035
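
The cast to str in the merge cell matters because pd.read_csv parses the Zip Code column as integers, while the join key on the enrichment side, StdGeographyID, holds strings. The sketch below reproduces the join on a few values copied from the tables above. Two parts are additions for illustration and not part of the original workflow: validate='one_to_many', an optional pandas check that each zip code appears only once on the enrichment side, and str.zfill(5), a precaution for zip codes that start with zero (not needed for these Los Angeles codes).

```python
import pandas as pd

# Values copied from the enrichment output above
enriched = pd.DataFrame({
    "StdGeographyID": ["90018", "90023", "90035"],
    "MEDAGE_CY": [34.0, 29.1, 38.8],
    "MEDHINC_CY": [42741, 43056, 88405],
})

# Zip codes as read_csv would typically parse them: integers
providers = pd.DataFrame({
    "Name": ["Facility 1", "Facility 2", "Facility 3", "Facility 4", "Facility 5"],
    "Zip Code": [90018, 90018, 90023, 90035, 90035],
})

# Convert to five-character strings so the keys match StdGeographyID
providers["Zip Code"] = providers["Zip Code"].astype(str).str.zfill(5)

merged_small = pd.merge(
    enriched,
    providers,
    left_on="StdGeographyID",
    right_on="Zip Code",
    validate="one_to_many",  # raises MergeError if a zip code repeats in `enriched`
)
```

Every provider row picks up the attributes of its zip code, so zip codes with two facilities appear twice in the result.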

Visualize on a map

Let's visualize the 3 zip codes on a map.

In [11]:
map1 = gis.map('Los Angeles, CA',12)
map1