Part 2 - Where to enrich? (what are study areas?)

Enriching Study Areas

GeoEnrichment uses the concept of a study area to define the location of the point or area that you want to enrich with additional information or create reports about. The accepted forms of study areas are:

  1. Street address locations
    • a. Single line input
    • b. Multiple field input
  2. Point, line and polygon geometries
  3. Buffered study areas
  4. Named statistical areas

Before we look at the examples of study areas, let's understand the concept of data collections and analysis variables. We will look at data collections in detail in a later section.

Data collections and analysis variables

GeoEnrichment uses the concept of a data collection to define the data attributes (analysis variables) returned by the enrichment service. A data collection is a preassembled list of attributes that will be used to enrich the input features. Collection attributes can describe various types of information, such as demographic characteristics and geographic context of the locations or areas submitted as input features. We will introduce the concept of data collections here and look at the details in a later guide.

The Country class can be used to discover the data collections, sub-geographies and available reports for a country. When working with a particular country, you will find it convenient to get a reference to it using the Country.get() method.

The data_collections property of a Country object lists a combination of available data collections and analysis variables for each data collection as a Pandas dataframe.

Once we know the data collection we would like to use, we can look at all the unique analysis variables (the analysisVariable column) available in that data collection.

Input
# Import Libraries
from arcgis.gis import GIS
from arcgis.geoenrichment import Country, enrich, BufferStudyArea
Input
# Create a GIS Connection
gis = GIS(profile='your_online_profile')
Input
# Get US as a country
usa = Country.get('US')
type(usa)
Output
arcgis.geoenrichment.enrichment.Country
Input
df = usa.data_collections

# print a few rows of the DataFrame
df.head()
Output
analysisVariable alias fieldCategory vintage
dataCollectionID
1yearincrements 1yearincrements.AGE0_CY 2020 Population Age <1 2020 Age: 1 Year Increments (Esri) 2020
1yearincrements 1yearincrements.AGE1_CY 2020 Population Age 1 2020 Age: 1 Year Increments (Esri) 2020
1yearincrements 1yearincrements.AGE2_CY 2020 Population Age 2 2020 Age: 1 Year Increments (Esri) 2020
1yearincrements 1yearincrements.AGE3_CY 2020 Population Age 3 2020 Age: 1 Year Increments (Esri) 2020
1yearincrements 1yearincrements.AGE4_CY 2020 Population Age 4 2020 Age: 1 Year Increments (Esri) 2020
Input
# call the shape property to get the total number of rows and columns
df.shape
Output
(17608, 4)

Each data collection can have multiple analysis variables as seen in the table above. Every such analysis variable has a unique ID, found in the analysisVariable column. When calling the enrich() method, these analysis variables can be passed in the data_collections and analysis_variables parameters.

You can filter the data_collections DataFrame and query a collection's analysis variables using Pandas expressions.

Input
# get all the unique data collections available for the current country
df.index.unique()
Output
Index(['1yearincrements', '5yearincrements', 'ACS_Housing_Summary_rep',
       'ACS_Population_Summary_rep', 'Age', 'AgeDependency',
       'Age_50_Profile_rep', 'Age_by_Sex_Profile_rep',
       'Age_by_Sex_by_Race_Profile_rep', 'AtRisk',
       ...
       'transportation', 'travelMPI', 'unitsinstructure',
       'urbanizationgroupsNEW', 'vacant', 'vehiclesavailable', 'veterans',
       'women', 'yearbuilt', 'yearmovedin'],
      dtype='object', name='dataCollectionID', length=150)

The snippet below shows how you can query the Age data collection and get all the unique analysisVariable values under that collection.

Input
df.loc['Age']['analysisVariable'].unique()
Output
array(['Age.MALE0', 'Age.MALE5', 'Age.MALE10', 'Age.MALE15', 'Age.MALE20',
       'Age.MALE25', 'Age.MALE30', 'Age.MALE35', 'Age.MALE40',
       'Age.MALE45', 'Age.MALE50', 'Age.MALE55', 'Age.MALE60',
       'Age.MALE65', 'Age.MALE70', 'Age.MALE75', 'Age.MALE80',
       'Age.MALE85', 'Age.FEM0', 'Age.FEM5', 'Age.FEM10', 'Age.FEM15',
       'Age.FEM20', 'Age.FEM25', 'Age.FEM30', 'Age.FEM35', 'Age.FEM40',
       'Age.FEM45', 'Age.FEM50', 'Age.FEM55', 'Age.FEM60', 'Age.FEM65',
       'Age.FEM70', 'Age.FEM75', 'Age.FEM80', 'Age.FEM85'], dtype=object)
Input
# View a sample of the `Age` data collection
df.loc['Age'].head()
Output
analysisVariable alias fieldCategory vintage
dataCollectionID
Age Age.MALE0 2020 Males Age 0-4 2020 Age: 5 Year Increments (Esri) 2020
Age Age.MALE5 2020 Males Age 5-9 2020 Age: 5 Year Increments (Esri) 2020
Age Age.MALE10 2020 Males Age 10-14 2020 Age: 5 Year Increments (Esri) 2020
Age Age.MALE15 2020 Males Age 15-19 2020 Age: 5 Year Increments (Esri) 2020
Age Age.MALE20 2020 Males Age 20-24 2020 Age: 5 Year Increments (Esri) 2020

Now, let's look at some examples of enriching each of the study areas.

Enriching street address

Street address locations can be passed as strings of input street addresses, points of interest or place names. A street address can be passed as a single line or as a multiple field input. If a point (e.g. a street address) is used as a study area, the service will create a 1 mile ring buffer around the point to collect and append enrichment data.

The example below uses a street address as a study area for enrichment using Age data collection.

Single line address

Input
# Enriching single address as single line input
single_address = enrich(study_areas=["380 New York St Redlands CA 92373"], 
                       data_collections=['Age'])
Input
single_address
Output
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US -117.194872 34.057237 RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups ... 376 398 374 340 310 262 153 98 129 {"rings": [[[-117.19487199429183, 34.071745616...

1 rows × 50 columns

Visualize results on a map

The returned spatial dataframe can be visualized on a map as shown below:

Input
# Plot on a map
address_map = gis.map('Redlands, CA',13)
address_map

A buffer of 1 mile is created by default, as seen on this map, for any address.

Input
single_address.spatial.plot(address_map)
Output
True

Multiple addresses as single line input

Input
# Enriching multiple addresses as single line input
enrich(study_areas=[{"address":{"text":"12 Concorde Place Toronto ON M3C 3R8","sourceCountry":"Canada"}},
                    {"address":{"text":"380 New York St Redlands CA 92373","sourceCountry":"US"}}], 
       data_collections=['Age'])
Output
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod ... ECYPFA4549 ECYPFA5054 ECYPFA5559 ECYPFA6064 ECYPFA6569 ECYPFA7074 ECYPFA7579 ECYPFA8084 ECYPFA85P SHAPE
0 0 1 CA -79.328740 43.729720 RingBuffer esriMiles Miles 1 BlockApportionment:CAN.DA ... 1351.0 1264.0 1323.0 1138.0 1156.0 973.0 784.0 576.0 970.0 {"rings": [[[-79.3287400246266, 43.74420464321...
1 1 2 US -117.194872 34.057237 RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups ... NaN NaN NaN NaN NaN NaN NaN NaN NaN {"rings": [[[-117.19487199429183, 34.071745616...

2 rows × 50 columns

Multiple field input

Input
enrich(study_areas=[{"address":{"Address":"380 New York Street", 
                                "City":"Redlands", "Region":"CA", "Postal":92373}}], 
       data_collections=['Age'])
Output
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US -117.194872 34.057237 RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups ... 376 398 374 340 310 262 153 98 129 {"rings": [[[-117.19487199429183, 34.071745616...

1 rows × 50 columns
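
The two dictionary shapes used in the address examples above (a single "text" entry versus separate "Address", "City", "Region" and "Postal" fields) can be produced by small helpers. The sketch below is only an illustration: single_line_address and multi_field_address are hypothetical names, not part of arcgis, and the keys are copied from the examples above rather than from a full list of supported fields.

```python
# Hypothetical helpers (not part of arcgis) that build the address
# dictionaries shown in the examples above.

def single_line_address(text, source_country=None):
    """Build a single line address study area, optionally tagged with a country."""
    address = {"text": text}
    if source_country is not None:
        address["sourceCountry"] = source_country
    return {"address": address}


def multi_field_address(address, city, region, postal):
    """Build a multiple field address study area from its components."""
    return {"address": {"Address": address, "City": city,
                        "Region": region, "Postal": postal}}


study_areas = [
    single_line_address("12 Concorde Place Toronto ON M3C 3R8", "Canada"),
    single_line_address("380 New York St Redlands CA 92373", "US"),
    multi_field_address("380 New York Street", "Redlands", "CA", 92373),
]

for area in study_areas:
    print(area)
```

A list built this way could then be passed as the study_areas argument of enrich(), exactly like the hand-written dictionaries above.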

Instead of a whole data collection, you can enrich with selected analysis variables, such as Age.FEM45, Age.FEM55 and Age.FEM65:

Input
enrich(study_areas=["380 New York St Redlands CA 92373"], 
       analysis_variables=["Age.FEM45","Age.FEM55","Age.FEM65"])
Output
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData FEM45 FEM55 FEM65 SHAPE
0 0 1 US -117.194872 34.057237 RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 1 376 374 310 {"rings": [[[-117.19487199429183, 34.071745616...
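
In the output above, the requested IDs Age.FEM45, Age.FEM55 and Age.FEM65 appear as the columns FEM45, FEM55 and FEM65, that is, without the data collection prefix. The sketch below shows one way to derive the expected column names from the variable IDs. split_variable_id is a hypothetical helper, and the "Collection.FIELD" naming pattern is inferred from the tables in this guide rather than guaranteed by the API.

```python
# Hypothetical helper (not part of arcgis): split a "Collection.FIELD" ID.

def split_variable_id(variable_id):
    """Return (collection, field) for an ID such as 'Age.FEM45'."""
    collection, _, field = variable_id.partition(".")
    if not collection or not field:
        raise ValueError(f"expected 'Collection.FIELD', got {variable_id!r}")
    return collection, field


requested = ["Age.FEM45", "Age.FEM55", "Age.FEM65"]

# Column names we would expect to find in the enriched DataFrame.
expected_columns = [split_variable_id(v)[1] for v in requested]
print(expected_columns)
```

Such a list is handy for selecting only the enriched columns from the returned DataFrame, for example with df[expected_columns].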

Enriching point, line and polygon geometries

Point geometries can be passed as x and y coordinates to the study_areas parameter. When points are specified as study areas, the service will analyze map areas surrounding or associated with the input point locations. Unless otherwise specified, the service will analyze a one mile ring around a point. The same default applies to a line. Locations can also be given as polygon geometries.

Single Point described as map coordinates

Input
from arcgis.geometry import Point
Input
pt = Point({"x" : -117.1956, "y" : 34.0572, "spatialReference" : {"wkid" : 4326}})
enrich(study_areas=[pt], data_collections=['Age'])
Output
ID OBJECTID sourceCountry areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod populationToPolygonSizeRating apportionmentConfidence ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 ... 364 388 361 329 300 253 147 92 122 {"rings": [[[-117.19559999999998, 34.071708616...

1 rows × 48 columns

Multiple points with attributes described as map coordinates

Input
pt1 = Point({"x" : -122.435, "y" : 37.785, "spatialReference" : {"wkid" : 4326}})
pt2 = Point({"x" : -122.433, "y" : 37.734, "spatialReference" : {"wkid" : 4326}})

enrich(study_areas=[pt1, pt2], data_collections=['Age'])
Output
ID OBJECTID sourceCountry areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod populationToPolygonSizeRating apportionmentConfidence ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 ... 2994 2581 2615 2773 2602 2394 1926 1564 2351 {"rings": [[[-122.43499999999999, 37.799499596...
1 1 2 US RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 ... 2444 2373 2378 2164 2004 1660 1097 798 1040 {"rings": [[[-122.43299999999999, 37.748499722...

2 rows × 48 columns

Line feature described as geometry

Input
from arcgis.geometry import Polyline
Input
line = Polyline({"paths":[[[-13048580,4036370],[-13046151,4036366]]],
                 "spatialReference":{"wkid":102100}})
enriched_line_df = enrich(study_areas=[line], data_collections=['Age'])
Input
enriched_line_df
Output
ID OBJECTID sourceCountry areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod populationToPolygonSizeRating apportionmentConfidence ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US RingBuffer esriMiles Miles 1 BlockApportionment:US.BlockGroups 2.191 2.576 ... 585 585 528 498 443 389 227 151 228 {"rings": [[[-117.21736177272676, 34.070851408...

1 rows × 48 columns
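
Note that the points earlier in this section use longitude and latitude (wkid 4326), whereas the line is given in Web Mercator meters (wkid 102100). To see where the line's vertices sit in longitude and latitude, the standard spherical Web Mercator formulas can be applied. The sketch below is a plain Python illustration with hypothetical function names; it is not how the service or arcgis reprojects geometries.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def lonlat_to_web_mercator(lon, lat):
    """Convert degrees (wkid 4326) to Web Mercator meters (wkid 102100)."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def web_mercator_to_lonlat(x, y):
    """Convert Web Mercator meters (wkid 102100) to degrees (wkid 4326)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# The two vertices of the line used in the example above.
for x, y in [(-13048580, 4036370), (-13046151, 4036366)]:
    lon, lat = web_mercator_to_lonlat(x, y)
    print(round(lon, 4), round(lat, 4))
```

Both vertices fall in Redlands, CA, which matches the map extent used to visualize the enriched line.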

Visualize results on a map

The returned spatial dataframe can be visualized on a map as shown below:

Input
# Plot on a map
line_map = gis.map('Redlands, CA',13)
line_map

We can clearly see the line and a 1 mile buffer around it in this map.

Input
# Draw line
line_map.draw(line)

# Plot enriched area around line
enriched_line_df.spatial.plot(line_map)
Output
True

Map area described as polygons

Input
from arcgis.geometry import Polygon
Input
poly = Polygon({"rings":[[[-117.185412,34.063170],[-122.81,37.81],
                        [-117.200570,34.057196],[-117.185412,34.063170]]],
                        "spatialReference":{"wkid":4326}})

enrich(study_areas=[poly], data_collections=['Age'])
Output
ID OBJECTID sourceCountry aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData MALE0 MALE5 MALE10 ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US BlockApportionment:US.BlockGroups 2.191 2.576 1 5532 5473 5286 ... 3865 3905 3896 3528 2710 1997 1303 844 943 {"rings": [[[-117.20057, 34.057196], [-122.809...

1 rows × 44 columns

Enriching Buffered study areas

BufferStudyArea instances are used to change the ring buffer size or create drive-time service areas around points specified using one of the above methods. BufferStudyArea allows you to buffer point and street address study areas. They can be created using the following parameters:

    * area: the point geometry or street address (string) study area to be buffered
    * radii: list of distances by which to buffer the study area, eg. [1, 2, 3]
    * units: distance unit, eg. Miles, Kilometers, Minutes (when using drive times/travel_mode)
    * overlap: boolean, uses overlapping rings/network service areas when True, or non-overlapping bands (areaType RingBufferBands in the output) when False
    * travel_mode: None or string, one of the supported travel modes when using network service areas


BufferStudyArea also allows you to define drive time service areas around points as well as other advanced service areas such as walking and trucking.
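
The overlap parameter decides whether each radius yields a full ring around the point or only the band between that radius and the previous one. The sketch below illustrates the difference with simple planar geometry. It assumes flat circles and ignores the geodesic calculations the service performs, and disk_areas and band_areas are hypothetical helpers.

```python
import math


def disk_areas(radii):
    """Areas of full, overlapping disks, one per radius (overlap=True)."""
    return [math.pi * r ** 2 for r in sorted(radii)]


def band_areas(radii):
    """Areas of non-overlapping bands between successive radii (overlap=False)."""
    areas = disk_areas(radii)
    # The innermost band is the full inner disk; each later band is the
    # difference between a disk and the disk just inside it.
    return [areas[0]] + [outer - inner for inner, outer in zip(areas, areas[1:])]


radii = [1, 3, 5]  # miles, as in the example that follows
for r, disk, band in zip(radii, disk_areas(radii), band_areas(radii)):
    print(f"radius {r}: disk {disk:.1f} sq mi, band {band:.1f} sq mi")
```

Because the bands do not overlap, their areas add up to the area of the largest disk. The same reasoning applies to enriched counts: values returned for bands can be summed to approximate the total within the outermost radius.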

Buffering location using driving distance

The example below creates non-overlapping ring bands with radii of 1, 3 and 5 miles around a street address and enriches them using the 'Age' data collection.

Input
buffered = BufferStudyArea(area='380 New York St Redlands CA 92373',
                           radii=[1,3,5], units='Miles', overlap=False)
drive_dist_df = enrich(study_areas=[buffered], data_collections=['Age'])
Input
drive_dist_df
Output
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US -117.194872 34.057237 RingBufferBands Miles Miles 1 BlockApportionment:US.BlockGroups ... 376 398 374 340 310 262 153 98 129 {"rings": [[[-117.19487199429183, 34.071745616...
1 0 2 US -117.194872 34.057237 RingBufferBands Miles Miles 3 BlockApportionment:US.BlockGroups ... 1935 1936 1999 2073 1789 1430 986 719 947 {"rings": [[[-117.19487199429183, 34.100762745...
2 0 3 US -117.194872 34.057237 RingBufferBands Miles Miles 5 BlockApportionment:US.BlockGroups ... 2375 2493 2601 2478 2008 1500 1062 780 1018 {"rings": [[[-117.19487199429183, 34.129779737...

3 rows × 50 columns

Visualize results on a map

The returned spatial dataframe can be visualized on a map as shown below:

Input
# Plot on a map
buffer_map1 = gis.map('Redlands, CA')
buffer_map1.basemap = 'dark-gray-vector'
buffer_map1
Input
drive_dist_df.spatial.plot(map_widget=buffer_map1,
               renderer_type='c',  # for class breaks renderer
               method='esriClassifyNaturalBreaks',  # classification algorithm
               class_count=4,  # choose the number of classes
               col='bufferRadii',  # numeric column to classify
               cmap='viridis',  # color map to pick colors from for each class
               alpha=0.7  # specify opacity
               )
Output
True

Buffering location using drive times

The example below creates 5 and 10 minute drive times from a street address and enriches these using the 'Age' data collection.

Input
buffered = BufferStudyArea(area='380 New York St Redlands CA 92373', 
                           radii=[5, 10], units='Minutes', 
                           travel_mode='Driving')
drive_time_df = enrich(study_areas=[buffered], data_collections=['Age'])
Input
drive_time_df
Output
ID OBJECTID sourceCountry X Y areaType bufferUnits bufferUnitsAlias bufferRadii aggregationMethod ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US -117.194872 34.057237 NetworkServiceArea Minutes Drive Time Minutes 5 BlockApportionment:US.BlockGroups ... 658 661 657 617 567 463 300 212 280 {"rings": [[[-117.19120531126502, 34.080999174...
1 0 2 US -117.194872 34.057237 NetworkServiceArea Minutes Drive Time Minutes 10 BlockApportionment:US.BlockGroups ... 3208 3232 3352 3355 2874 2259 1545 1145 1651 {"rings": [[[-117.19165446621216, 34.143207222...

2 rows × 50 columns

Visualize results on a map

The returned spatial dataframe can be visualized on a map as shown below:

Input
# Plot on a map
buffer_map2 = gis.map('Redlands, CA')
buffer_map2.basemap = 'dark-gray-vector'
buffer_map2
Input
drive_time_df.spatial.plot(map_widget=buffer_map2,
                   renderer_type='c',  # for class breaks renderer
                   method='esriClassifyNaturalBreaks',  # classification algorithm
                   class_count=3,  # choose the number of classes
                   col='bufferRadii',  # numeric column to classify
                   cmap='viridis',  # color map to pick colors from for each class
                   alpha=0.7  # specify opacity
                   )
Output
True

Enriching a named statistical area

In all previous examples of different study area types, locations were defined as either points or polygons. Study area locations can also be passed as one or many named statistical areas. This form of study area lets you define an area as a standard geographic statistical feature, such as a census or postal area, for example, to obtain enrichment information for a U.S. state, county, or ZIP Code or a Canadian province or postal code. We will explore Named statistical areas in detail in the next section.

Enriching a zip code

Enriching zip code 92373 in California using the 'Age' data collection:

Input
usa = Country.get('US')
Input
redlands = usa.subgeographies.states['California'].zip5['92373']
Input
type(redlands)
Output
arcgis.geoenrichment.enrichment.NamedArea
Input
redlands
Output
<NamedArea name:"Redlands" area_id="92373", level="US.ZIP5", country="United States">
Input
redlands_df = enrich(study_areas=[redlands], data_collections=['Age'] )
Input
redlands_df
Output
ID OBJECTID StdGeographyLevel StdGeographyName StdGeographyID sourceCountry aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US.ZIP5 Redlands 92373 US Query:US.ZIP5 2.191 2.576 1 ... 1024 1089 1113 1184 1101 970 662 475 701 {"rings": [[[-117.13603999963594, 34.032169999...

1 rows × 47 columns

Visualize results on a map

The returned spatial dataframe can be visualized on a map as shown below:

Input
zip_map = gis.map('Redlands, CA')
zip_map
Input
redlands_df.spatial.plot(zip_map)
Output
True

Enriching all counties in a state

Input
ca_counties = usa.subgeographies.states['California'].counties
Input
counties_df = enrich(study_areas=ca_counties, data_collections=['Age'])
counties_df.head()
Output
ID OBJECTID StdGeographyLevel StdGeographyName StdGeographyID sourceCountry aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US.Counties Alameda County 06001 US Query:US.Counties 2.191 2.576 1 ... 55457 54029 55299 50442 42912 34127 22548 15100 18391 {"rings": [[[-122.2716789998329, 37.9047240001...
1 1 2 US.Counties Alpine County 06003 US Query:US.Counties 2.191 2.576 1 ... 38 42 62 39 55 30 12 10 5 {"rings": [[[-119.90059599883206, 38.930759999...
2 2 3 US.Counties Amador County 06005 US Query:US.Counties 2.191 2.576 1 ... 1011 1162 1591 1813 1739 1527 968 667 733 {"rings": [[[-120.07763899965389, 38.708886999...
3 3 4 US.Counties Butte County 06007 US Query:US.Counties 2.191 2.576 1 ... 5383 5817 6860 7023 6568 5293 3614 2437 3396 {"rings": [[[-121.4046210002662, 40.1466409995...
4 4 5 US.Counties Calaveras County 06009 US Query:US.Counties 2.191 2.576 1 ... 1352 1649 2219 2371 2307 1942 1206 727 688 {"rings": [[[-120.07246000003855, 38.509156000...

5 rows × 47 columns
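
The StdGeographyID values in the table above follow the FIPS convention for U.S. counties: a two-digit state code followed by a three-digit county code, so 06001 is county 001 in state 06 (California). The sketch below splits such an ID. split_county_fips is a hypothetical helper, and the IDs are handled as strings so that leading zeros are preserved.

```python
# Hypothetical helper (not part of arcgis): split a 5-digit county FIPS ID.

def split_county_fips(std_geography_id):
    """Return (state_code, county_code) for an ID such as '06001'."""
    if len(std_geography_id) != 5 or not std_geography_id.isdigit():
        raise ValueError(f"expected a 5-digit string, got {std_geography_id!r}")
    return std_geography_id[:2], std_geography_id[2:]


# IDs copied from the StdGeographyID column above.
for county_id in ["06001", "06003", "06005", "06007", "06009"]:
    state, county = split_county_fips(county_id)
    print(f"{county_id}: state {state}, county {county}")
```

Converting these IDs to integers would turn 06001 into 6001, so it is safer to keep them as text when joining enriched results with other tables.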

Visualize results on a map
Input
county_map = gis.map('California')
county_map
Input
counties_df.spatial.plot(map_widget=county_map,
               renderer_type='c',  # for class breaks renderer
               method='esriClassifyNaturalBreaks',  # classification algorithm
               class_count=5,  # choose the number of classes
               col='FEM75',  # numeric column to classify
               cmap='viridis',  # color map to pick colors from for each class
               alpha=0.7  # specify opacity
               )
Output
True
Input
county_map.legend=True

Using comparison levels

With comparison_levels, the information for a study area can also be compared with standard geography areas at other levels. For example, if the study area is a zip code, you can compare enriched information for this zip code with information for the county or the state.

Example 1

Let's look at an example of enriching a zip code (study area) and then comparing its enrichment information with information for the county to which the zip code belongs using comparison_levels.

Input
fontana = usa.subgeographies.states['California'].zip5['92336']
Input
testdf1 = enrich(study_areas=[fontana], data_collections=['Age'], 
                 comparison_levels=['US.Counties'])
Input
testdf1.head()
Output
ID OBJECTID StdGeographyLevel StdGeographyName StdGeographyID sourceCountry aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US.ZIP5 Fontana 92336 US Query:US.ZIP5 2.191 2.576 1 ... 3458 3134 2975 2353 1777 1176 706 389 366 {"rings": [[[-117.42984999972606, 34.187269999...
1 0 2 US.Counties San Bernardino County 06071 US Query:US.Counties 2.191 2.576 1 ... 65463 64654 66066 60308 49629 36487 23911 15572 16245 None

2 rows × 47 columns

The first row in the table above shows data for the requested zip code and the second row shows the area it was compared against, at the US.Counties level. Using US.Counties as the comparison level lets us compare the enriched study area (zip code) with the county it belongs to.
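
One simple use of the comparison row is to express a zip code value as a share of the county value. The sketch below does this for the FEM45 column, using the two values from the table above. percent_share is a hypothetical helper, and the result is only meaningful as a share if the zip code lies entirely within the county, which is assumed here.

```python
# Hypothetical helper: express one count as a percentage of another.

def percent_share(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# FEM45 values copied from the two rows of the table above.
zip_fem45 = 3458      # Fontana, US.ZIP5 92336
county_fem45 = 65463  # San Bernardino County, US.Counties 06071

share = percent_share(zip_fem45, county_fem45)
print(f"Zip code 92336 holds {share:.1f}% of the county's FEM45 count")
```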

Example 2

Let's look at another example. Here, the 92373 zip code in Redlands intersects with both Riverside and San Bernardino counties in California. Hence, when using comparison_levels, both counties are returned along with the results for the named zip code. We can also add US.States to the list of comparison_levels to output results for counties as well as states.

Input
redlands = usa.subgeographies.states['California'].zip5['92373']
Input
testdf2 = enrich(study_areas=[redlands], data_collections=['Age'], 
       comparison_levels=['US.Counties', 'US.States'])
Input
testdf2.columns
Output
Index(['ID', 'OBJECTID', 'StdGeographyLevel', 'StdGeographyName',
       'StdGeographyID', 'sourceCountry', 'aggregationMethod',
       'populationToPolygonSizeRating', 'apportionmentConfidence', 'HasData',
       'MALE0', 'MALE5', 'MALE10', 'MALE15', 'MALE20', 'MALE25', 'MALE30',
       'MALE35', 'MALE40', 'MALE45', 'MALE50', 'MALE55', 'MALE60', 'MALE65',
       'MALE70', 'MALE75', 'MALE80', 'MALE85', 'FEM0', 'FEM5', 'FEM10',
       'FEM15', 'FEM20', 'FEM25', 'FEM30', 'FEM35', 'FEM40', 'FEM45', 'FEM50',
       'FEM55', 'FEM60', 'FEM65', 'FEM70', 'FEM75', 'FEM80', 'FEM85', 'SHAPE'],
      dtype='object')
Input
testdf2.head()
Output
ID OBJECTID StdGeographyLevel StdGeographyName StdGeographyID sourceCountry aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData ... FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 0 1 US.ZIP5 Redlands 92373 US Query:US.ZIP5 2.191 2.576 1 ... 1024 1089 1113 1184 1101 970 662 475 701 {"rings": [[[-117.13603999963594, 34.032169999...
1 0 2 US.Counties Riverside County 06065 US Query:US.Counties 2.191 2.576 1 ... 71840 71254 73314 68647 60386 49449 35062 23797 25463 None
2 0 3 US.Counties San Bernardino County 06071 US Query:US.Counties 2.191 2.576 1 ... 65463 64654 66066 60308 49629 36487 23911 15572 16245 None
3 0 4 US.States California 06 US Query:US.States 2.191 2.576 1 ... 1222805 1227409 1264110 1182621 1012441 803457 549654 374499 447146 None

4 rows × 47 columns

Input
testdf2.iloc[:, -20:]
Output
MALE85 FEM0 FEM5 FEM10 FEM15 FEM20 FEM25 FEM30 FEM35 FEM40 FEM45 FEM50 FEM55 FEM60 FEM65 FEM70 FEM75 FEM80 FEM85 SHAPE
0 380 877 859 940 902 1187 1190 1150 1273 1041 1024 1089 1113 1184 1101 970 662 475 701 {"rings": [[[-117.13603999963594, 34.032169999...
1 17664 85123 86025 83957 80112 80542 98617 91315 79758 71974 71840 71254 73314 68647 60386 49449 35062 23797 25463 None
2 9848 79131 78595 76288 71381 78383 95432 87077 73658 65402 65463 64654 66066 60308 49629 36487 23911 15572 16245 None
3 274381 1221927 1234947 1248664 1240920 1337702 1543578 1464688 1349773 1210247 1222805 1227409 1264110 1182621 1012441 803457 549654 374499 447146 None

To understand how the data differs between the zip code, the two counties and the state, let's plot the male to female ratio for ages 80-84 (the MALE80 and FEM80 variables).

Input
# Create a dataframe with new Male_Female_Ratio column
bar_df = testdf2.loc[:,['StdGeographyName','FEM80','MALE80']]
bar_df['Male_Female_Ratio'] = bar_df['MALE80'] / bar_df['FEM80']
bar_df
Output
StdGeographyName FEM80 MALE80 Male_Female_Ratio
0 Redlands 475 345 0.726316
1 Riverside County 23797 19288 0.810522
2 San Bernardino County 15572 11698 0.751220
3 California 374499 283158 0.756098
Input
# Plot the Male_Female_Ratio
import matplotlib.pyplot as plt
%matplotlib inline
plt.figure(figsize = (8,5))
plt.bar(x = 'StdGeographyName', height = 'Male_Female_Ratio', data = bar_df)
plt.title('Male to Female Ratio Comparison');

The plot shows only minor differences in the male to female ratio between Redlands (zip code 92373), the two counties and the state.
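
To put a number on how far each area is from the state, the ratios can be recomputed from the FEM80 and MALE80 values in the table above and compared with the California value. The sketch below uses plain Python instead of Pandas, with the counts copied by hand from that table.

```python
# (name, FEM80, MALE80) copied from the bar_df table above.
rows = [
    ("Redlands", 475, 345),
    ("Riverside County", 23797, 19288),
    ("San Bernardino County", 15572, 11698),
    ("California", 374499, 283158),
]

# Male to female ratio per area.
ratios = {name: male / female for name, female, male in rows}

# Difference between each area's ratio and the state ratio.
state_ratio = ratios["California"]
for name, ratio in ratios.items():
    print(f"{name}: ratio {ratio:.6f}, difference from state {ratio - state_ratio:+.6f}")

highest = max(ratios, key=ratios.get)
lowest = min(ratios, key=ratios.get)
print("highest:", highest, "| lowest:", lowest)
```

Riverside County has the highest ratio and Redlands the lowest, with all four values within about 0.09 of each other.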

Enriching Arbitrary Geometries

Enrichment works not only on clearly defined geometries such as county or state boundaries, but just as well on arbitrary geometries (a random polygon on a map, or an area covering parts of different counties). Let's look at an example of how an arbitrary geometry can be enriched with enrich().

In this example, we will:

  1. Draw a map of Los Angeles, CA
  2. Ask the user to draw a polygon on the map
  3. Enrich the polygon drawn by the user
  4. Visualize enriched geometry on a map

Create a Map

Input
la_map = gis.map('Los Angeles, CA')
la_map

Enable User Input

Here, we will define a callback function that enables user input. If no input is provided, a default polygon geometry will be enriched.

Input
# Define the callback function.
drawn_polygon = None
def draw_poly(la_map, g):
    global drawn_polygon
    drawn_polygon = g

# Set draw_poly as the callback function to be invoked when a polygon is drawn on the map.
# The result is not assigned to drawn_polygon, so the value set by the callback is not overwritten.
la_map.on_draw_end(draw_poly)

Now, run the cell below and then draw a polygon on la_map; finish drawing by double-clicking. If no polygon is drawn within 30 seconds, a default polygon geometry will be used for enrichment.

Input
import time
# Draw polygon
la_map.draw("polygon")

# Sleep for 30 seconds
time.sleep(30)

# Use this as the default polygon only if no polygon was drawn on the map
if drawn_polygon is None:
    drawn_polygon = {'spatialReference': {'latestWkid': 3857, 'wkid': 102100},
                     'rings': [[[-13176442.352731517, 4035051.715228523],
                                [-13167152.267973447, 4032788.462594141],
                                [-13169738.58648519, 4023675.1384639805],
                                [-13178995.82720767, 4028428.5661604665],
                                [-13176442.352731517, 4035051.715228523]]]}
Input
# Check drawn polygon
drawn_polygon
Output
{'spatialReference': {'latestWkid': 3857, 'wkid': 102100},
 'rings': [[[-13181795.930532897, 4037434.4007196007],
   [-13169038.11278067, 4034871.3716149125],
   [-13172126.646454819, 4026770.8381095314],
   [-13180793.88886522, 4029142.1774792233],
   [-13187354.932847794, 4034394.834516697],
   [-13181795.930532897, 4037434.4007196007]]]}
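
Whether the polygon comes from the user or from the default, it is worth checking that every ring is closed (its first and last vertices are the same, as in both polygons above) before building a Polygon from it. The sketch below shows one way to combine the fallback with that check. choose_polygon and is_closed_ring are hypothetical helpers that work on plain dictionaries and do not call arcgis.

```python
# Hypothetical helpers (not part of arcgis) working on plain dictionaries.

def is_closed_ring(ring):
    """A ring is closed when it has at least 4 vertices and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def choose_polygon(drawn, default):
    """Return the drawn polygon if there is one, otherwise the default."""
    candidate = drawn if drawn is not None else default
    if not all(is_closed_ring(ring) for ring in candidate["rings"]):
        raise ValueError("every ring must repeat its first vertex as its last")
    return candidate


# Default polygon copied from the cell above.
DEFAULT_POLYGON = {
    "spatialReference": {"latestWkid": 3857, "wkid": 102100},
    "rings": [[[-13176442.352731517, 4035051.715228523],
               [-13167152.267973447, 4032788.462594141],
               [-13169738.58648519, 4023675.1384639805],
               [-13178995.82720767, 4028428.5661604665],
               [-13176442.352731517, 4035051.715228523]]],
}

# Nothing was drawn, so the default is used.
selected = choose_polygon(None, DEFAULT_POLYGON)
print(len(selected["rings"][0]), "vertices in the selected ring")
```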

Enrich Drawn Geometry

Input
from arcgis.geometry import Polygon
poly = Polygon(drawn_polygon)
Input
enriched_line_df2 = enrich(study_areas=[poly], 
       analysis_variables=["Age.FEM45","Age.FEM55","Age.FEM65"])

enriched_line_df2
Output
ID OBJECTID sourceCountry aggregationMethod populationToPolygonSizeRating apportionmentConfidence HasData FEM45 FEM55 FEM65 SHAPE
0 0 1 US BlockApportionment:US.BlockGroups 2.191 2.576 1 11964 12174 9685 {"rings": [[[-118.41408756542211, 34.064264163...

Visualize Enriched Geometry

Input
# Plot on a map
poly_map2 = gis.map('Los Angeles, CA')
poly_map2

We can clearly see the enriched geometry on this map. Clicking on the geometry will display enriched features.

Input
# Plot the enriched polygon
enriched_line_df2.spatial.plot(poly_map2)
Output
True

Conclusion

In this part of the arcgis.geoenrichment module guide series, you were introduced to the concept of study areas and how GeoEnrichment uses a study area to define the location of the point, polyline or area that you want to enrich. You have also seen in detail how different types of study areas can be enriched and visualized on a map.

In the subsequent pages, you will learn about:

  1. Exploring Named Statistical Areas (explains where to enrich continued)
  2. Data Collections and GeoEnrichment coverage (explains what datasets/variables to enrich with)
  3. Generating Reports
  4. Standard Geography Queries
