
ArcGIS API for Python


Data Summarization - Construction permits, part 2/2

In the "Explore and analyze construction permits" notebook, we explored the data and learned a little about the spatial and temporal trends of permit activity in Montgomery County. In this lesson, we'll move beyond exploration and run spatial analysis tools to answer specific questions that exploring the data alone can't answer. In particular, we want to know why permits spiked in Germantown in 2011 and to predict where future permit spikes (and, by extension, future growth) are likely to occur.

First, we'll aggregate the points by ZIP Code. We'll then enrich each ZIP Code with demographic information to learn more about the demographic conditions that led to such rapid growth in such a short time. Once we determine why growth occurred where and when it did, we'll locate other ZIP Codes with similar demographic characteristics to predict future growth.

Aggregate points

In [1]:
from arcgis import GIS
In [2]:
gis = GIS('home')
In [3]:
data = gis.content.search("Commercial_Permits_since_2010 owner:api_data_owner",
                          'Feature layer',
                          outside_org=True)
data[0]
Out[3]:
Commercial_Permits_since_2010
test data
Feature Layer Collection by api_data_owner
Last Modified: July 01, 2019
0 comments, 1 views
In [4]:
permits = data[0]
permit_layer = permits.layers[0]
In [5]:
zip_code = gis.content.search('title:ZIP Code Boundaries 2017 owner:esri_dm',
                              'Feature layer',
                              outside_org=True)
zip_code[0]
Out[5]:
United States ZIP Code Boundaries 2017
This layer shows the ZIP Code level boundaries of United States in 2017. The boundaries are optimized to improve Data Enrichment analysis performance.
Feature Layer Collection by esri_dm
Last Modified: June 21, 2019
0 comments, 69,632 views
In [6]:
zip_item = zip_code[0]
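
Both searches above take the first hit with `[0]`, which raises a bare `IndexError` if the query matches nothing. The sketch below shows one way to guard that step. `first_result` is a hypothetical helper, not part of the ArcGIS API for Python, and plain lists stand in for the results of `gis.content.search(...)`.

```python
def first_result(results, description):
    """Return the first search hit, or raise a readable error if there are none."""
    if not results:
        raise LookupError(f"No items found for {description!r}")
    return results[0]


# Plain lists stand in for search results here (an assumption for illustration).
print(first_result(["Commercial_Permits_since_2010"], "permits layer"))

try:
    first_result([], "ZIP Code boundaries")
except LookupError as err:
    print(err)
```

In the notebook, a call such as `first_result(zip_code, "ZIP Code boundaries")` could replace `zip_code[0]`.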

The search returns the ZIP Code boundaries item. Because the item is a feature layer collection, its layers property gives us a list of layers.

In [7]:
for lyr in zip_item.layers:
    print(lyr.properties.name)
USA_Country
USA_State
USA_County
USA_ZipCode
USA_Tract
USA_BlockGroup
In [8]:
zip_code_layer = zip_item.layers[3]
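
Picking the layer with the hard-coded index 3 depends on the layer order staying the same. A minimal sketch of selecting it by name instead follows. `layer_by_name` is a hypothetical helper, and `SimpleNamespace` objects stand in for real layers, on the assumption that each layer exposes `properties.name` as in the loop above.

```python
from types import SimpleNamespace


def layer_by_name(layers, name):
    """Return the first layer whose properties.name equals name."""
    for lyr in layers:
        if lyr.properties.name == name:
            return lyr
    raise LookupError(f"No layer named {name!r}")


# Stand-in layers that mimic the names printed above (not real ArcGIS objects).
names = ["USA_Country", "USA_State", "USA_County",
         "USA_ZipCode", "USA_Tract", "USA_BlockGroup"]
fake_layers = [SimpleNamespace(properties=SimpleNamespace(name=n)) for n in names]

zip_layer = layer_by_name(fake_layers, "USA_ZipCode")
print(zip_layer.properties.name)
```

With the real item, `layer_by_name(zip_item.layers, "USA_ZipCode")` would play the role of `zip_item.layers[3]`.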

Next, you'll use this layer to aggregate permit points. The call below passes the permits as the point layer to be aggregated and the ZIP Codes as the area layer, and the resulting layer is styled by permit count. Setting keep_boundaries_with_no_points to False leaves out ZIP Codes that contain no permits.

In [9]:
from arcgis.features.summarize_data import aggregate_points
from datetime import datetime as dt
In [10]:
permit_agg_by_zip = aggregate_points(permit_layer, zip_code_layer, 
                                     keep_boundaries_with_no_points=False,
                                     output_name='zipcode_aggregate' + str(dt.now().microsecond))
In [11]:
permit_agg_by_zip
Out[11]:
zipcode_aggregate760079
Feature Layer Collection by arcgis_python
Last Modified: April 13, 2020
0 comments, 1 views
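
The `output_name` above is made unique by appending `dt.now().microsecond`, which only ranges from 0 to 999999, so a repeated run could in principle reuse a name. As a hedged alternative, the sketch below builds the suffix from a random UUID. `unique_output_name` is a hypothetical helper, not an ArcGIS function.

```python
import uuid


def unique_output_name(prefix):
    """Append an 8-character random hex suffix to prefix."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


name_a = unique_output_name("zipcode_aggregate")
name_b = unique_output_name("zipcode_aggregate")
print(name_a)
print(name_b)
```

The result could then be passed as `output_name=unique_output_name('zipcode_aggregate')`.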

Aggregation results

In [12]:
agg_map = gis.map('Maryland')
agg_map
Out[12]:
In [13]:
agg_map.add_layer(permit_agg_by_zip)

The new layer looks like a point layer, but it's actually a polygon layer with a point symbology. Each point represents the number of permits per ZIP Code area. Larger points indicate ZIP Codes with more permits.
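
Conceptually, the Point_Count attribute is the number of permit points that fall inside each ZIP Code polygon. The sketch below illustrates that idea with a plain ray-casting point-in-polygon test on made-up planar coordinates. It is not how the hosted Aggregate Points tool is implemented, and all names and geometries are assumptions for illustration.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_per_polygon(points, polygons):
    """Count points per polygon, assuming the polygons do not overlap."""
    counts = {name: 0 for name in polygons}
    for x, y in points:
        for name, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[name] += 1
                break
    return counts


# Made-up squares and points, not real ZIP Codes or permits.
polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "C": [(4, 0), (6, 0), (6, 2), (4, 2)],
}
points = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0), (5.0, 5.0)]

counts = count_points_per_polygon(points, polygons)
# Dropping empty polygons mirrors keep_boundaries_with_no_points=False.
non_empty = {name: c for name, c in counts.items() if c > 0}
print(counts)
print(non_empty)
```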

In [14]:
import pandas as pd
In [15]:
sdf = pd.DataFrame.spatial.from_layer(permit_agg_by_zip.layers[0])
In [16]:
sdf.head(10)
Out[16]:
   AnalysisArea  OBJECTID  POPULATION  PO_NAME      Point_Count  SHAPE                                               SQMI  STATE  ZIP_CODE
0      2.356814         1       14652  Washington             6  {"rings": [[[-77.0266270001359, 38.98455799965...   2.36  DC     20012
1      6.248092         2       48592  Hyattsville            1  {"rings": [[[-76.9414389999099, 39.02912599996...   6.26  MD     20783
2      0.191061         3         219  Glen Echo              1  {"rings": [[[-77.1384300001444, 38.96841399980...   0.19  MD     20812
3      5.168331         4       30017  Bethesda            1145  {"rings": [[[-77.0943629995527, 39.02250799964...   5.17  MD     20814
4      5.360828         5       30001  Chevy Chase          586  {"rings": [[[-77.0635971995511, 39.01197539974...   5.35  MD     20815
5      4.607703         6       16967  Bethesda             154  {"rings": [[[-77.1429960002652, 38.97162000016...   4.61  MD     20816
6     13.889993         7       38385  Bethesda             732  {"rings": [[[-77.1267290001432, 39.02947299977...  13.89  MD     20817
7      0.978536         8        1383  Cabin John            13  {"rings": [[[-77.1573989998639, 38.98250600035...   0.98  MD     20818
8      9.426954         9       26858  Olney                216  {"rings": [[[-77.0921479999302, 39.16957599993...   9.43  MD     20832
9     22.871336        10        8380  Brookeville           40  {"rings": [[[-77.0616859999089, 39.27760500037...  22.87  MD     20833
In [17]:
sdf.reset_index(inplace=True)
In [18]:
sdf.head()
Out[18]:
   index  AnalysisArea  OBJECTID  POPULATION  PO_NAME      Point_Count  SHAPE                                               SQMI  STATE  ZIP_CODE
0      0      2.356814         1       14652  Washington             6  {"rings": [[[-77.0266270001359, 38.98455799965...   2.36  DC     20012
1      1      6.248092         2       48592  Hyattsville            1  {"rings": [[[-76.9414389999099, 39.02912599996...   6.26  MD     20783
2      2      0.191061         3         219  Glen Echo              1  {"rings": [[[-77.1384300001444, 38.96841399980...   0.19  MD     20812
3      3      5.168331         4       30017  Bethesda            1145  {"rings": [[[-77.0943629995527, 39.02250799964...   5.17  MD     20814
4      4      5.360828         5       30001  Chevy Chase          586  {"rings": [[[-77.0635971995511, 39.01197539974...   5.35  MD     20815

Review some basic statistics about the data.

In [19]:
sdf['Point_Count'].mean()
Out[19]:
249.42222222222222
In [20]:
sdf['Point_Count'].max()
Out[20]:
1145
In [21]:
sdf['Point_Count'].min()
Out[21]:
1
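
The three calls above can be gathered into one summary, and adding the median shows how skewed the counts are. The sketch below uses only the ten Point_Count values visible in the `sdf.head(10)` output, so its mean will not match the 249.42 computed over the full table. On the real DataFrame, `sdf['Point_Count'].describe()` gives a similar overview.

```python
import statistics

# Point_Count values from the ten rows shown by sdf.head(10), not the full table.
point_counts = [6, 1, 1, 1145, 586, 154, 732, 13, 216, 40]

summary = {
    "mean": statistics.mean(point_counts),
    "median": statistics.median(point_counts),
    "min": min(point_counts),
    "max": max(point_counts),
}
print(summary)
```

For these ten rows the mean (289.4) is about three times the median (97), which reflects a few ZIP Codes with very high permit counts alongside many with low counts.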
In [22]:
agg_layer = permit_agg_by_zip.layers[0]

Although most of the large point symbols on the map are in the southeast corner, near Washington, D.C., there are a few large points in the northwest. In particular, there is a very large circle in the ZIP Code located in Clarksburg. (If you're using different ZIP Code data, this area may be identified as ZIP Code 20871 instead.) This ZIP Code has 948 permits. Additionally, this area geographically corresponds to the hot spot you identified in the previous lesson. This ZIP Code is one that you'll focus on when you enrich your layer with demographic data.
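
The large circles can also be found without the map by ranking ZIP Codes by Point_Count. The sketch below does this with the standard library on the ten rows visible in the `sdf.head(10)` output, so Clarksburg does not appear in it. On the full DataFrame, `sdf.nlargest(3, 'Point_Count')` would be the pandas equivalent.

```python
import heapq

# (ZIP_CODE, PO_NAME, Point_Count) copied from the ten rows shown earlier.
rows = [
    ("20012", "Washington", 6),
    ("20783", "Hyattsville", 1),
    ("20812", "Glen Echo", 1),
    ("20814", "Bethesda", 1145),
    ("20815", "Chevy Chase", 586),
    ("20816", "Bethesda", 154),
    ("20817", "Bethesda", 732),
    ("20818", "Cabin John", 13),
    ("20832", "Olney", 216),
    ("20833", "Brookeville", 40),
]

# Keep the three rows with the highest permit counts, largest first.
top_three = heapq.nlargest(3, rows, key=lambda row: row[2])
for zip_code, po_name, count in top_three:
    print(zip_code, po_name, count)
```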

Enrich the data

Are there demographic characteristics of the Clarksburg ZIP Code that contributed to its high growth? If so, are there other areas with those characteristics that may experience growth in the future? To answer these questions, you'll use the Enrich Data analysis tool. This tool adds demographic attributes of your choice to your data. Specifically, you'll add Tapestry information to each ZIP Code. Tapestry is a summary of many demographic and socioeconomic variables, including age groups and lifestyle choices. It will teach you more about the types of people who live in your area of interest and help you better understand why growth happened where it did.

In [23]:
from arcgis.features.enrich_data import enrich_layer
In [24]:
enrich_aggregate = enrich_layer(agg_layer, 
                                analysis_variables=["AtRisk.TSEGNAME"],
                                output_name="added_tapestry_var" + str(dt.now().microsecond))
In [25]:
enrich_aggregate
Out[25]:
added_tapestry_var285913
Feature Layer Collection by arcgis_python
Last Modified: April 13, 2020
0 comments, 0 views
In [26]:
agg_lyr = enrich_aggregate.layers[0]
In [27]:
sdf = pd.DataFrame.spatial.from_layer(agg_lyr)
In [28]:
sdf.head()
Out[28]:
AnalysisArea ENRICH_FID HasData ID OBJECTID POPULATION PO_NAME Point_Count SHAPE SQMI STATE TSEGNAME ZIP_CODE aggregationMethod apportionmentConfidence populationToPolygonSizeRating sourceCountry
0 2.356814 1 1 0 1 14652 Washington 6 {"rings": [[[-77.026627, 38.9845580000001], [-... 2.36 DC City Lights 20012 BlockApportionment:US.BlockGroups 2.576 2.191 US
1 6.248092 2 1 1 2 48592 Hyattsville 1 {"rings": [[[-76.9414389999999, 39.02912600000... 6.26 MD NeWest Residents 20783 BlockApportionment:US.BlockGroups 2.576 2.191 US
2 0.191061 3 1 2 3 219 Glen Echo 1 {"rings": [[[-77.13843, 38.9684140000001], [-7... 0.19 MD Urban Chic 20812 BlockApportionment:US.BlockGroups 2.576 2.191 US
3 5.168331 4 1 3 4 30017 Bethesda 1145 {"rings": [[[-77.094363, 39.0225080000001], [-... 5.17 MD Metro Renters 20814 BlockApportionment:US.BlockGroups 2.576 2.191 US
4 5.360828 5 1 4 5 30001 Chevy Chase 586 {"rings": [[[-77.0635971999999, 39.0119754], [... 5.35 MD Top Tier 20815 BlockApportionment:US.BlockGroups 2.576 2.191 US
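
With the TSEGNAME column in place, one way to look for ZIP Codes that resemble a ZIP Code of interest is to group them by Tapestry segment. The sketch below is a minimal illustration using only the five rows shown above. The helper names are hypothetical, and with these five rows every segment is distinct, so no matches are returned.

```python
from collections import defaultdict

# (ZIP_CODE, PO_NAME, TSEGNAME) copied from the five rows shown above.
enriched_rows = [
    ("20012", "Washington", "City Lights"),
    ("20783", "Hyattsville", "NeWest Residents"),
    ("20812", "Glen Echo", "Urban Chic"),
    ("20814", "Bethesda", "Metro Renters"),
    ("20815", "Chevy Chase", "Top Tier"),
]


def zips_by_segment(rows):
    """Map each Tapestry segment name to the ZIP Codes assigned to it."""
    groups = defaultdict(list)
    for zip_code, _po_name, segment in rows:
        groups[segment].append(zip_code)
    return dict(groups)


def zips_sharing_segment(rows, target_zip):
    """Return the other ZIP Codes that have the same segment as target_zip."""
    segment_of = {zip_code: segment for zip_code, _po_name, segment in rows}
    target_segment = segment_of[target_zip]
    return [z for z, s in segment_of.items()
            if s == target_segment and z != target_zip]


groups = zips_by_segment(enriched_rows)
matches = zips_sharing_segment(enriched_rows, "20814")
print(groups)
print(matches)
```

Run against every row of the enriched layer, the same grouping would list the ZIP Codes that share a segment with the high-growth ZIP Code.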
In [29]:
enrich_aggregate_map = gis.map('Maryland')
In [30]:
enrich_aggregate_map
Out[30]: