Which college district has the fewest low-income families?

A local cable operator in the county ran a pilot program that provided low-cost computers and Internet access to low-income families with children in high school. School performance improved markedly for these students, and the program brought the company positive publicity and goodwill in the community.

Company officials now want to set up a similar program for community college students. The company provides Internet access to the five community college districts in the county, and officials are aware that the colleges are under a lot of pressure: they face funding cuts at the same time as increased demand for enrollment. To improve the situation, the colleges are turning more and more to distance learning, primarily via the Internet. By providing computers and Internet access, the cable company can enable more low-income students to take advantage of online classes.

This case study uses the ArcGIS API for Python to find the community college district with the fewest low-income families, as a candidate location for the new program.

We will use the summarize_within tool to get the number of low-income families within each community college district. We will also visualize the result using the map widget.

Connect to your ArcGIS Online organization

We first establish a connection to our organization, which can be either an ArcGIS Online organization or an ArcGIS Enterprise deployment. To run the code with the ArcGIS API for Python, we need to provide the credentials of a user within that organization.

from arcgis.gis import GIS
import pandas as pd

Please sign in to your organization before running the rest of this notebook.

gis = GIS('home')

Get data for analysis

san_diego_data = gis.content.search('title:CommunityCollege_CensusTracts owner:api_data_owner', 
                                 'Feature layer',
                                  outside_org=True)
san_diego_data
[<Item title:"CommunityCollege_CensusTracts" type:Feature Layer Collection owner:api_data_owner>]
from IPython.display import display

for item in san_diego_data:
    display(item)
CommunityCollege_CensusTracts
Feature Layer Collection by api_data_owner
Last Modified: April 11, 2020
0 comments, 89 views
san_diego_item = san_diego_data[0] # get first item from the list of items
for lyr in san_diego_item.layers:
    print(lyr.properties.name)
census_tract_income
Community_College_Dist

Since the item is a Feature Layer Collection, accessing the layers property will give us a list of Feature Layers.

census_tract_income = san_diego_item.layers[0]
community_college_dist = san_diego_item.layers[1] 
m1 = gis.map('San Diego')
m1
m1.add_layer(community_college_dist)
m2 = gis.map('San Diego')
m2
m2.add_layer(census_tract_income)

Find the community college district with the fewest low-income families

Convert the layer into a pandas DataFrame so that we can calculate the number of households in each tract with income less than $30,000.

sdf = pd.DataFrame.spatial.from_layer(census_tract_income)
sdf.columns
Index(['FID', 'TRACT', 'INCOME_ALL', 'INCOME_LES', 'INCOME_10K', 'INCOME_15K',
       'INCOME_20K', 'INCOME_25K', 'INCOME_30K', 'INCOME_35K', 'INCOME_40K',
       'INCOME_45K', 'INCOME_50K', 'INCOME_60K', 'INCOME_75K', 'INCOME_100',
       'INCOME_125', 'INCOME_150', 'INCOME_200', 'Shape__Area',
       'Shape__Length', 'SHAPE'],
      dtype='object')
sdf.head()
   FID  TRACT  INCOME_ALL  INCOME_LES  INCOME_10K  INCOME_15K  INCOME_20K  INCOME_25K  INCOME_30K  INCOME_35K  ...  SHAPE
0    1   7700        4148         243         205         158         195         229         279         278  ...  {"rings": [[[-13051046.6746253, 3866695.333166...
1    2   7800        2510         294         132         180         160         135         250         116  ...  {"rings": [[[-13049196.649225, 3869830.7042951...
2    3   7901        2953         240         156         154         191         209         233         168  ...  {"rings": [[[-13051806.5792234, 3868598.509832...
3    4   7903        2429         154         163         184         174         171         139         195  ...  {"rings": [[[-13050375.5212048, 3868973.977334...
4    5   7904        3157         335         219         187         208         218         199         188  ...  {"rings": [[[-13050786.6266337, 3868042.625540...

5 rows × 22 columns

The census tract layer contains the number of households in each of several income categories, such as less than $10,000, $10,000 to $15,000, $15,000 to $20,000, and so on.

The aim of the project is to provide support to families with an annual income less than $30,000.

We will add a field to the census tract dataframe and sum the number of households in each tract with income less than $30,000.

sdf['income_lt_30k'] = sdf['INCOME_LES'] + sdf['INCOME_10K'] + sdf['INCOME_15K'] + sdf['INCOME_20K'] + sdf['INCOME_25K']
sdf.income_lt_30k.head()
0    1030
1     901
2     950
3     846
4    1167
Name: income_lt_30k, dtype: Int32
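
The same total can be computed by naming the low-income brackets once and summing across them. The sketch below is a minimal illustration on a small stand-in DataFrame, not the notebook's own cell: the rows are copied from the preview above, `tracts` and `low_income_cols` are names introduced here, and it assumes that INCOME_25K is the bracket covering $25,000 to $30,000.

```python
import pandas as pd

# Stand-in for the spatially enabled DataFrame: the first five tracts,
# with values copied from the preview above.
tracts = pd.DataFrame({
    "TRACT": [7700, 7800, 7901, 7903, 7904],
    "INCOME_LES": [243, 294, 240, 154, 335],
    "INCOME_10K": [205, 132, 156, 163, 219],
    "INCOME_15K": [158, 180, 154, 184, 187],
    "INCOME_20K": [195, 160, 191, 174, 208],
    "INCOME_25K": [229, 135, 209, 171, 218],
})

# Brackets below $30,000 (assumption: INCOME_25K covers $25,000 to $30,000).
low_income_cols = ["INCOME_LES", "INCOME_10K", "INCOME_15K",
                   "INCOME_20K", "INCOME_25K"]

# Row-wise sum across the selected bracket columns.
tracts["income_lt_30k"] = tracts[low_income_cols].sum(axis=1)
print(tracts["income_lt_30k"].tolist())  # [1030, 901, 950, 846, 1167]
```

Keeping the bracket names in a list makes it easy to move the income threshold later by adding or removing a column name.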
sdf.head()
   FID  TRACT  INCOME_ALL  INCOME_LES  INCOME_10K  INCOME_15K  INCOME_20K  INCOME_25K  INCOME_30K  INCOME_35K  ...  SHAPE                                              income_lt_30k
0    1   7700        4148         243         205         158         195         229         279         278  ...  {"rings": [[[-13051046.6746253, 3866695.333166...           1030
1    2   7800        2510         294         132         180         160         135         250         116  ...  {"rings": [[[-13049196.649225, 3869830.7042951...            901
2    3   7901        2953         240         156         154         191         209         233         168  ...  {"rings": [[[-13051806.5792234, 3868598.509832...            950
3    4   7903        2429         154         163         184         174         171         139         195  ...  {"rings": [[[-13050375.5212048, 3868973.977334...            846
4    5   7904        3157         335         219         187         208         218         199         188  ...  {"rings": [[[-13050786.6266337, 3868042.625540...           1167

5 rows × 23 columns

sdf.shape
(605, 23)

We will import the spatially enabled dataframe back into the GIS and create a feature layer.

census_tract = gis.content.import_data(sdf,
                                       title='CensusTract',
                                       tags='datascience')
census_tract
CensusTract
Feature Layer Collection by arcgis_python
Last Modified: April 18, 2023
0 comments, 0 views

Get the number of low-income households in each district

We will summarize census tracts by community college districts to find the total number of low-income households in each district. If a tract falls in two or more districts, the value for that tract will be split proportionally between the districts (based on the area of the tract in each district).
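
To make the proportional split concrete, here is a minimal sketch of area-based apportionment in plain Python. It only illustrates the arithmetic described above and is not the tool's implementation; the function name `apportion_by_area`, the district labels, and the tract values are all hypothetical.

```python
def apportion_by_area(value, overlap_areas):
    """Split `value` across districts in proportion to the tract area in each."""
    total_area = sum(overlap_areas.values())
    return {district: value * area / total_area
            for district, area in overlap_areas.items()}

# Hypothetical tract: 1,000 low-income households, with 3 square miles
# in District A and 1 square mile in District B.
shares = apportion_by_area(1000, {"District A": 3.0, "District B": 1.0})
print(shares)  # {'District A': 750.0, 'District B': 250.0}
```

Apportioned values need not be whole numbers, which is consistent with the fractional household sums that appear in the results further down.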

from arcgis.features.summarize_data import summarize_within
from datetime import datetime as dt
tracts_within_boundary = summarize_within(community_college_dist,
                                          census_tract,
                                          summary_fields=["income_lt_ SUM"],
                                          shape_units='SquareMiles',
                                          output_name='TractsWithinBoundary' + str(dt.now().microsecond))
{"cost": 0.61}
tracts_within_boundary
TractsWithinBoundary560119
Feature Layer Collection by jyaist_geosaurus
Last Modified: July 27, 2023
0 comments, 0 views
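
Note that the summary field in the call above is written as income_lt_ rather than income_lt_30k, and the results accordingly contain a column named sum_income_lt_. A likely explanation, stated here as an assumption rather than documented behavior, is that the imported layer kept only the first 10 characters of the field name. The sketch below shows how the summary field string would follow from that assumption; `FIELD_NAME_LIMIT` is a name introduced for illustration.

```python
# Assumption: field names are cut to 10 characters when the DataFrame is imported.
FIELD_NAME_LIMIT = 10

original_name = "income_lt_30k"
truncated_name = original_name[:FIELD_NAME_LIMIT]

# summarize_within expects "<field name> <statistic>" entries in summary_fields.
summary_field = f"{truncated_name} SUM"
print(summary_field)  # income_lt_ SUM
```

In practice it is safer to inspect the field names of the imported layer before building the summary_fields list than to rely on this rule.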
m3 = gis.map('San Diego')
m3
m3.add_layer(tracts_within_boundary)

The map displays the community college districts returned by summarize_within; each district now carries the total number of households with income less than $30,000 per year.

tracts_within_boundary_lyr = tracts_within_boundary.layers[0]
sdf = pd.DataFrame.spatial.from_layer(tracts_within_boundary_lyr)
sdf.columns
Index(['OBJECTID_1', 'OBJECTID', 'DISTRICT', 'Shape_Leng', 'sum_income_lt_',
       'sum_Area_SquareMiles', 'Polygon_Count', 'AnalysisArea', 'SHAPE'],
      dtype='object')
sdf.sort_values(['sum_income_lt_'], inplace=True)
sdf.head()
   OBJECTID_1  OBJECTID  DISTRICT                                 Shape_Leng  sum_income_lt_  sum_Area_SquareMiles  Polygon_Count  AnalysisArea  SHAPE
1           2         3  MIRA COSTA COMMUNITY COLLEGE          529254.242323    28286.961822            179.904967             87    180.057307  {"rings": [[[-13069560.2323, 3941041.6565], [-...
0           1         5  SOUTHWESTERN COMMUNITY COLLEGE        484545.196366    40860.319742            171.085801            111     171.34353  {"rings": [[[-13045570.4191, 3857253.9374], [-...
3           4         1  GROSSMONT-CUYAMACA COMMUNITY COLLEGE  962386.012704    48778.127326           1137.093733            118   1137.329793  {"rings": [[[-13000869.2404, 3890488.4302], [-...
4           5         2  PALOMAR COMMUNITY COLLEGE             1538204.61099    56548.178816           2554.695555            152    2554.78782  {"rings": [[[-13078663.7712, 3962536.5573], [-...
2           3         4  SAN DIEGO COMMUNITY COLLEGE           608636.549263   127840.876015            217.566587            253      217.5859  {"rings": [[[-13039483.6168, 3888486.6244], [-...
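
The ranking can also be checked without reading the table by eye. The following sketch uses plain Python on the five sum_income_lt_ values copied from the table above; the dictionary and variable names are introduced here for illustration.

```python
# Summed low-income households per district, copied from the table above.
low_income_by_district = {
    "MIRA COSTA COMMUNITY COLLEGE": 28286.961822,
    "SOUTHWESTERN COMMUNITY COLLEGE": 40860.319742,
    "GROSSMONT-CUYAMACA COMMUNITY COLLEGE": 48778.127326,
    "PALOMAR COMMUNITY COLLEGE": 56548.178816,
    "SAN DIEGO COMMUNITY COLLEGE": 127840.876015,
}

# Sort ascending by household count and take the two lowest districts.
ranked = sorted(low_income_by_district.items(), key=lambda item: item[1])
(lowest, lowest_count), (runner_up, runner_up_count) = ranked[0], ranked[1]
gap = runner_up_count - lowest_count

print(f"{lowest}: {lowest_count:,.0f} households")
print(f"Next lowest is {runner_up}, {gap:,.0f} households higher")
```

Reporting the gap to the next-lowest district alongside the minimum shows how clearly the lowest district stands apart.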

Visualize the district with the fewest low-income households

m4 = gis.map('San Diego')
m4
m4.add_layer(tracts_within_boundary, {"renderer":"ClassedSizeRenderer",
                                      "field_name": "sum_income_lt_"})

The Mira Costa district has the fewest low-income households, roughly 28,300, compared with roughly 40,900 in the next-lowest district, Southwestern. That's where the pilot program could be set up.

Conclusion

We have located the district with the fewest low-income families. We can assess the success of the project over the next six months and then give recommendations on expanding the program to other areas in the county.
