Table of Contents
- Necessary Imports
- Accessing the Airbnb data as of 2019 and the NYC tracts dataset
- Visualizing dataset
- Aggregating Airbnb count by Tracts for NYC
- Importing demographic data using geoenrichment service
- Estimating distances of tracts from various city features
- Importing Borough Info for each Tract
- Merging all the above estimated data set of features
- Adding census data 2019 obtained using data enrich tool
- Model Building
- Running cross validation
- Result Visualization
- Data resources
- Summary of methods used
Airbnb properties across cities are a great alternative for travellers looking for comparatively cheaper accommodation. They also give homeowners an opportunity to use spare or unused rooms as an additional source of income. However, in recent times the alarming spread of Airbnb properties has become a topic of debate among the public and city authorities across the world.
With this in mind, this sample notebook studies the factors that are fuelling the widespread growth in the number of Airbnb listings. These might include location characteristics of the neighbourhoods concerned (in this case, NYC census tracts) as well as qualitative information about the people living in them. The goal is to help city planners deal with the negative externalities of the Airbnb phenomenon (and similar short-term rentals) by making informed decisions when framing suitable policies.
The primary data is downloaded from the Airbnb website for the city of New York. Other data includes 2019 and 2017 census data obtained using Esri's enrichment services, and various other datasets from the NYCOpenData portal.
%matplotlib inline
import matplotlib.pyplot as plt
from datetime import datetime
import pandas as pd
import numpy as np
from IPython.display import display, HTML
from IPython.core.pylabtools import figsize
import seaborn as sns

# Machine Learning models
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
import sklearn.metrics as metrics
from sklearn import preprocessing

# Arcgis api imports
import arcgis
from arcgis.geoenrichment import Country
from arcgis.features import summarize_data
from arcgis.features.enrich_data import enrich_layer
from arcgis.features import SpatialDataFrame
from arcgis.features import use_proximity
from arcgis.gis import GIS
gis = GIS('home')
Access the NYC Airbnb and Tracts dataset
Airbnb Data - Contains information about 48,000 Airbnb properties available in New York as of 2019. This includes the location of each property, its neighbourhood characteristics and available transit facilities, information about the owner, details of the room (including the number of bedrooms), and the rental price per night.
NYC Tracts - A polygon shapefile consisting of 2,167 New York City tracts, including the area of each tract along with a unique ID for each tract.
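# Accessing NYCTracts
# gis.content.search returns a list of items, so take the first match
nyc_tract_full = gis.content.search('NYCTractData owner:api_data_owner', 'feature layer')[0]
nyc_tract_full

# Use the first layer of the item
nyc_tracts_layer = nyc_tract_full.layers[0]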
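# Accessing airbnb NYC
# gis.content.search returns a list of items, so take the first match
airbnb_nyc2019 = gis.content.search('AnBNYC2019 owner:api_data_owner', 'feature layer')[0]
airbnb_nyc2019

# Use the first layer of the item
airbnb_layer = airbnb_nyc2019.layers[0]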
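Since `gis.content.search` returns a list of items rather than a single item, it can be useful to guard against an empty result before reaching for a layer. The following is a minimal sketch of such a guard, not part of the original notebook: the helper name `first_layer` is hypothetical, and a `SimpleNamespace` object stands in for a real search result so that the example runs without a GIS connection.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def first_layer(search_results: Sequence[Any], layer_index: int = 0) -> Any:
    """Return one layer from the first item in a content-search result list.

    Hypothetical helper: it only assumes each item exposes a `layers` list.
    """
    if not search_results:
        raise LookupError("search returned no items; check the query and owner")
    item = search_results[0]
    layers = getattr(item, "layers", None) or []
    if layer_index >= len(layers):
        raise LookupError(
            f"item has {len(layers)} layer(s); index {layer_index} is out of range"
        )
    return layers[layer_index]


# Stand-in for a search result (assumption: real items expose `.layers`).
fake_item = SimpleNamespace(layers=["tracts_layer"])
layer = first_layer([fake_item])
```

With a live connection, the same helper could be called as `first_layer(gis.content.search('AnBNYC2019 owner:api_data_owner', 'feature layer'))`, which fails with a readable message if the item is not found.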
# NYC Tracts
m1 = gis.map('New York City')
m1.add_layer(nyc_tracts_layer)
m1