- Importing libraries
- Connecting to your GIS
- Accessing & visualizing the dataset
- Time series data preprocessing
- Time series model building
- Air temperature forecast & validation
- Summary of methods used
- Data resources
A rise in air temperature is directly correlated with global warming and changing climatic conditions, and air temperature is one of the main factors used in predicting other meteorological variables, such as streamflow, evapotranspiration, and solar radiation. Accurate forecasting of this variable is therefore vital for mitigating environmental and economic damage. Accounting for the dependency of air temperature on other variables, such as wind speed or precipitation, helps in deriving more precise predictions. In this study, the deep learning TimeSeriesModel from arcgis.learn is used to predict monthly air temperature for two years at a ground station at Fresno Yosemite International Airport in California, USA. The dataset ranges from 1948 to 2015. Data from January 2014 to November 2015 is used to validate the quality of the forecast.
Univariate time series modeling is one of the more popular applications of time series analysis. This study instead uses multivariate time series analysis, which is somewhat more involved because the dataset contains more than one time-dependent variable. The TimeSeriesModel from arcgis.learn includes backbones such as InceptionTime, ResCNN, ResNet, and FCN, which do not need fine-tuning of multiple hyperparameters before fitting the model. Here is the schematic flow chart of the methodology:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import autocorrelation_plot as aplot
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
import sklearn.metrics as metrics
from arcgis.gis import GIS
from arcgis.learn import TimeSeriesModel, prepare_tabulardata
from arcgis.features import FeatureLayer, FeatureLayerCollection
gis = GIS('home')
The data used in this sample study is a multivariate monthly time series dataset recorded at a ground station at Fresno Yosemite International Airport, California, USA. It ranges from January 1948 to November 2015.
# Location of the ground station
location = gis.map(location="Fresno Yosemite International California", zoomlevel=12)
location
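To make the time span concrete, the following is a minimal sketch of how a monthly series covering January 1948 to November 2015 could be split so that January 2014 onward is held out for validation. It uses only pandas and NumPy with synthetic placeholder values. The column names `air_temp` and `wind_speed` and the variable names are assumptions for illustration, not fields of the actual dataset, and this is not necessarily how the study performs the split.

```python
import numpy as np
import pandas as pd

# Monthly timestamps (month start) from January 1948 through November 2015.
months = pd.date_range(start="1948-01-01", end="2015-11-01", freq="MS")

# Synthetic placeholder values only; NOT the Fresno station measurements.
# Column names are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "air_temp": rng.normal(size=len(months)),
        "wind_speed": rng.normal(size=len(months)),
    },
    index=months,
)

# Hold out January 2014 onward for validating the forecast.
validation_start = pd.Timestamp("2014-01-01")
train = df.loc[df.index < validation_start]
validation = df.loc[df.index >= validation_start]

print(len(df), len(train), len(validation))
```

With these dates the frame has 815 monthly rows, of which 792 fall before January 2014 and 23 form the validation window. Splitting by date rather than at random keeps the validation months strictly after the training months, which is what a forecast check needs.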