Table of Contents
- Connecting to ArcGIS
- Accessing & Visualizing the datasets
- Model Building
- 1 — FullyConnectedNetwork
- 2 — MLModel
- Summary of methods used
- Data resources
Recently, there has been a great emphasis on reducing the carbon footprint of our cities by moving away from fossil fuels to renewable energy sources. City governments across the world, in this case the City of Calgary, Canada, are leading this change by becoming energy independent through solar power plants, either implemented on rooftops or within city utility sites.
This notebook aims to support this move to renewable solar energy by predicting the annual solar energy likely to be generated by a solar power station from local weather information and site characteristics. The hypothesis is that weather parameters such as temperature, wind speed, vapor pressure, solar radiation, day length, precipitation, and snowfall, along with the altitude of a solar power station site, affect the daily generation of solar energy.
Accordingly, these variables are used to train a model on the actual energy generated by solar power stations located in Calgary. The trained model is then used to predict solar generation at potential solar power plant sites. Beyond the weather and altitude variables, the total energy generated by a station also depends on its capacity. For instance, a 100 kWp solar plant will generate more energy than a 50 kWp plant, so the final output also takes the capacity of each solar power plant into account.
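One way to read the capacity adjustment is as a per-unit-capacity prediction that is scaled by plant size. The sketch below assumes the model predicts energy per kWp of installed capacity. That assumption, the function name `scale_by_capacity`, and the numbers are illustrative and are not taken from this notebook.

```python
def scale_by_capacity(energy_per_kwp: float, capacity_kwp: float) -> float:
    """Scale a per-kWp energy prediction (kWh/kWp) to a plant's total output (kWh)."""
    if capacity_kwp < 0:
        raise ValueError("capacity_kwp must be non-negative")
    return energy_per_kwp * capacity_kwp


# Hypothetical per-kWp prediction shared by two sites with identical weather and altitude.
predicted_kwh_per_kwp = 3.5

small_plant_kwh = scale_by_capacity(predicted_kwh_per_kwp, 50)
large_plant_kwh = scale_by_capacity(predicted_kwh_per_kwp, 100)
print(small_plant_kwh, large_plant_kwh)
```

Under this assumption, doubling the capacity doubles the predicted output, which matches the 100 kWp versus 50 kWp comparison in the text.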
%matplotlib inline

import matplotlib.pyplot as plt
import pandas as pd
from pandas import read_csv
from datetime import datetime
from IPython.display import Image, HTML

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer
from sklearn.preprocessing import MinMaxScaler
from sklearn.compose import make_column_transformer
from sklearn.metrics import r2_score

import arcgis
from arcgis.gis import GIS
from arcgis.learn import FullyConnectedNetwork, MLModel, prepare_tabulardata
gis = GIS('home')
The primary data used for this sample are as follows:
Out of the several solar photovoltaic power plants in the City of Calgary, 11 were selected for the study. The dataset contains two components:
1) Daily solar energy production for each power plant from September 2015 to December 2019.
2) Corresponding daily weather measurements for the given sites.
The datasets were obtained from multiple sources, listed under Data resources, and preprocessed to produce the main dataset used in this sample. Two feature layers were then created from this dataset, one for training and the other for validation.
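The preprocessing and the training/validation split can be sketched in plain Python. The field names (`site`, `date`, `energy_kwh`, `temp_c`), the sample values, and the helper names are hypothetical. Only the general idea comes from this sample: daily production is joined to daily weather per site, and one whole site (Southland) is held out for validation.

```python
from datetime import date

# Hypothetical daily production records: one row per site per day.
production = [
    {"site": "Southland", "date": date(2019, 6, 1), "energy_kwh": 410.0},
    {"site": "Site A", "date": date(2019, 6, 1), "energy_kwh": 120.0},
    {"site": "Site A", "date": date(2019, 6, 2), "energy_kwh": 95.0},
]

# Hypothetical daily weather measurements keyed by the same site and date.
weather = [
    {"site": "Southland", "date": date(2019, 6, 1), "temp_c": 18.2},
    {"site": "Site A", "date": date(2019, 6, 1), "temp_c": 17.9},
    {"site": "Site A", "date": date(2019, 6, 2), "temp_c": 14.1},
]


def join_on_site_and_date(production_rows, weather_rows):
    """Inner-join production and weather rows on (site, date)."""
    weather_by_key = {(w["site"], w["date"]): w for w in weather_rows}
    joined = []
    for row in production_rows:
        match = weather_by_key.get((row["site"], row["date"]))
        if match is not None:
            joined.append({**match, **row})
    return joined


def split_by_site(rows, holdout_site):
    """Return (training_rows, validation_rows), holding out one whole site."""
    training = [r for r in rows if r["site"] != holdout_site]
    validation = [r for r in rows if r["site"] == holdout_site]
    return training, validation


dataset = join_on_site_and_date(production, weather)
train_rows, valid_rows = split_by_site(dataset, holdout_site="Southland")
print(len(train_rows), len(valid_rows))
```

Holding out an entire site, rather than random rows, checks how well a model generalizes to a location it has never seen, which is the situation faced when predicting for potential plant sites.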
Training Set
The training dataset consists of data from 10 solar sites. The feature layer containing the data is accessed from the ArcGIS portal and visualized as follows:
# Access the solar dataset feature layer for training; the Southland Solar Plant is held out for validation
calgary_no_southland_solar = gis.content.get('adaead8cb3174ac6a89f0c14ae70aadd')
calgary_no_southland_solar
# Access the first (and only) layer of the feature layer item
calgary_no_southland_solar_layer = calgary_no_southland_solar.layers[0]
# Plot the locations of the 10 solar sites in Calgary used for training
m1 = gis.map('calgary', zoomlevel=10)
m1.add_layer(calgary_no_southland_solar_layer)
m1