Prediction of energy generation from Solar Photovoltaic Power Plants using weather variables


Recently, there has been a strong emphasis on reducing the carbon footprint of our cities by moving away from fossil fuels to renewable energy sources. City governments across the world, including the City of Calgary, Canada, are leading this change by pursuing energy independence through solar power plants installed either on rooftops or within city utility sites.

This notebook aims to support this move to renewable solar energy by predicting the annual solar energy likely to be generated by a solar power station from local weather information and site characteristics. The hypothesis proposed by this notebook is that various weather parameters, such as temperature, wind speed, vapor pressure, solar radiation, day length, precipitation, and snowfall, along with the altitude of a solar power station site, will impact the daily generation of solar energy.

Accordingly, these variables are used to train a model on the actual solar power generated by solar power stations located in Calgary. The trained model will then be used to predict solar generation at potential solar power plant sites. Beyond the weather and altitude variables, the total energy generated by a solar station also depends on the capacity of that station. For instance, a 100 kWp solar plant will generate more energy than a 50 kWp plant under the same conditions, so the final output also takes the capacity of each solar power plant into consideration.
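The capacity adjustment described above can be sketched as follows. This is a minimal illustration rather than the notebook's own code: it assumes the model's prediction is expressed per unit of installed capacity (kWh per kWp), and the function name `scale_by_capacity` and the yield figure are hypothetical.

```python
def scale_by_capacity(energy_per_kwp, capacity_kwp):
    """Convert a per-kWp energy estimate (kWh per kWp) into plant output (kWh).

    Assumption: output scales linearly with installed capacity.
    """
    if capacity_kwp < 0:
        raise ValueError("capacity_kwp must be non-negative")
    return energy_per_kwp * capacity_kwp


# Hypothetical annual yield per kWp, for illustration only (not from the dataset).
yield_per_kwp = 1200.0

small_plant_kwh = scale_by_capacity(yield_per_kwp, 50)    # 50 kWp plant
large_plant_kwh = scale_by_capacity(yield_per_kwp, 100)   # 100 kWp plant

print(small_plant_kwh, large_plant_kwh)
```

Under this linear assumption, two plants exposed to the same weather differ in output only by the ratio of their capacities.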


In [1]:
%matplotlib inline
import matplotlib.pyplot as plt

import pandas as pd
from pandas import read_csv
from datetime import datetime
from IPython.display import Image, HTML

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer
from sklearn.preprocessing import MinMaxScaler  
from sklearn.compose import make_column_transformer 
from sklearn.metrics import r2_score

import arcgis
from arcgis.gis import GIS
from arcgis.learn import FullyConnectedNetwork, MLModel, prepare_tabulardata

Connecting to ArcGIS

In [2]:
gis = GIS('home')

Accessing & Visualizing datasets

The primary data used for this sample are as follows:

Out of the several solar photovoltaic power plants in the City of Calgary, 11 were selected for the study. The dataset contains two components:

1) Daily solar energy production for each power plant from September 2015 to December 2019.

2) Corresponding daily weather measurements for the given sites.

The datasets were obtained from multiple sources, as listed under Data resources, and preprocessed to obtain the main dataset used in this sample. Two feature layers were subsequently created from this dataset, one for training and the other for validation.
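One way to picture a training/validation split of this kind is to hold out every record belonging to a single site. The sketch below is illustrative only: the records are made up, the field names (`site`, `date`, `energy_kwh`) and the helper `split_by_site` are hypothetical, and the notebook itself loads two already-prepared feature layers rather than splitting the data in code.

```python
def split_by_site(records, holdout_site):
    """Split records into (training, validation) lists by site name."""
    train = [r for r in records if r["site"] != holdout_site]
    valid = [r for r in records if r["site"] == holdout_site]
    return train, valid


# Made-up records for illustration only; field names are hypothetical.
records = [
    {"site": "Site A", "date": "2019-06-01", "energy_kwh": 310.0},
    {"site": "Site B", "date": "2019-06-01", "energy_kwh": 120.5},
    {"site": "Site C", "date": "2019-06-01", "energy_kwh": 95.2},
    {"site": "Site C", "date": "2019-06-02", "energy_kwh": 101.7},
]

# Keep all of "Site C" out of training so it can serve as unseen validation data.
train_records, valid_records = split_by_site(records, holdout_site="Site C")

print(len(train_records), len(valid_records))
```

Holding out a whole site, rather than random rows, checks whether a model generalizes to a location it has never seen, which matches the goal of predicting output at potential new sites.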

Training Set

The training dataset consists of data from 10 solar sites. The feature layer containing the data is accessed here from the ArcGIS portal and visualized as follows:

In [3]:
# Access the Solar Dataset feature layer for training, without the Southland Solar Plant, which is held out for validation
calgary_no_southland_solar = gis.content.get('adaead8cb3174ac6a89f0c14ae70aadd')
Feature Layer Collection by api_data_owner
Last Modified: July 07, 2020
0 comments, 2 views
In [4]:
# Access the layer from the feature layer
calgary_no_southland_solar_layer = calgary_no_southland_solar.layers[0]
In [5]:
# Plot location of the 10 Solar sites in Calgary to be used for training
m1 = gis.map('calgary', zoomlevel=10)