Modelling Power Load of Solar Energy

David Schulte

Course work: Statistical Tools in Finance and Insurance

Prof. Dr. López Cabrera

Imports and helper functions

Loading the data

First, we load the data for all four network operators from 2010 to 2020.
The data is available at https://energy-charts.info/.

We sum up the energy generation by the network operators to get the total energy generation. Since the data contains values for different time intervals, we aggregate them to get daily power generation.
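The aggregation step could look roughly like the following sketch. The column names, the 15-minute resolution and the random values are placeholders, not the actual data. Summing within a day assumes that each value is the energy produced in its interval; for power readings a daily mean or a conversion to energy would be the appropriate choice.

```python
import numpy as np
import pandas as pd

# Hypothetical quarter-hourly generation for four operators (placeholder names and values).
index = pd.date_range("2020-01-01", periods=2 * 96, freq="15min")
rng = np.random.default_rng(0)
raw = pd.DataFrame(
    rng.uniform(0.0, 100.0, size=(len(index), 4)),
    index=index,
    columns=["operator_a", "operator_b", "operator_c", "operator_d"],
)

# Sum across the operators to get the total generation per time step.
total = raw.sum(axis=1)

# Aggregate the time steps to one value per calendar day.
daily = total.resample("D").sum()
print(daily)
```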

This is what our data looks like.

Distribution of the data

Let us take a look at the distribution of daily power generation.

We can see that the distribution is skewed to the left. Since we will later apply a linear regression, we would prefer data that is approximately normally distributed. To bring the distribution closer to a normal one, we apply the following transformation.

First, we apply Min-Max scaling.

$\tilde{U}_t=\frac{U_t-U_{\min}}{U_{\max}-U_{\min}}$

Then, we apply a logit normal transformation to the scaled values.

$U^*_t=\log{\left(\frac{\tilde{U}_t}{1-\tilde{U}_t}\right)}$
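A minimal sketch of the two steps in NumPy is given below. The function name `logit_transform` and the clipping constant `eps` are our own assumptions: Min-Max scaling maps the smallest and largest observation exactly to 0 and 1, where the logarithm is not finite, so the sketch clips the scaled values slightly into the open interval.

```python
import numpy as np

def logit_transform(u, eps=1e-6):
    """Min-Max scale the values to [0, 1], then apply log(x / (1 - x))."""
    u = np.asarray(u, dtype=float)
    scaled = (u - u.min()) / (u.max() - u.min())
    # Assumption: clip so that the minimum and maximum do not map to -inf / +inf.
    scaled = np.clip(scaled, eps, 1.0 - eps)
    return np.log(scaled / (1.0 - scaled))

# Toy values; the midpoint of the range is mapped to 0.
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
transformed = logit_transform(values)
print(transformed)
```

The transformation is monotone, so the ordering of the days is preserved and it can be inverted after modelling.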

Let us take a look at our transformed time series.

Seasonality

We can see a strong seasonal component in the data. To get rid of it, we apply a linear regression and continue working with its residuals.
After thinking about the underlying process behind the data and experimenting with it, we model our data as follows:

$U^*_t = \beta_0 + \beta_1 \cdot t + \beta_2 \cdot \sqrt[4]{t} \cos \left(2\pi \frac{t-11}{365}\right)+X_t$

We will use the scikit-learn library for the implementation. To get more information about the regression, we will also run it with the statsmodels library and print the model summary.
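The regression on the trend and the seasonal term can be sketched with scikit-learn as follows. The helper `seasonal_features`, the synthetic series and the coefficient values are illustrative assumptions; only the two regressors follow the model equation above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def seasonal_features(t):
    """Build the regressors t and t^(1/4) * cos(2*pi*(t - 11) / 365)."""
    t = np.asarray(t, dtype=float)
    seasonal = t ** 0.25 * np.cos(2.0 * np.pi * (t - 11.0) / 365.0)
    return np.column_stack([t, seasonal])

# Synthetic data (assumption): three years of daily values with known coefficients.
rng = np.random.default_rng(0)
t = np.arange(1, 3 * 365 + 1)
X = seasonal_features(t)
y = -1.0 + 0.0005 * X[:, 0] - 0.4 * X[:, 1] + rng.normal(scale=0.1, size=len(t))

# LinearRegression fits the intercept beta_0 by default.
model = LinearRegression().fit(X, y)

# The residuals play the role of X_t in the model equation.
residuals = y - model.predict(X)
print(model.intercept_, model.coef_)
```

Fitting the same design matrix with an ordinary least squares model in statsmodels should give the same coefficients and additionally provides standard errors and a summary table.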

Residuals after regression

The residuals look good, except in the very beginning. That is no problem, since we can just drop the first year of our data and work with the remaining 10 years.

We will print the RMSE of our residuals.
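For reference, the RMSE is the square root of the mean squared residual. A small sketch with a hypothetical helper `rmse` and toy residuals:

```python
import numpy as np

def rmse(residuals):
    """Root mean squared error of a residual series."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.sqrt(np.mean(residuals ** 2)))

# Toy residuals (not the actual data).
example = rmse([1.0, -1.0, 2.0, -2.0])
print(example)
```

Note that the value is measured on the transformed scale, not in units of power.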

We can see that our residuals are approximately normally distributed.

Time series modelling

Now it is time to work on the time series. First, we inspect its partial autocorrelation.

Based on this result, we apply an ARIMA(1,0,1) model.

Again, we show the residuals after applying the ARIMA model.

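The statsmodels library provides four diagnostic plots that describe the model fit: the standardized residuals over time, their histogram with an estimated density, a normal Q-Q plot and a correlogram.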