---
title: GEFCom2012
keywords: fastai
sidebar: home_sidebar
summary: "Download the GEFCom2012 dataset."
description: "Download the GEFCom2012 dataset."
nb_path: "nbs/data_datasets__gefcom2012.ipynb"
---
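import matplotlib.pyplot as plt
import pandas as pd  # needed for pd.to_datetime in the plotting cells
# GEFCom2012_L and GEFCom2012_W are assumed to be already imported from this library's gefcom2012 dataset module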
plt.rcParams['font.family'] = 'serif'
FONTSIZE = 22
The GEFCom2012-L dataset was made available as part of a Kaggle competition.
The competition asked for hierarchical load forecasts for 20 zones and for the overall system, with the constraint that the zonal loads sum to the system load. Forecasts were evaluated with the Weighted Root Mean Square Error (WRMSE).
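The two ideas in the paragraph above, coherence of the hierarchy and a weighted RMSE, can be sketched in a few lines. This is only an illustration: the zonal values are made up, the function name `weighted_rmse` is hypothetical, and normalizing by the sum of the weights is an assumption, since the competition's exact weighting scheme is not described here.

```python
import math

def weighted_rmse(y_true, y_pred, weights):
    """Weighted RMSE: sqrt(sum(w * err^2) / sum(w)).

    Assumption: errors are normalized by the total weight; the
    competition's own weights and normalization may differ.
    """
    weighted_sq_err = sum(
        w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights)
    )
    return math.sqrt(weighted_sq_err / sum(weights))

# Hypothetical zonal forecasts (MW) for a single hour, zones 1..20.
zonal_forecast = {zone: 10.0 + zone for zone in range(1, 21)}

# Bottom-up aggregation: defining the system forecast as the sum of the
# zonal forecasts makes the hierarchy coherent by construction.
system_forecast = sum(zonal_forecast.values())

# Toy evaluation: the second observation has three times the weight.
wrmse_example = weighted_rmse(
    y_true=[1.0, 2.0], y_pred=[1.0, 4.0], weights=[1.0, 3.0]
)
```

Building the system forecast bottom-up is only one way to satisfy the sum constraint; any reconciliation method that returns zonal values adding up to the system value would also qualify.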
The load method returns three data frames: Y_df (the target series, with columns unique_id, ds and y), X_df and benchmark_df.
Y_df, X_df, benchmark_df = GEFCom2012_L.load('data')
Y_df.head()
Y_df, X_df, benchmark_df = GEFCom2012_L.load(directory='data')
Y_df = Y_df[Y_df.unique_id==1]
ds = Y_df.ds.values[-365:]
y_true = Y_df.y.values[-365:]
x_plot = ds  # use the plotted window so the axis label matches the data shown
x_plot_min = pd.to_datetime(x_plot.min()).strftime('%B %d, %Y')
x_plot_max = pd.to_datetime(x_plot.max()).strftime('%B %d, %Y')
x_axis_str = f'Hours [{x_plot_min} to {x_plot_max}]'
y_axis_str = 'Load (MW)'
fig = plt.figure(figsize=(20, 4))
ax0 = plt.subplot2grid((1,1),(0, 0))
axs = [ax0]
axs[0].plot(ds, y_true, color='#628793', linewidth=0.4, label='true')
axs[0].tick_params(labelsize=FONTSIZE-4)
axs[0].set_xlabel(x_axis_str, fontsize=FONTSIZE)
axs[0].set_ylabel(y_axis_str, fontsize=FONTSIZE)
plt.title('GEFCom2012-L Zone=1', fontsize=FONTSIZE)
plt.grid()
fig.tight_layout()  # call after the axes are populated so labels are accounted for
plt.show()
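The GEFCom2012-W dataset was made available as part of a Kaggle competition.
The task was to provide two-day-ahead (48-hour) hourly forecasts of the power generation of seven wind farms. The load method returns Y_df, X_df and benchmark_df; X_df includes the U wind component columns u_lead12, u_lead24, u_lead36 and u_lead48.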
Y_df, X_df, benchmark_df = GEFCom2012_W.load(directory='data')
Y_df = Y_df[:168]
X_df = X_df[:168]
fig = plt.figure(figsize=(15, 4))
x_plot = Y_df.ds.values
x_plot_min = pd.to_datetime(x_plot.min()).strftime('%B %d, %Y')
x_plot_max = pd.to_datetime(x_plot.max()).strftime('%B %d, %Y')
x_axis_str = f'Hours [{x_plot_min} to {x_plot_max}]'
y_axis_str = 'U Wind Component'
plt.plot(x_plot, X_df.u_lead12, label='12 lead')
plt.plot(x_plot, X_df.u_lead24, label='24 lead')
plt.plot(x_plot, X_df.u_lead36, label='36 lead')
plt.plot(x_plot, X_df.u_lead48, label='48 lead')
plt.xlabel(x_axis_str, fontsize=FONTSIZE)
plt.ylabel(y_axis_str, fontsize=FONTSIZE)
plt.title('GEFCom2012-W', fontsize=FONTSIZE)
plt.legend()
plt.grid()
plt.show()
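The two-day-ahead hourly task described for GEFCom2012-W corresponds to a forecast horizon of 48 steps. A minimal sketch of the timestamps such a forecast would cover is shown here; the forecast origin `last_observed` is a hypothetical value chosen for illustration and is not taken from the dataset.

```python
from datetime import datetime, timedelta

HORIZON = 48  # two days of hourly steps

# Hypothetical last observed timestamp (not read from the dataset).
last_observed = datetime(2012, 1, 1, 0)

# One timestamp per forecast step, starting one hour after the origin.
forecast_times = [
    last_observed + timedelta(hours=step) for step in range(1, HORIZON + 1)
]
```

With real data, the origin would be the last value of the ds column, and the resulting list can be passed to pd.to_datetime to match the dtype used in the plots above.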