Core Elements for Multi Step Ahead Prediction

from nbdev.config import get_config

Create a dataloader for multi step ahead prediction

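We create a prediction dataloader by adding the system output, shifted in time, to the input. The output is delayed by one step so that the model only ever sees output values older than the one it has to predict. The shift costs one sample, which is why the window size is 500+1 while the resulting sequences have 500 steps.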
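The time shift can be illustrated on plain arrays. The following is a minimal NumPy sketch of the idea only, not of how `clm_shift` is implemented; the helper name `make_prediction_window` is hypothetical, and it assumes the target at step k is paired with the input u[k] and the previous output y[k-1].

```python
import numpy as np

def make_prediction_window(u, y):
    """Pair each target y[k] with the inputs u[k] and y[k-1] (hypothetical helper)."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    # Drop the first input sample and the last output sample so that
    # row k of x holds u[k+1] next to the older output y[k].
    x = np.stack([u[1:], y[:-1]], axis=-1)  # shape (N-1, 2)
    target = y[1:, None]                    # shape (N-1, 1)
    return x, target

# A window of 500+1 samples yields sequences of 500 steps.
u = np.arange(501, dtype=float)
y = 10.0 * np.arange(501, dtype=float)
x, target = make_prediction_window(u, y)
print(x.shape, target.shape)  # (500, 2) (500, 1)
```

In the first row, the input contains u[1] and y[0], while the target is y[1], so the output channel in the input is always one step behind the value to be predicted.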

project_root = get_config().config_file.parent
f_path = project_root / 'test_data/WienerHammerstein'
hdf_files = get_hdf_files(f_path)
init_sz = 300
u = ['u']
y = ['y']
seq = DataBlock(blocks=(SequenceBlock.from_hdf(u+y,TensorSequencesInput,clm_shift=[0,-1]),
                        SequenceBlock.from_hdf(y,TensorSequencesOutput,clm_shift=[-1])),
                 get_items=CreateDict([DfHDFCreateWindows(win_sz=500+1,stp_sz=100,clm='u')]),
                 splitter=ApplyToDict(FuncSplitter(lambda o: 'valid' in str(o))))
dls_pred = seq.dataloaders(hdf_files,bs=32,shuffle=True)
dls_pred.one_batch()[0][0].shape,dls_pred.one_batch()[0][1].shape
(torch.Size([500, 2]), torch.Size([500, 2]))
dls_pred.show_batch(max_n=1)

Create a learner Callback for prediction

Instead of creating a specialized dataloader, we can create a Callback for the learner that adds the historic system output data to the input. This makes the learner more flexible and requires only one kind of dataloader per dataset.


PredictionCallback

 PredictionCallback (t_offset=1, std=None, mean=None)

Concatenates the optionally normalized system output to the input data for autoregression. Assumes a 1-tuple as input.

Parameter  Type      Default  Details
t_offset   int       1        number of steps by which the output is shifted into the past; shortens the sequence length by that number
std        NoneType  None     standard deviation of the output tensor
mean       NoneType  None     mean of the output tensor
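The effect of such a callback on one batch can be sketched with plain arrays. This is a rough illustration under assumptions, not the library's implementation: the function name `add_output_history` is hypothetical, NumPy stands in for torch tensors, and the batch layout is assumed to be (batch, seq_len, channels).

```python
import numpy as np

def add_output_history(x, y, t_offset=1, mean=None, std=None):
    """Append the delayed, optionally normalized output to the input (hypothetical sketch).

    x: array of shape (batch, seq_len, n_u)
    y: array of shape (batch, seq_len, n_y)
    """
    y_hist = y
    if mean is not None and std is not None:
        # Normalize only the copy that is fed back as an input channel.
        y_hist = (y - mean) / std
    if t_offset > 0:
        x = x[:, t_offset:]             # inputs aligned with the targets
        y_hist = y_hist[:, :-t_offset]  # outputs t_offset steps in the past
        y = y[:, t_offset:]             # targets, shortened by t_offset
    return np.concatenate([x, y_hist], axis=-1), y

x = np.zeros((32, 500, 1))
y = np.ones((32, 500, 1))
x_new, y_new = add_output_history(x, y, t_offset=1)
print(x_new.shape, y_new.shape)  # (32, 499, 2) (32, 499, 1)
```

The sequence length drops from 500 to 499, matching the note that `t_offset` shortens the sequence by the number of shifted steps.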

We create a dataloader without system output as input and compare the signals after the callback.

seq = DataBlock(blocks=(SequenceBlock.from_hdf(u,TensorSequencesInput),
                        SequenceBlock.from_hdf(y,TensorSequencesOutput)),
                 get_items=CreateDict([DfHDFCreateWindows(win_sz=500,stp_sz=100,clm='u')]),
                 splitter=ApplyToDict(FuncSplitter(lambda o: 'valid' in str(o))))
dls = seq.dataloaders(hdf_files,bs=32)

Verify that a simulation model, which takes only the system input, works with this dataset:

model = SimpleRNN(1,1)
Learner(dls,model,loss_func=nn.MSELoss()).fit(1)

A prediction model that expects a two-channel input (system input plus past output) fails on the same dataloader without the callback:

model = SimpleRNN(2,1)
test_fail(lambda: Learner(dls,model,loss_func=nn.MSELoss()).fit(1))
# dls_pred.show_batch(max_n=1)
# dls.show_batch(max_n=1)
model = SimpleRNN(2,1)
pred_callback = PredictionCallback(1)
pred_callback.init_normalize(dls.one_batch())
lrn = Learner(dls,model,loss_func=nn.MSELoss(),cbs=pred_callback)
lrn.fit(1)
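The `std` and `mean` parameters are described as statistics of the output tensor. One plausible way to obtain them from a batch is sketched below; this is an assumption about what a step like `init_normalize` might compute, not its actual code. The helper name `output_stats` is hypothetical, and the batch is assumed to be an (input, output) pair with layout (batch, seq_len, channels).

```python
import numpy as np

def output_stats(batch):
    """Per-channel mean and standard deviation of the output in a batch (hypothetical helper)."""
    _, y = batch
    y = np.asarray(y, dtype=float)
    # Reduce over the batch and time axes, keep one value per output channel.
    mean = y.mean(axis=(0, 1))
    std = y.std(axis=(0, 1))
    return mean, std

rng = np.random.default_rng(0)
batch = (rng.normal(size=(32, 500, 1)),
         3.0 + 2.0 * rng.normal(size=(32, 500, 1)))
mean, std = output_stats(batch)

# Normalizing with these statistics gives zero mean and unit variance
# on the batch they were computed from.
y_norm = (batch[1] - mean) / std
print(mean.shape, std.shape)
```

Normalizing the fed-back output this way keeps it on a scale comparable to a normalized input channel, which is the usual motivation for passing `mean` and `std` to the callback.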