divina#
Date: Feb 07, 2023
Useful links: Binary Installers | Source Repository | Issues & Ideas | Q&A Support
divina is an open source, BSD3-licensed library that provides scalable and hyper-interpretable causal forecasting capabilities. It is written in Python and can also be used via a CLI.
The aim of divina is to deliver performance-oriented and hyper-interpretable exogenous time series forecasting models by producing accurate, bootstrapped predictions; local, overridable factor summaries; and easily configurable feature engineering and experiment management capabilities.
Ensemble Architecture#
divina is essentially a convenience wrapper that facilitates training, prediction, validation and deployment of an ensemble: a causal, interpretable model boosted by an endogenous time-series model. This allows for high levels of automation and accuracy while still emphasizing and relying on the causal relationships discovered by the user. The ensemble structure is delivered with swappable model types to suit many different kinds of forecasting problems. divina is also fully integrated with both Dask and Prefect, meaning that distributed compute and pipeline orchestration can be enabled with the flip of a switch. For more information on divina's features, check out the quickstart page.
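To make the idea of a causal model "boosted" by an endogenous time-series model concrete, here is a minimal sketch in plain Python. It is not divina's implementation. It assumes one exogenous feature, a simple least-squares line as the causal stage, and an AR(1) fit on that line's residuals as the endogenous stage. All function names are hypothetical.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_ar1(residuals):
    """Fit r_t = phi * r_(t-1) by least squares, without an intercept."""
    prev = residuals[:-1]
    curr = residuals[1:]
    denom = sum(p * p for p in prev)
    if denom == 0:
        return 0.0  # no residual signal left to model
    return sum(p * c for p, c in zip(prev, curr)) / denom


def fit_ensemble(x, y):
    """Stage 1: causal line on x. Stage 2: AR(1) on the stage-1 residuals."""
    intercept, slope = fit_simple_ols(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return {
        "intercept": intercept,
        "slope": slope,
        "phi": fit_ar1(residuals),
        "last_residual": residuals[-1],
    }


def predict_next(model, x_next):
    """Return (causal-only forecast, boosted forecast) for the next step."""
    causal = model["intercept"] + model["slope"] * x_next
    boost = model["phi"] * model["last_residual"]
    return causal, causal + boost


# Feature "b" and target "c" from the Getting Started dummy data.
model = fit_ensemble([3, 2, 8, 1, 2], [6, 4, 6, 1, 3])
causal_forecast, boosted_forecast = predict_next(model, x_next=4)
print(causal_forecast, boosted_forecast)
```

Keeping the two forecasts separate mirrors the point made above: the causal stage stays inspectable on its own, and the time-series stage only adjusts what the causal stage leaves unexplained.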
Installation#
divina is available on PyPI and can be installed with the Python package manager pip as shown below.
pip install divina
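After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version with the standard library. This is a generic sketch, not a divina feature; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if divina is installed in this environment, else None.
print(installed_version("divina"))
```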
Getting Started#
To train and predict using a divina pipeline, we first create a pandas DataFrame full of dummy data, convert it to a Dask DataFrame, and call the fit() method of our pipeline. Once the pipeline is fit, it can be used to predict on out-of-sample feature sets.
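import dask.dataframe as dd
import pandas as pd
from divina import Divina

# Dummy data: a date column "a" and two numeric columns "b" and "c".
example_data = pd.DataFrame(
    data=[
        ["2011-01-01", 3, 6],
        ["2011-01-02", 2, 4],
        ["2011-01-03", 8, 6],
        ["2011-01-04", 1, 1],
        ["2011-01-05", 2, 3],
    ],
    columns=["a", "b", "c"],
)

# Convert the pandas DataFrame to a single-partition Dask DataFrame.
example_data_dask = dd.from_pandas(example_data, npartitions=1)

# Forecast column "c", using column "a" as a daily time index.
example_pipeline = Divina(target="c", time_index="a", frequency="D")

# Fit the pipeline and read the in-sample predictions.
y_hat_insample = example_pipeline.fit(example_data_dask)[
    0
].causal_validation.predictions

# Predict on a feature set that does not contain the target column.
y_hat_out_of_sample = example_pipeline.predict(
    example_data_dask.drop(columns="c")
).causal_predictions.predictions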
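The example above predicts on the training rows with the target dropped. A truly out-of-sample feature set would instead cover dates after the last training date. The sketch below builds such a frame with pandas only. The name `future_features` and the values in column "b" are made up for illustration, and whether a fitted pipeline accepts this frame unchanged is an assumption based on the description above.

```python
import pandas as pd

# Last date in the dummy training data above.
last_train_date = pd.Timestamp("2011-01-05")

# Three further daily timestamps, matching frequency="D".
future_dates = pd.date_range(
    last_train_date + pd.Timedelta(days=1), periods=3, freq="D"
)

# Same feature columns as the training data, without the target "c".
# The "b" values are illustrative placeholders, not real observations.
future_features = pd.DataFrame(
    {
        "a": future_dates.strftime("%Y-%m-%d"),
        "b": [4, 5, 2],
    }
)
print(future_features)
```

As in the example above, this frame could then be converted with dd.from_pandas before being passed to predict().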