---
title: Yoochoose
keywords: fastai
sidebar: home_sidebar
summary: "Yoochoose dataset."
description: "Yoochoose dataset."
nb_path: "nbs/datasets/datasets.yoochoose.ipynb"
---
{% raw %}

class YoochooseDataset

YoochooseDataset(root, min_session_length:int=2, min_item_support:int=5, eval_sec:int=86400) :: SessionDataset

Session data base class.

Args:
    min_session_length (int): Minimum number of items a session must contain to be kept.
    min_item_support (int): Minimum number of interactions an item must have to be kept.
    eval_sec (int): Length, in seconds, of the window at the end of the data that is held out as validation data. The default of 86400 corresponds to one day.

References:

1. https://github.com/Ethan-Yys/GRU4REC-pytorch-master/blob/master/preprocessing.py
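
The three arguments describe a filter-then-split preprocessing step. The block below is a minimal sketch of that idea in plain Python, not the library's actual implementation: the function name `filter_and_split`, the `(session_id, item_id, timestamp)` tuple layout, and the exact order of the filtering steps are assumptions made for illustration, loosely following the GRU4REC-style preprocessing cited above.

```python
from collections import Counter


def filter_and_split(events, min_session_length=2, min_item_support=5, eval_sec=86400):
    """Hypothetical helper: filter click events and split them by time.

    `events` is assumed to be a list of (session_id, item_id, timestamp)
    tuples, with the timestamp given in seconds.
    """

    def drop_short_sessions(rows):
        # Keep only events whose session has enough events overall.
        lengths = Counter(session for session, _, _ in rows)
        return [row for row in rows if lengths[row[0]] >= min_session_length]

    # 1. Drop short sessions, then rare items, then sessions that became too short.
    rows = drop_short_sessions(events)
    support = Counter(item for _, item, _ in rows)
    rows = [row for row in rows if support[row[1]] >= min_item_support]
    rows = drop_short_sessions(rows)
    if not rows:
        return [], []

    # 2. Sessions whose last event falls in the final `eval_sec` seconds
    #    are held out as validation data.
    last_event = {}
    for session, _, timestamp in rows:
        last_event[session] = max(last_event.get(session, timestamp), timestamp)
    cutoff = max(last_event.values()) - eval_sec
    train = [row for row in rows if last_event[row[0]] < cutoff]
    valid = [row for row in rows if last_event[row[0]] >= cutoff]

    # 3. Validation may only contain items seen in training; re-check lengths.
    train_items = {item for _, item, _ in train}
    valid = [row for row in valid if row[1] in train_items]
    valid = drop_short_sessions(valid)
    return train, valid
```

Splitting by each session's last timestamp keeps whole sessions on one side of the cutoff, so no session is divided between the training and validation sets.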
{% endraw %} {% raw %}
ds = YoochooseDataset(root='/content/yoochoose')
Processing...
Training Set has 31637239 Events, 7966257 Sessions, and 37483 Items
Validation Set has 71222 Events, 15324 Sessions, and 6751 Items
Done!
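
As a quick sanity check on the log above, the reported counts can be turned into an average session length per split and the share of events that ends up in validation. This is only a sketch: the numbers are copied from the log, and the `stats` dictionary and `mean_session_length` helper are illustrative names, not part of the library.

```python
# Counts copied from the processing log above.
stats = {
    "train": {"events": 31637239, "sessions": 7966257, "items": 37483},
    "valid": {"events": 71222, "sessions": 15324, "items": 6751},
}


def mean_session_length(split):
    """Average number of events per session for one split."""
    return split["events"] / split["sessions"]


for name, split in stats.items():
    print(f"{name}: {mean_session_length(split):.2f} events per session")

# Fraction of all retained events that fall into the validation window.
total_events = stats["train"]["events"] + stats["valid"]["events"]
valid_share = stats["valid"]["events"] / total_events
print(f"validation share of events: {valid_share:.4%}")
```

With the default `eval_sec` of 86400, the validation set covers only the final day of the data, which is why it is so much smaller than the training set.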
{% endraw %} {% raw %}
!tree --du -h -C /content/yoochoose
/content/yoochoose
├── [995M]  processed
│   ├── [993M]  yoochoose_train.txt
│   └── [2.3M]  yoochoose_valid.txt
└── [1.4G]  raw
    └── [1.4G]  rsc15-clicks.dat

 2.4G used in 2 directories, 3 files
{% endraw %}