datahawk
{% if error_message %}
{{ error_message }}
{% endif %}
Data Path
The path to the data file or dataset. Currently supported:
Local files: An absolute path, or a path relative to the directory the app was run from.
HuggingFace: The tag used to identify the dataset on HuggingFace Datasets, e.g., openai/gsm8k.
Read Mode
Choose whether to load the data entirely in memory or stream it.
Load
Stream
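The difference between the two read modes can be sketched for a local JSONL file as follows. This is a minimal illustration using only the Python standard library, not datahawk's actual implementation; the function names stream_jsonl and load_jsonl are hypothetical.

```python
import json
from typing import Any, Dict, Iterator, List


def stream_jsonl(path: str) -> Iterator[Dict[str, Any]]:
    """Stream mode (hypothetical helper): yield one record at a time.

    Only the current line is parsed and held in memory, which suits
    files that are too large to load at once.
    """
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def load_jsonl(path: str) -> List[Dict[str, Any]]:
    """Load mode (hypothetical helper): read every record into a list.

    The whole file ends up in memory, which allows random access by index.
    """
    return list(stream_jsonl(path))
```

Loading trades memory for the ability to jump to any record directly, while streaming keeps memory use roughly constant but only moves forward through the file.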
Source Type
Select the source of data:
JSONL: For local JSONL files.
JSON: For local JSON files. The top level of the file must be an array of JSON objects.
HuggingFace: For local or remote HuggingFace datasets. If the dataset is stored locally, specify the path to the directory containing the *.parquet files.
JSONL
JSON
HuggingFace
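The top-level array requirement for the JSON source type could be checked along these lines. This is a sketch under the assumption that a file with any other top-level value should be rejected with an error; load_json_array is a hypothetical name, not a function from datahawk.

```python
import json
from typing import Any, List


def load_json_array(path: str) -> List[Any]:
    """Read a JSON file and require that its top level is an array.

    Hypothetical helper: raises ValueError for any other top-level type
    (for example a single object), since each array element is treated
    as one record of the dataset.
    """
    with open(path, "r", encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list):
        raise ValueError(
            f"expected a top-level JSON array, got {type(data).__name__}"
        )
    return data
```

A file containing [{"q": "..."}, {"q": "..."}] passes this check, while a file containing a single object such as {"data": [...]} does not.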
Split
Specify the split for HuggingFace datasets, e.g., train or test. Leave empty if not applicable.
Config name
Name of the dataset configuration, e.g., main or socratic for openai/gsm8k. Leave empty if not applicable.
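One plausible way the form fields could map onto the HuggingFace datasets library is sketched below. It assumes that empty Split and Config name fields are passed as None and that Stream mode corresponds to streaming=True; build_load_kwargs and load_hf_dataset are hypothetical helpers, and whether datahawk does this internally is not confirmed by this page.

```python
from typing import Any, Dict


def build_load_kwargs(
    data_path: str,
    split: str = "",
    config_name: str = "",
    stream: bool = False,
) -> Dict[str, Any]:
    """Translate form values into keyword arguments for datasets.load_dataset.

    Hypothetical helper: blank form fields become None so the library
    falls back to its defaults.
    """
    return {
        "path": data_path,                   # e.g. "openai/gsm8k"
        "name": config_name.strip() or None, # e.g. "main" or "socratic"
        "split": split.strip() or None,      # e.g. "train" or "test"
        "streaming": stream,                 # True for Stream, False for Load
    }


def load_hf_dataset(data_path: str, split: str = "", config_name: str = "",
                    stream: bool = False):
    """Call the third-party datasets library with the translated arguments."""
    from datasets import load_dataset  # imported lazily; requires `pip install datasets`

    return load_dataset(**build_load_kwargs(data_path, split, config_name, stream))
```

For example, build_load_kwargs("openai/gsm8k", "train", "main", True) describes a streamed read of the train split of the main configuration.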
Submit