RemotePathDataset

class pyremotedata.dataloader.RemotePathDataset(remote_path_iterator: RemotePathIterator, prefetch: int = 64, transform: Callable | None = None, target_transform: Callable | None = None, device: device | None = None, dtype: dtype | None = None, hierarchical: int = 0, hierarchy_parser: Callable | None = None, shuffle: bool = False, return_remote_path: bool = False, return_local_path: bool = False, verbose: bool = False)

Bases: IterableDataset

Creates a torch.utils.data.IterableDataset from a pyremotedata.implicit_mount.RemotePathIterator.

By default, the dataset yields the image as a tensor and the remote path as a string, which serves as the label.

Hierarchical mode

If hierarchical >= 1, the dataset is in “Hierarchical mode” and will return the image as a tensor and the label as a list of integers (class indices for each level in the hierarchy).

The class_handles property can be used to get the class-idx mappings for the dataset.

By default the dataset will use a parser which assumes that the hierarchical levels are encoded in the remote path as directories like so:

…/level_n/…/level_1/level_0/image.jpg

Where n = (hierarchical - 1) and level_0 is the leaf level.
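
The directory layout above can be illustrated with a small sketch. This is not the library's default parser; parse_levels and build_class_index are hypothetical helpers, and both the leaf-first ordering of the result and the alphabetical assignment of class indices are assumptions made for the example. The actual structure of class_handles may differ.

```python
import posixpath
from typing import Dict, Iterable, List


def parse_levels(remote_path: str, hierarchical: int) -> List[str]:
    """Return the `hierarchical` directory names closest to the file, leaf first.

    Hypothetical helper: mirrors the documented layout
    .../level_n/.../level_1/level_0/image.jpg, not the library's own parser.
    """
    if hierarchical < 1:
        raise ValueError("hierarchical must be >= 1")
    directory = posixpath.dirname(remote_path)
    parts = [p for p in directory.split("/") if p]
    if len(parts) < hierarchical:
        raise ValueError(f"{remote_path!r} has fewer than {hierarchical} directory levels")
    # parts[-1] is level_0 (the leaf), parts[-2] is level_1, and so on.
    return parts[::-1][:hierarchical]


def build_class_index(paths: Iterable[str], hierarchical: int) -> List[Dict[str, int]]:
    """Build one name -> index mapping per level (assumed: sorted alphabetically)."""
    names = [set() for _ in range(hierarchical)]
    for path in paths:
        for level, name in enumerate(parse_levels(path, hierarchical)):
            names[level].add(name)
    return [{name: idx for idx, name in enumerate(sorted(level))} for level in names]


def encode(remote_path: str, index: List[Dict[str, int]]) -> List[int]:
    """Turn a remote path into a list of class indices, one per level."""
    levels = parse_levels(remote_path, len(index))
    return [index[level][name] for level, name in enumerate(levels)]
```

With hierarchical = 2, a path such as root/felidae/cat/a.jpg would be parsed into the leaf level "cat" and its parent level "felidae", and encode would map those names to their per-level indices.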

Parameters:
  • remote_path_iterator (RemotePathIterator) – The pyremotedata.implicit_mount.RemotePathIterator to create the dataset from.

  • prefetch (int) – The number of items to prefetch from the pyremotedata.implicit_mount.RemotePathIterator. Default: 64.

  • transform (callable, optional) – A function/transform that takes in an image as a torch.Tensor and returns a transformed version.

  • target_transform (callable, optional) – A function/transform that takes in the label (after potential parsing by parse_hierarchical) and transforms it.

  • device (torch.device, optional) – The device to move the tensors to.

  • dtype (torch.dtype, optional) – The data type to convert the tensors to.

  • hierarchical (int, optional) – The number of hierarchical levels to use for the labels. Default: 0, i.e. no hierarchy.

  • hierarchy_parser (callable, optional) – A function to parse the hierarchical levels from the remote path. Default: None, i.e. use the default parser.

  • shuffle (bool, optional) – Whether to shuffle the dataset. Default: False.

  • return_remote_path (bool, optional) – Whether to return the remote path. Default: False.

  • return_local_path (bool, optional) – Whether to return the local path. Default: False.

  • verbose (bool, optional) – Whether to print verbose output. Default: False.
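
As a rough sketch of how these parameters fit together, the snippet below defines a target_transform for the non-hierarchical case, where the label is the remote path string, and passes it to the constructor using the keyword names from the signature above. parent_dir_label and make_dataset are hypothetical names introduced for illustration, and the RemotePathIterator is assumed to have been constructed elsewhere.

```python
import posixpath


def parent_dir_label(remote_path: str) -> str:
    """Map a remote path label to the name of its parent directory."""
    return posixpath.basename(posixpath.dirname(remote_path))


def make_dataset(remote_path_iterator):
    """Build a dataset from an already constructed RemotePathIterator.

    Hypothetical wrapper: the keyword arguments follow the documented
    signature, but the chosen values are only examples.
    """
    # Imported lazily so parent_dir_label can be used without pyremotedata installed.
    from pyremotedata.dataloader import RemotePathDataset

    return RemotePathDataset(
        remote_path_iterator,
        prefetch=64,                        # documented default
        target_transform=parent_dir_label,  # applied to the label
        return_remote_path=True,            # also yield the remote path itself
    )
```

Using the parent directory as the label is only one possible choice; any callable that accepts the label and returns a transformed value could be passed instead.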

Yields:

(tuple)

A tuple containing the following elements:
  • (torch.Tensor): The image as a tensor.

  • (Union[str, List[int]]): The label: the remote path by default, or a list of class indices in hierarchical mode.

  • (Optional[str]): The local path, if return_local_path is True.

  • (Optional[str]): The remote path, if return_remote_path is True.
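
Because the length of the yielded tuple depends on the two return flags, consumers may find it convenient to normalise each item into named fields. The helper below is a minimal sketch that assumes the element order listed above (image, label, local path, remote path); unpack_item is a hypothetical name, not part of the library.

```python
from typing import Any, Dict, Sequence


def unpack_item(
    item: Sequence[Any],
    return_local_path: bool = False,
    return_remote_path: bool = False,
) -> Dict[str, Any]:
    """Convert a yielded tuple into a dict with named fields.

    Assumes the order documented above: image, label, then the optional
    local path, then the optional remote path.
    """
    expected = 2 + int(return_local_path) + int(return_remote_path)
    if len(item) != expected:
        raise ValueError(f"expected {expected} elements, got {len(item)}")

    fields: Dict[str, Any] = {"image": item[0], "label": item[1]}
    position = 2
    if return_local_path:
        fields["local_path"] = item[position]
        position += 1
    if return_remote_path:
        fields["remote_path"] = item[position]
    return fields
```

The flags passed to unpack_item should match those given to the dataset, since the tuple itself carries no indication of which optional elements are present.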