sconce.data_feeds package

sconce.data_feeds.base module

class sconce.data_feeds.base.DataFeed(data_loader)

Bases: object

A thin wrapper around a DataLoader that automatically yields tuples of torch.Tensor objects, which live either on the CPU or on a CUDA device. A DataFeed iterates endlessly.

Like the underlying DataLoader, a DataFeed’s __next__ method yields two values, which we refer to as the inputs and the targets.
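The endless iteration described above can be pictured with a minimal sketch. EndlessFeed is a hypothetical stand-in written for illustration, not sconce's implementation, and a plain list of (inputs, targets) pairs is assumed in place of a real DataLoader.

```python
class EndlessFeed:
    """Hypothetical stand-in for a DataFeed: restarts its loader when exhausted."""

    def __init__(self, data_loader):
        self.data_loader = data_loader
        self._iterator = iter(data_loader)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._iterator)
        except StopIteration:
            # The wrapped loader finished one pass, so begin another.
            self._iterator = iter(self.data_loader)
            return next(self._iterator)

    def next(self):
        return self.__next__()

    def reset(self):
        # Restart from the first batch of the wrapped loader.
        self._iterator = iter(self.data_loader)


# Assumption: a list of (inputs, targets) pairs plays the role of a DataLoader.
batches = [([1, 2], [0, 1]), ([3, 4], [1, 0])]
feed = EndlessFeed(batches)
seen = [feed.next() for _ in range(5)]  # wraps around after two batches
```

Because such a feed never raises StopIteration on a non-empty loader, a training loop would normally bound itself by a step count rather than by exhausting the feed.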

Parameters: data_loader (DataLoader) – the wrapped data_loader.
batch_size

the wrapped data_loader’s batch_size

cuda(device=None)

Put the inputs and targets (yielded by this DataFeed) on the specified device.

Parameters: device (int or bool or dict) – if an int or a bool, sets the behavior for both inputs and targets. To set them individually, pass a dictionary with the keys {‘inputs’, ‘targets’} instead. See torch.Tensor.cuda() for details.

Example

>>> g = DataFeed.from_dataset(dataset, batch_size=100)
>>> g.cuda()
>>> g.next()
(Tensor containing:
 [torch.cuda.FloatTensor of size 100x1x28x28 (GPU 0)],
 Tensor containing:
 [torch.cuda.LongTensor of size 100 (GPU 0)])
>>> g.cuda(False)
>>> g.next()
(Tensor containing:
 [torch.FloatTensor of size 100x1x28x28],
 Tensor containing:
 [torch.LongTensor of size 100])
>>> g.cuda(device={'inputs':0, 'targets':1})
>>> g.next()
(Tensor containing:
 [torch.cuda.FloatTensor of size 100x1x28x28 (GPU 0)],
 Tensor containing:
 [torch.cuda.LongTensor of size 100 (GPU 1)])
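One way to read the device argument is as a shorthand that expands into one setting per tensor. The sketch below is illustrative only: normalize_device is a hypothetical helper, not part of sconce, and it assumes that a dictionary must supply both keys and that None is passed through unchanged.

```python
def normalize_device(device=None):
    """Hypothetical helper: expand `device` into one setting per tensor."""
    keys = ("inputs", "targets")
    if isinstance(device, dict):
        # Assumption: both keys are required when a dictionary is given.
        missing = set(keys) - set(device)
        if missing:
            raise ValueError(f"device dict is missing keys: {sorted(missing)}")
        return {key: device[key] for key in keys}
    # None, an int, or a bool applies to both inputs and targets.
    return {key: device for key in keys}


both_on_cpu = normalize_device(False)
split_across_gpus = normalize_device({"inputs": 0, "targets": 1})
```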
dataset

the wrapped data_loader’s Dataset

classmethod from_dataset(dataset, split=None, **kwargs)

Create a DataFeed from an instantiated dataset.

Parameters:
  • dataset (Dataset) – the pytorch dataset.
  • split (float, optional) – If not None, it specifies the fraction of the dataset that should be placed into the first of two data_feeds. The remaining data is used for the second data_feed. Both data_feeds will be returned.
  • **kwargs – passed directly to the DataLoader constructor.
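The effect of split on the return value can be sketched with plain index arithmetic. partition_indices is a hypothetical helper; it assumes a contiguous, unshuffled cut that rounds down, which the documentation above does not specify.

```python
def partition_indices(num_samples, split=None):
    """Hypothetical helper: return one index list, or two when `split` is given."""
    indices = list(range(num_samples))
    if split is None:
        return indices
    if not 0.0 <= split <= 1.0:
        raise ValueError("split must be a fraction between 0.0 and 1.0")
    # Assumption: the first feed receives the leading `split` fraction.
    cut = int(num_samples * split)
    return indices[:cut], indices[cut:]


everything = partition_indices(8)
first, second = partition_indices(8, split=0.75)
```

With split=None a single collection comes back; with a fraction, a pair does, mirroring the one-feed versus two-feed behavior described for from_dataset.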
next()
num_samples

the len of the wrapped data_loader’s Dataset
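Since a DataFeed never signals the end of an epoch on its own, the length of one pass over the data can be worked out from num_samples and batch_size. The sketch below assumes the standard DataLoader batching rule; batches_per_epoch is a hypothetical helper name.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches in one full pass over the dataset."""
    if drop_last:
        # A trailing partial batch is discarded.
        return num_samples // batch_size
    # A trailing partial batch counts as one more batch.
    return math.ceil(num_samples / batch_size)


steps = batches_per_epoch(60000, 100)
```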

preprocess(inputs, targets)
reset()

Start iterating through the data_loader from the beginning.

split(split_factor, validation_transform=None, **kwargs)

Create a training and validation DataFeed from this one.

Parameters:
  • split_factor (float) – the fraction of the dataset, in the interval [0.0, 1.0], that should be put into the new training feed.
  • validation_transform (callable) – override the existing validation transform with this.
  • **kwargs – passed directly to the DataLoader constructor.
Returns:

training_feed, validation_feed
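The idea of two feeds that share the same data but apply different transforms can be sketched as follows. TransformedView and split_views are hypothetical names used only for illustration; the contiguous cut and the per-view transform are assumptions, not a description of how sconce implements split.

```python
class TransformedView:
    """Hypothetical view over shared samples with its own transform."""

    def __init__(self, samples, indices, transform=None):
        self.samples = samples
        self.indices = indices
        self.transform = transform

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, position):
        inputs, target = self.samples[self.indices[position]]
        if self.transform is not None:
            inputs = self.transform(inputs)
        return inputs, target


def split_views(samples, split_factor, training_transform=None,
                validation_transform=None):
    # Assumption: the training view takes the leading fraction of the samples.
    indices = list(range(len(samples)))
    cut = int(len(samples) * split_factor)
    training = TransformedView(samples, indices[:cut], training_transform)
    validation = TransformedView(samples, indices[cut:], validation_transform)
    return training, validation


samples = [(x, x % 2) for x in range(8)]
training, validation = split_views(
    samples,
    0.75,
    training_transform=lambda x: x + 0.5,  # stands in for an augmentation
    validation_transform=lambda x: x,      # validation data left untouched
)
```

Keeping the transform on the view rather than on the shared samples is what lets the validation side opt out of training-time augmentation.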

sconce.data_feeds.image module

sconce.data_feeds.single_class_image module

class sconce.data_feeds.single_class_image.SingleClassImageFeed(data_loader)

Bases: sconce.data_feeds.image.ImageFeed

An ImageFeed class for use when each image belongs to exactly one class.

classmethod from_image_folder(root, loader_kwargs=None, **dataset_kwargs)

Create a DataFeed from a folder of images. See torchvision.datasets.ImageFolder.

Parameters:
  • root (path) – the root directory path.
  • loader_kwargs (dict) – keyword args provided to the DataLoader constructor.
  • **dataset_kwargs – keyword args provided to the torchvision.datasets.ImageFolder constructor.
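torchvision.datasets.ImageFolder expects one subdirectory per class under root and assigns class indices in sorted order of the subdirectory names. The sketch below reproduces that mapping with the standard library only; find_classes and list_samples are written here for illustration, the files are empty placeholders rather than real images, and no file-extension filtering is applied.

```python
import os
import tempfile


def find_classes(root):
    """Map each subdirectory of `root` to a class index, in sorted name order."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: index for index, name in enumerate(classes)}


def list_samples(root):
    """Return (relative_path, class_index) pairs for every file under `root`."""
    samples = []
    for name, index in find_classes(root).items():
        for filename in sorted(os.listdir(os.path.join(root, name))):
            samples.append((os.path.join(name, filename), index))
    return samples


with tempfile.TemporaryDirectory() as root:
    # Build a tiny root/<class_name>/<file> layout with placeholder files.
    for name, files in {"dog": ["a.png"], "cat": ["b.png", "c.png"]}.items():
        os.makedirs(os.path.join(root, name))
        for filename in files:
            open(os.path.join(root, name, filename), "w").close()
    class_to_idx = find_classes(root)
    samples = list_samples(root)
```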
classmethod from_torchvision(batch_size=500, data_location=None, dataset_class=<class 'torchvision.datasets.mnist.MNIST'>, fraction=1.0, num_workers=0, pin_memory=True, shuffle=True, train=True, transform=ToTensor())

Create a DataFeed from a torchvision dataset class.

Parameters:
  • batch_size (int) – how large the yielded inputs and targets should be. See DataLoader for details.
  • data_location (path) – where the downloaded dataset should be stored. If None, a system-dependent temporary location will be used.
  • dataset_class (class) – a torchvision dataset class that supports constructor arguments {‘root’, ‘train’, ‘download’, ‘transform’}. For example, MNIST, FashionMNIST, CIFAR10, or CIFAR100.
  • fraction (float) – how much of the original dataset’s data to use, in the interval (0.0, 1.0].
  • num_workers (int) – how many subprocesses to use for data loading. See DataLoader for details.
  • pin_memory (bool) – if True, the data loader will copy tensors into CUDA pinned memory before returning them. See DataLoader for details.
  • shuffle (bool) – set to True to have the data reshuffled at every epoch. See DataLoader for details.
  • train (bool) – if True, creates dataset from training set, otherwise creates from test set.
  • transform (callable) – a function/transform that takes in a PIL image and returns a transformed version.
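The half-open interval for fraction excludes 0.0 and includes 1.0. A minimal sketch of that check follows; subset_size is a hypothetical helper, and rounding the sample count down is an assumption rather than documented behavior.

```python
def subset_size(num_samples, fraction=1.0):
    """Hypothetical helper: how many samples a given fraction would keep."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in the interval (0.0, 1.0]")
    # Assumption: a fractional sample count is rounded down.
    return int(num_samples * fraction)


quarter = subset_size(60000, fraction=0.25)
```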

sconce.data_feeds.multi_class_image module

class sconce.data_feeds.multi_class_image.MultiClassImageFeed(data_loader)

Bases: sconce.data_feeds.image.ImageFeed

An ImageFeed class for use when each image may belong to more than one class.
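The difference between single-class and multi-class targets is often expressed as a class index versus a multi-hot vector. The documentation above does not say how MultiClassImageFeed encodes its targets, so the sketch below is only a common convention; multi_hot is a hypothetical helper.

```python
def multi_hot(class_indices, num_classes):
    """Encode a set of class indices as a vector with a 1.0 per present class."""
    target = [0.0] * num_classes
    for index in class_indices:
        target[index] = 1.0
    return target


# Single-class case: exactly one class index per image.
single_class_target = 2

# Multi-class case: the image belongs to classes 0 and 2 out of 4.
multi_class_target = multi_hot([0, 2], num_classes=4)
```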