interfaces

This module contains basic interfaces used throughout the whole gensim package.

The interfaces are realized as abstract base classes (i.e., some optional functionality is provided in the interface itself, so that the interfaces can be subclassed).

class gensim.interfaces.CorpusABC

Interface for corpora. A corpus is simply an iterable, where each iteration step yields one document. A document is a list of (fieldId, fieldValue) 2-tuples.

See the corpora package for some example corpus implementations.

Note that although a default len() method is provided, it is very inefficient (performs a linear scan through the corpus to determine its length). Wherever the corpus size is needed and known in advance (or at least doesn’t change so that it can be cached), the len() method should be overridden.
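The corpus contract and the advice about len() can be illustrated with a minimal sketch. It deliberately does not import gensim: TinyCorpus and slow_len are hypothetical names, and a real implementation would subclass gensim.interfaces.CorpusABC instead. The slow_len helper only imitates what a linear-scan default would have to do.

```python
class TinyCorpus:
    """Hypothetical in-memory corpus (a real one would subclass CorpusABC)."""

    def __init__(self, documents):
        # Each document is a list of (fieldId, fieldValue) 2-tuples.
        self.documents = list(documents)

    def __iter__(self):
        # One iteration step yields one document.
        for document in self.documents:
            yield document

    def __len__(self):
        # The size is known in advance, so return it directly
        # instead of scanning the whole corpus.
        return len(self.documents)


def slow_len(corpus):
    """Imitates a linear-scan length: iterate over every document and count."""
    return sum(1 for _ in corpus)


corpus = TinyCorpus([[(0, 1.0), (2, 3.0)], [(1, 2.0)], []])
print(len(corpus), slow_len(corpus))
```

Both calls agree on the result, but the overridden len() does not need to touch the documents at all, which matters when the corpus is streamed from disk.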

classmethod load(fname)
Load a previously saved object from file (also see save).
save(fname)
Save the object to file via pickling (also see load).
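
The save/load pair described above might be sketched as follows. This is an illustration only, not gensim's implementation: SaveLoadSketch is a hypothetical class, and the sketch pickles the instance's attribute dictionary rather than the instance itself so that it does not depend on the class being importable by name.

```python
import os
import pickle
import tempfile


class SaveLoadSketch:
    """Hypothetical object with pickle-based persistence."""

    def __init__(self, documents):
        self.documents = list(documents)

    def save(self, fname):
        # Write the object's state to file via pickling.
        with open(fname, "wb") as handle:
            pickle.dump(self.__dict__, handle, protocol=pickle.HIGHEST_PROTOCOL)

    @classmethod
    def load(cls, fname):
        # Rebuild an instance from previously saved state.
        with open(fname, "rb") as handle:
            state = pickle.load(handle)
        obj = cls.__new__(cls)
        obj.__dict__.update(state)
        return obj


original = SaveLoadSketch([[(0, 1.0)], [(1, 2.0), (4, 0.5)]])
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "corpus.pkl")
    original.save(path)
    restored = SaveLoadSketch.load(path)
```

Because load is a classmethod, it is called on the class rather than on an existing instance, and it returns a new object with the saved state.
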
class gensim.interfaces.TransformationABC

Interface for transformations. A ‘transformation’ is any object which accepts a sparse document via the dictionary notation [] and returns another sparse document in its stead.

See the tfidfmodel module for an example of a transformation.
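
As a rough illustration of the [] notation, the following sketch defines a toy transformation without importing gensim. ScaleTransformation and its factor parameter are hypothetical; a real transformation would subclass gensim.interfaces.TransformationABC and typically do something more useful, such as TF-IDF weighting.

```python
class ScaleTransformation:
    """Hypothetical transformation that multiplies every field value by a constant."""

    def __init__(self, factor):
        self.factor = factor

    def __getitem__(self, document):
        # Accept one sparse document and return another sparse document.
        return [(field_id, value * self.factor) for field_id, value in document]


double = ScaleTransformation(2.0)
result = double[[(0, 1.0), (3, 0.5)]]
print(result)
```

The outer brackets are the dictionary notation and the inner brackets are the list literal for the document itself.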

apply(corpus)
Apply the transformation to a whole corpus (as opposed to a single document) and return the result as another corpus.
classmethod load(fname)
Load a previously saved object from file (also see save).
save(fname)
Save the object to file via pickling (also see load).
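
One way to picture apply(corpus) is as a lazy wrapper that transforms documents one at a time while the result is iterated over. The sketch below assumes that design; it is not taken from gensim's source, and both class names are hypothetical.

```python
class TransformedCorpusSketch:
    """Hypothetical corpus that applies a transformation to each document lazily."""

    def __init__(self, transformation, corpus):
        self.transformation = transformation
        self.corpus = corpus

    def __iter__(self):
        # Transform one document per iteration step; nothing is precomputed.
        for document in self.corpus:
            yield self.transformation[document]

    def __len__(self):
        # The transformed corpus has as many documents as the original.
        return len(self.corpus)


class ScaleTransformation:
    """Hypothetical transformation that multiplies every field value by a constant."""

    def __init__(self, factor):
        self.factor = factor

    def __getitem__(self, document):
        return [(field_id, value * self.factor) for field_id, value in document]

    def apply(self, corpus):
        # Return another corpus rather than a list of transformed documents.
        return TransformedCorpusSketch(self, corpus)


corpus = [[(0, 1.0), (2, 3.0)], [(1, 2.0)]]
scaled = ScaleTransformation(10.0).apply(corpus)
print(list(scaled))
```

Since the result is itself an iterable of documents, it can be passed to another transformation's apply, which lets transformations be chained without holding the whole corpus in memory.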
