Download

Get gensim version 0.8.6 from the Python Package Index or install directly with:

easy_install -U gensim

Questions? Suggestions?

Join the Google discussion group

Check the source code on GitHub. Report bugs on the issue tracker.

Gensim – Topic Modelling for Humans

Quick Reference Example

>>> from gensim import corpora, models, similarities
>>>
>>> # Load corpus iterator from a Matrix Market file on disk.
>>> corpus = corpora.MmCorpus('/path/to/corpus.mm')
>>>
>>> # Initialize a transformation (Latent Semantic Indexing with 200 latent dimensions).
>>> lsi = models.LsiModel(corpus, num_topics=200)
>>>
>>> # Convert another corpus to the latent space and index it.
>>> index = similarities.MatrixSimilarity(lsi[another_corpus])
>>>
>>> # Determine the similarity of a query document (a vector in the same LSI space) to each document in the index.
>>> sims = index[query]
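
To make the last step more concrete, here is a minimal pure-Python sketch of the kind of computation a similarity query performs: the cosine similarity between one query vector and every indexed document. It is an illustration only, not gensim's implementation. The function name `cosine_similarities` and the toy vectors are hypothetical; the only thing borrowed from gensim is the sparse vector format, a list of (feature_id, weight) pairs.

```python
import math


def cosine_similarities(query, documents):
    """Return the cosine similarity of `query` to each vector in `documents`.

    Hypothetical helper for illustration. Every vector is a sparse list of
    (feature_id, weight) pairs, the format gensim uses for documents.
    """
    def norm(vec):
        # Euclidean length of a sparse vector.
        return math.sqrt(sum(w * w for _, w in vec))

    query_weights = dict(query)
    query_norm = norm(query)
    sims = []
    for doc in documents:
        doc_norm = norm(doc)
        if query_norm == 0.0 or doc_norm == 0.0:
            # An empty vector has no direction; treat it as dissimilar.
            sims.append(0.0)
            continue
        # Only features present in both vectors contribute to the dot product.
        dot = sum(w * query_weights.get(i, 0.0) for i, w in doc)
        sims.append(dot / (query_norm * doc_norm))
    return sims


# Toy data (assumed for this sketch): three indexed documents and one query.
docs = [
    [(0, 1.0), (1, 1.0)],  # shares feature 1 with the query
    [(1, 2.0)],            # same direction as the query
    [(2, 3.0)],            # no features in common with the query
]
query = [(1, 1.0)]
sims = cosine_similarities(query, docs)
```

As with `index[query]` above, the result holds one score per indexed document, in index order. A score near 1 means the document points in nearly the same direction as the query, and 0 means they share no features.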

What’s new?