Download

Get gensim from the Python Package Index or install directly with:

easy_install -U gensim

Questions? Suggestions?

Subscribe to the Google group

You can also open an issue on the GitHub issue tracker.

Gensim – Python Framework for Vector Space Modelling

For an overview of what you can (or cannot) do with gensim, go to the introduction.

For installation and troubleshooting, see the installation page and the gensim discussion group.

For examples of how to use gensim, see the tutorials.

When citing gensim in academic papers, please use this BibTeX entry.

Quick Reference Example

>>> from gensim import corpora, models, similarities
>>>
>>> # load corpus iterator from a Matrix Market file on disk
>>> corpus = corpora.MmCorpus('/path/to/corpus.mm')
>>>
>>> # initialize a transformation (Latent Semantic Indexing with 200 latent dimensions)
>>> lsi = models.LsiModel(corpus, num_topics=200)
>>>
>>> # convert another corpus (any iterable of bag-of-words vectors) to the latent space and index it
>>> index = similarities.MatrixSimilarity(lsi[another_corpus])
>>>
>>> # the query must already be a vector in LSI space, e.g. lsi[bow_vector];
>>> # the result holds its similarity to every document in the index
>>> sims = index[query]
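
To make the last step concrete, here is a minimal plain-Python sketch of what a similarity query conceptually returns: one cosine similarity score per indexed document. It is not gensim's implementation. The helper names `cosine` and `query_index` and the toy vectors are hypothetical, chosen for illustration, and the sketch only assumes gensim's sparse vector convention of `(id, weight)` pairs.

```python
import math

def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse vectors given as (id, weight) pairs."""
    a, b = dict(vec_a), dict(vec_b)
    dot = sum(w * b.get(i, 0.0) for i, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

def query_index(indexed_docs, query):
    """Return one similarity score per indexed document, in corpus order."""
    return [cosine(doc, query) for doc in indexed_docs]

# Hypothetical toy data: three documents in a 2-dimensional latent space.
toy_index = [
    [(0, 1.0), (1, 0.0)],
    [(0, 0.0), (1, 2.0)],
    [(0, 3.0), (1, 3.0)],
]
toy_query = [(0, 1.0), (1, 0.0)]

sims = query_index(toy_index, toy_query)
```

The scores come back in the same order as the indexed documents, so position `i` in `sims` refers to document `i` of the indexed corpus. A real index avoids the per-document Python loop, which is why the example above builds a `MatrixSimilarity` object instead.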