The InPhO Topic Explorer (InPhO-TE) software is © 2014-16 Trustees of Indiana University and the Indiana Philosophy Ontology (InPhO) Project.
While copyright is reserved, the software itself is released via the permissive MIT License, with open-source code available via GitHub.
Explicit rights are granted for reuse of all visualizations and other derivatives of this software via the Creative Commons Attribution 4.0 International License (CC-BY).
To satisfy the license in electronic works, please link to the primary project page at http://inphodata.cogs.indiana.edu/. If you use InPhO-TE in a published work, the authors request that you cite InPhO-TE as:
Jaimie Murdock and Colin Allen. (2015) "Visualization Techniques for Topic Model Checking" in Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI-15). Austin, Texas, USA, January 25-29, 2015. http://inphodata.cogs.indiana.edu/
Content for each corpus is copyrighted by its respective creators. Topic models trained on this content fall under fair use provisions as non-consumptive research.
This is the InPhO Topic Explorer. It allows you to explore a collection of documents using pre-trained topic models.
Topic models represent documents as mixtures of topics. The topic model is trained by an algorithm that finds latent (hidden) relationships among the documents. Typically we train models with different numbers of topics, which show different levels of detail in the documents. A model with more topics will, in general, show finer details.
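As a rough illustration of what "a mixture of topics" means, the sketch below combines two toy topics into the word distribution of a single document. The topic names, vocabulary, and probabilities are invented for this example and do not come from any Topic Explorer model.

```python
# Hypothetical toy model: two topics over a four-word vocabulary.
# All names and numbers are invented for illustration.
topic_word = {
    "ethics": {"virtue": 0.5, "duty": 0.4, "proof": 0.05, "axiom": 0.05},
    "logic": {"virtue": 0.05, "duty": 0.05, "proof": 0.5, "axiom": 0.4},
}

# A document is represented as a mixture of topics; the shares sum to 1.
doc_topics = {"ethics": 0.75, "logic": 0.25}


def word_probability(word, doc_topics, topic_word):
    """P(word | doc) = sum over topics of P(topic | doc) * P(word | topic)."""
    return sum(
        weight * topic_word[topic].get(word, 0.0)
        for topic, weight in doc_topics.items()
    )


# Word distribution implied by the document's topic mixture.
word_probs = {
    word: word_probability(word, doc_topics, topic_word)
    for word in topic_word["ethics"]
}
```

A document that is mostly "ethics" ends up favoring "virtue" and "duty", while still giving some weight to the "logic" vocabulary.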
If you would like more information about topic modeling, you can watch the video at http://inphodata.cogs.indiana.edu/ and read the article by David Blei (2012).
From this page you can explore the topic models in various ways using the large buttons in the middle of the page and the menu items on the left. On this page, you may also find some information about the specific corpus used to train the models, if the person who trained the models has provided it.
The most effective way to start exploring is via the documents in the corpus.
Once you click on the document icon, a panel will appear in which you can either click the crossed arrows button for a random document or select a focal document for your exploration by typing a few letters to match the title.
You will be shown a “fingerprint” for the selected document, and you can navigate from this to the Hypershelf focused on that document.
Clicking here will bring up a panel where you can enter a few words to identify related topics. From there you can go to the topic map that lays out and groups the topics from all the models according to their similarity. There is also additional help once you reach that page.
Clicking on a number will take you to the Hypershelf for the model with the corresponding number of topics. Use the help function on that page to find out what to do next.
Clicking on the Topics (vertical bars) icon will take you to the topic map.
This is the “Hypershelf”. It allows you to explore documents in the corpus using pre-trained topic models. If you are unfamiliar with topic modeling, you can watch the video at http://inphodata.cogs.indiana.edu/ and read an article by David Blei (2012).
The Hypershelf shows up to 40 documents that are most similar to the focal document. Each document is represented by a bar whose colors show the mixture and proportions of topics assigned to each document by the training process. The relative lengths of the bars indicate the degree of similarity to the focal document according to the topic mixtures.
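The ranking described above can be sketched as follows. This page does not say which similarity measure the Hypershelf uses, so the sketch assumes Jensen-Shannon distance between topic mixtures and treats similarity as one minus that distance. The function names and the sample mixtures are hypothetical.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs, range 0..1) between two mixtures."""
    def kl(a, b):
        # Kullback-Leibler divergence; zero-probability terms contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    midpoint = [(x + y) / 2 for x, y in zip(p, q)]
    divergence = (kl(p, midpoint) + kl(q, midpoint)) / 2
    return math.sqrt(max(divergence, 0.0))


def most_similar(focal_id, mixtures, limit=40):
    """Return up to `limit` (doc_id, similarity) pairs, most similar first."""
    focal = mixtures[focal_id]
    scored = [
        (doc_id, 1.0 - js_distance(focal, mixture))  # assumed similarity measure
        for doc_id, mixture in mixtures.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Invented topic mixtures for three documents under a three-topic model.
mixtures = {
    "a": [0.7, 0.2, 0.1],
    "b": [0.6, 0.3, 0.1],
    "c": [0.1, 0.1, 0.8],
}
ranked = most_similar("a", mixtures)
```

Under this assumption the focal document always ranks first with a similarity of 1, and each similarity score could serve as the relative bar length.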
Rolling over a colored segment shows the highest-probability words associated with the topic. The key on the right shows all the topics identified by the model. If you click on a topic in the bar or the key, the display will re-sort the current documents according to that topic. In this topic-sorted mode, a Top Documents button appears at the top that lets you retrieve the documents from the entire corpus that are most similar to that topic.
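A minimal sketch of these three operations, assuming that "sorting by a topic" means ordering documents by that topic's share of each mixture. The function names, the `limit` parameter, and the sample data are hypothetical and are not taken from the Topic Explorer's code.

```python
def top_words(word_probs, n=5):
    """Return the n highest-probability words of one topic."""
    return sorted(word_probs, key=word_probs.get, reverse=True)[:n]


def sort_by_topic(doc_ids, mixtures, topic):
    """Order the given documents by the share of `topic`, largest first."""
    return sorted(doc_ids, key=lambda doc_id: mixtures[doc_id][topic], reverse=True)


def top_documents(mixtures, topic, limit=10):
    """Search the whole corpus, not just the documents currently shown."""
    return sort_by_topic(list(mixtures), mixtures, topic)[:limit]


# Invented mixtures for a two-topic model; "doc4" is not on the current shelf.
mixtures = {
    "doc1": [0.7, 0.3],
    "doc2": [0.2, 0.8],
    "doc3": [0.5, 0.5],
    "doc4": [0.9, 0.1],
}
shelf = ["doc1", "doc2", "doc3"]

shelf_sorted = sort_by_topic(shelf, mixtures, topic=1)
corpus_top = top_documents(mixtures, topic=0, limit=2)
words = top_words({"proof": 0.5, "axiom": 0.4, "duty": 0.1}, n=2)
```

The difference between the two modes is only the set of documents being ranked: the topic sort reorders what is already displayed, while Top Documents draws from the full corpus.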
To select a new focal document you can:
You may use the button to the right of the random document button to visualize the focal document, and you may use the dropdown menu attached to that button to switch to a model with a different number of topics.
Below the key are some additional display options that let you sort the displayed documents alphabetically or normalize the bar lengths so that you can compare the document mixtures more directly.
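The effect of these two options can be sketched as below, assuming each bar is drawn as one segment per topic and that an unnormalized bar is scaled by the document's similarity to the focal document. The function names and numbers are hypothetical.

```python
def bar_segments(mixture, similarity, normalize=False):
    """Return segment widths for one document bar (full width = 1.0).

    Unnormalized bars are scaled by similarity to the focal document;
    normalized bars all span the full width, so mixtures compare directly.
    """
    scale = 1.0 if normalize else similarity
    return [share * scale for share in mixture]


def sort_alphabetically(titles):
    """Case-insensitive alphabetical ordering of document titles."""
    return sorted(titles, key=str.casefold)


raw = bar_segments([0.5, 0.3, 0.2], similarity=0.6)
full = bar_segments([0.5, 0.3, 0.2], similarity=0.6, normalize=True)
titles = sort_alphabetically(["zeno", "Aristotle", "kant"])
```

Normalizing removes the similarity information from the bar length, which is why it makes the topic proportions easier to compare across documents.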
Other icons to the left of each topic bar allow you to view the document contents, or see a "fingerprint" of the topic mixtures for that document in all the available models with different numbers of topics. Clicking on a bar in the fingerprint will take you to a Hypershelf focused on the selected document with that given model.
The numbers in the menu on the left can be used to navigate directly to a model with that number of topics.
Above the numbers on the left, the topic cluster button will take you to a different interface that lets you explore topic similarity across the models.
The home button at the top left will take you to a general information page about the corpus and models.
This is the topic map. It places the topics from all the trained models on a two-dimensional map that attempts to place similar topics close to each other.
The clusters and colors are determined automatically by an algorithm, and provide only a rough guide to groups of topics that have similar themes. The different axes also do not have any intrinsic meaning, but may be surprisingly interpretable.
You can control which models are included in the map by clicking on the numbers on the left to toggle the corresponding models off and on.
You may also enter words in the search box to have the isomap change shading to help you find related topics.
Clicking on any topic circle will take you to the Hypershelf with the top documents for that topic already selected.