Proceedings of the National Academy of Sciences

Published online on March 22, 2004, 10.1073/pnas.0400341101
PNAS | April 6, 2004 | vol. 101 | Suppl. 1 | 5214-5219



COLLOQUIUM PAPERS
From paragraph to graph: Latent semantic analysis for information visualization

Thomas K. Landauer *†‡, Darrell Laham †, and Marcia Derr †

*Department of Psychology, University of Colorado, Boulder, CO 80309-0345; and †Knowledge Analysis Technologies, Boulder, CO 80301

Most techniques for relating textual information rely on intellectually created links such as author-chosen keywords and titles, authority indexing terms, or bibliographic citations. Similarity of the semantic content of whole documents, rather than just titles, abstracts, or overlap of keywords, offers an attractive alternative. Latent semantic analysis provides an effective dimension reduction method for the purpose that reflects synonymy and the sense of arbitrary word combinations. However, latent semantic analysis correlations with human text-to-text similarity judgments are often empirically highest at ≈300 dimensions. Thus, two- or three-dimensional visualizations are severely limited in what they can show, and the first and/or second automatically discovered principal component, or any three such for that matter, rarely capture all of the relations that might be of interest. It is our conjecture that linguistic meaning is intrinsically and irreducibly very high dimensional. Thus, some method to explore a high dimensional similarity space is needed. But the 2.7 × 10^7 projections and infinite rotations of, for example, a 300-dimensional pattern are impossible to examine. We suggest, however, that the use of a high dimensional dynamic viewer with an effective projection pursuit routine and user control, coupled with the exquisite abilities of the human visual system to extract information about objects and from moving patterns, can often succeed in discovering multiple revealing views that are missed by current computational algorithms. We show some examples of the use of latent semantic analysis to support such visualizations and offer views on future needs.
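
As an illustration of the ideas in the abstract, the following minimal Python sketch builds a small term-by-document matrix, reduces it with a singular value decomposition, and compares documents by cosine. It is not the authors' implementation. The function names (`lsa_vectors`, `cosine`, `axis_projection_count`), the toy documents, and the simple log weighting are assumptions made for illustration. The last function rests on one plausible reading of the 2.7 × 10^7 figure, namely the number of ordered choices of three axes out of 300; the abstract itself does not say how the figure was derived.

```python
import math
import numpy as np

def lsa_vectors(docs, k):
    """Return one k-dimensional LSA vector per document (rows of the result)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    # Term-by-document count matrix: rows are terms, columns are documents.
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            counts[index[w], j] += 1.0
    # Assumption: a plain log(1 + count) weighting, chosen for brevity.
    weighted = np.log1p(counts)
    # Singular value decomposition; keep only the k largest singular values.
    u, s, vt = np.linalg.svd(weighted, full_matrices=False)
    k = min(k, len(s))
    # Document coordinates in the reduced space: columns of diag(s_k) @ vt_k.
    return (s[:k, None] * vt[:k]).T

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def axis_projection_count(dims, view_dims=3):
    """Ordered selections of view_dims axes out of dims (assumed reading)."""
    return math.perm(dims, view_dims)

# Hypothetical toy corpus: two documents about pets, one about finance.
docs = ["cat dog pet", "cat dog animal", "stock market bond"]
vecs = lsa_vectors(docs, k=2)
print(cosine(vecs[0], vecs[1]))    # the two pet documents
print(cosine(vecs[0], vecs[2]))    # a pet document versus the finance document
print(axis_projection_count(300))  # 300 * 299 * 298 = 26730600
```

In this toy corpus the two pet documents collapse onto the same direction in the reduced space even though each contains a word the other lacks, while the finance document stays orthogonal to both. With real text, far more dimensions are retained (the abstract cites roughly 300), which is why low-dimensional views can show only a small part of the structure.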


This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, "Mapping Knowledge Domains," held May 9-11, 2003, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA.

Abbreviations: LSA, latent semantic analysis; SVD, singular value decomposition; MeSH, medical subject heading; cos, cosine.

‡ To whom correspondence should be addressed. E-mail: landauer@psych.colorado.edu.




