Wednesday, April 02, 2008

[Gfx-cafe] GFX Cafe Seminar Friday April 4, 2008

GFX Café Seminar Friday April 4, 2008
12 noon, ARTS Lab Black Box Theatre ** Note change of venue **

Food will be served

Using Visualization for Relevancy Feedback Tuning
of Text Analysis Algorithms
by Patricia Crossno, Sandia National Laboratories

The volume of data contained in textual form is enormous. Automated and
scalable methods are needed to evaluate the contents of document
collections without reading them. The ParaText project at Sandia National
Laboratories is creating a scalable text analysis engine that uses
statistical methods, such as Latent Semantic Analysis (LSA), to evaluate
the concepts found within a large corpus of documents. Using LSA to
extract concepts and relationships between documents, the corpus can be
interactively explored through a visual application where documents are
grouped by concept within a landscape metaphor.
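
For readers unfamiliar with LSA, the idea can be sketched in a few lines of
Python with NumPy. This is a minimal illustration only, not ParaText's
implementation: the toy corpus, the raw term counts (no weighting), and the
helper names term_document_matrix, lsa_document_vectors, and
cosine_similarity_matrix are all assumptions made for the example.

```python
import numpy as np

# Toy corpus (hypothetical documents, not taken from ParaText).
docs = [
    "graphics rendering volume rendering",
    "volume rendering gpu graphics",
    "text analysis document corpus",
    "document corpus text query",
]

def term_document_matrix(docs):
    """Build a raw term-count matrix: rows are terms, columns are documents."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0
    return A, vocab

def lsa_document_vectors(A, k):
    """Project documents into a k-dimensional concept space via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep the k largest singular values; each row is one document.
    return (np.diag(s[:k]) @ Vt[:k, :]).T

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    return Xn @ Xn.T

A, vocab = term_document_matrix(docs)
doc_vecs = lsa_document_vectors(A, k=2)
sim = cosine_similarity_matrix(doc_vecs)
print(np.round(sim, 2))
```

In this toy corpus the two graphics documents share no terms with the two
text documents, so the rank-2 concept space separates them into two groups,
which is the kind of grouping by concept the abstract describes.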

As we are developing ParaText, we are using visualization to assess
the impact of various algorithmic choices on the relevancy of documents
returned by queries to the engine (i.e., we are assessing how the
document-concept relationships change as parameter values vary). We have
created a visual analytics tool, LSAView, for presenting statistical
information and correlations. LSAView uses multiple linked views of
document-similarity graphs and the difference matrices between them to
enable exploration of various configurations. LSAView is a work in
progress, so we are just starting to use it to evaluate questions about
how altering the statistical bias of our matrices impacts selection
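
One plausible reading of a "difference matrix" between two configurations
can be sketched as follows. This is an assumption-laden illustration, not
LSAView's actual computation: it compares document-similarity matrices
produced by two choices of the LSA rank k, and the count matrix and the
function name doc_similarity are hypothetical.

```python
import numpy as np

def doc_similarity(A, k):
    """Cosine similarity between documents in a rank-k LSA space."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    X = (s[:k, None] * Vt[:k, :]).T          # one row per document
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.where(norms == 0, 1.0, norms)
    return X @ X.T

# Hypothetical 5-term x 4-document count matrix.
A = np.array([
    [2, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
], dtype=float)

# Two configurations that differ only in the number of retained concepts.
sim_low = doc_similarity(A, k=2)
sim_high = doc_similarity(A, k=3)

# Entry (i, j) shows how much the similarity of documents i and j
# changes between the two configurations.
diff = sim_high - sim_low

# Locate the document pair most affected by the parameter change.
i, j = np.unravel_index(np.argmax(np.abs(diff)), diff.shape)
print(np.round(diff, 3))
print("most affected pair:", (int(i), int(j)))
```

Large entries in diff flag document pairs whose relationship is sensitive to
the parameter, which is the sort of effect a linked view of two similarity
graphs could highlight.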

Patricia Crossno is a Senior Member of Technical Staff at Sandia National
Laboratories*. She received her B.S. degree in Computer Science at UNM
in 1982. For the next nine years, she worked for Digital Equipment
Corporation as a software engineer on projects in image
processing, color compression, and graphics. In 1991, she completed her
M.S. degree in Computer Science at UNM. After finishing her doctorate in
Computer Science at UNM in 1998, she joined the Data Analysis and
Visualization department at Sandia. Her work has included isosurface
generation using particle systems, parallel marching cubes, visual
debugging, tensor visualization, GPU-accelerated volume rendering, and
visualizing temporal attributes using abstract metaphors. Currently, she
is working on scalable text analysis and visualization of electrical
circuit simulation results.

*Sandia is a multi-program laboratory operated by Sandia Corporation, a
Lockheed Martin Company, for the United States Department of Energy under
contract DE-AC04-94AL85000.

Pradeep Sen
Assistant Professor
Advanced Graphics Lab
Dept. of Electrical & Computer Engineering
University of New Mexico

