
To: tdt-distrib@ldc.upenn.edu
From: James Allan <allan@cs.umass.edu>
Subject: Possibly interesting workshop
Date: Tue, 13 Feb 2001 19:38:45 -0500

TDT folk,

The attached is a workshop-being-proposed at SIGIR this summer. The
proposers are interested in knowing who might attend (not a
commitment, just a statement of interest). If it looks interesting to
you, please contact them (not me).....
		    -- james



From: "Anni R Coden" <anni@us.ibm.com>
Date: Mon, 12 Feb 2001 10:43:53 -0500

Hello,

We plan to submit the workshop proposal described below to ACM SIGIR'01
(see http://www.sigir2001.org/) and are currently trying to gauge potential
interest for it in the research community. In a nutshell, the workshop
aims to explore speech recognition applications, issues, and problems in
the context of information retrieval and text analysis. If you would be
interested in attending or submitting a paper to the workshop, please reply
to this note and let us know, if possible by February 21st. Also, please
forward this to anyone else who might be interested in the topic.

Looking forward to hearing from you,

Anni, Savitha and Eric



Information Retrieval Techniques for Speech Applications
(SIGIR'2001 Workshop Proposal Draft)


Until recently, automatic speech recognition was more of a research
interest than a viable commercial application. Speech recognition
technology has now matured to the point where speech can be used to
interact with automated phone systems, control computer programs, and even
create memos and documents. Moving beyond computer control and dictation,
speech recognition has the potential to dramatically change the way we
create, capture, and store knowledge. Advances in speech recognition
technology combined with ever decreasing storage costs and processors that
double in power every eighteen months have set the stage for a whole new
era of applications that treat speech in the same way that we currently
treat text. The goal of this workshop is to explore the technical issues
involved in applying information retrieval and text analysis technologies
in the new application domains enabled by automatic speech recognition.

These possibilities, however, bring with them a number of issues,
questions, and problems. Speech-based user interfaces create different
expectations for the end user, which in turn places different demands on
the back-end systems that must interact with the user and interpret the
user's commands. Speech recognition will never be perfect, so analyses
applied to the resulting transcripts must be robust in the face of
recognition errors. The ability to capture speech and apply speech
recognition on smaller, more powerful, pervasive devices suggests that text
analysis and mining technologies can be applied in new domains never before
considered.

The types of speech recognition applications range from dictation systems
to conversational or transactional systems and audio indexing systems.
Another means of categorizing speech applications is into "active" and
"passive" categories. Active speech applications include explicit spoken
dialog with the system where speech input is used to drive the application.
Dictation and conversational applications fall in this category. Passive
speech applications analyze speech input and the results may be used as a
resource by an application. Speech mining and audio indexing applications
fall into this category. This distinction is important because active
applications emphasize speech interface and dialog design whereas passive
applications emphasize underlying information retrieval and text analysis.

In this workshop we would like to explore techniques in information
retrieval and text analysis that meet the challenges in the new application
domains enabled by automatic speech recognition.
Specifically, we would like to focus on:

1. What new IR-related applications, problems, or opportunities are
created by effective, real-time speech recognition?
2. To what extent are information retrieval methods that work on perfect
text applicable to imperfect speech transcripts?
3. What additional data representations from a speech engine may be
exploited by applications?
4. Does domain knowledge (context/voice-id) help and can it be
automatically deduced?
5. Can some of the techniques explored be beneficial in a standard IR
application?
6. What constraints are imposed by real-time speech applications?
7. Case studies of specific speech applications - either successful or
not.

The workshop will include a keynote address, a panel discussion, reviewed
paper presentations, and demos of working prototypes. All attendees should
submit a short abstract on why this topic is of interest to them, and those
wishing to submit a paper or demo should submit a 3-5 page paper.


Keynote Address:
James Allan, UMass, Amherst


Program Committee Members:
John Garofolo, NIST
Alex Hauptmann, CMU
Alan Smeaton, Dublin City University
Justin Zobel, RMIT Australia


Co-organizers:
Anni Coden, IBM T.J. Watson Research Center
Savitha Srinivasan, IBM Almaden Research Center
Eric Brown, IBM T.J. Watson Research Center


CV:

Anni Coden is a Research Staff Member at the IBM T.J. Watson Research
Center. Her most recent work focused on determining, in real time,
collateral information for a video broadcast. Jointly with Eric Brown, she
developed an architecture and system to support real-time speech
applications like data broadcast and meeting mining. Her other recent work
included question answering and video indexing and searching. Anni holds
multiple patents, several in the areas of multimedia search technologies
and speech technology, and has published extensively in different areas of
computer science. Anni received her PhD in Computer Science from MIT in
1981 and, after spending a year at MIT as a Research Scientist, joined IBM.

Savitha Srinivasan is the manager of the Multimedia Knowledge Discovery
group at IBM Almaden Research Center. After an MS in Computer Science, she
originally joined the speech group at IBM T.J. Watson Research Center in
1990 as a speech applications researcher. She worked on the design and
development of IBM's speech recognition products such as the IBM Speech
Server Series and Medspeak Radiology. She moved to IBM Almaden Research in
1997 and has since been working on applications of speech recognition
to multimedia indexing. She has several patents and publications in speech
recognition-related applications and technology, and in multimedia
information retrieval.

Eric Brown is a Research Staff Member at the IBM T.J. Watson Research
Center, Hawthorne, NY. Eric earned his PhD in Computer Science from the
University of Massachusetts, Amherst. Bruce Croft advised Eric's thesis
work. Eric has worked on performance issues in IR systems, parallel and
distributed text search, text categorization, question answering, and
applications of automatic speech recognition to knowledge management
problems. Eric has published in numerous database and information
retrieval conferences, won the Best Student Paper award at SIGIR'95, was
Volunteers Chair for SIGIR'97, co-chaired the Hypertext IR workshop at
SIGIR'98, is currently the Membership Liaison for SIGIR, and is a member of
the ACM and Sigma Xi.

------------------------------------------------------------------------------------

Dr. Anni R. Coden
T.J. Watson Research Center
30 Sawmill River Road 1S-A16
Hawthorne, NY 10532
914 784 7073, t/l 863 7073 (tel)
914 784 6307 (fax)
anni@us.ibm.com

Last updated Mon Mar 5 14:36:37 2001