An Overview of the AT&T Spoken Document Retrieval System

Abstract

We present an overview of a spoken document retrieval system developed at AT&T Labs-Research for the HUB4 Broadcast News corpus. The overview describes the system's intonational phrase boundary detection, classification, speech recognition, information retrieval, and user interface components, and reports updated system assessments based on the 49-query task defined for the TREC-6 SDR track. We also present results from a comparative ranking study based on queries taken from AP Newswire headlines from the same period in which the Broadcast News corpus was recorded. For the AP task, retrieval accuracy is assessed by comparing the documents retrieved from ASR-generated transcriptions with those retrieved from human-generated transcriptions.

Description
Conference Paper
Type
Conference paper
Citation

J. Choi, D. Hindle, J. Hirschberg, I. Magrin-Chagnolleau, C. Nakatani, F. Pereira, A. Singhal and S. Whittaker, "An Overview of the AT&T Spoken Document Retrieval System," 1998.
