Speaker Detection in Broadcast Speech Databases

dc.citation.bibtexName | inproceedings | en_US
dc.citation.conferenceName | Proceedings of International Conference on Spoken Language Processing | en_US
dc.contributor.author | Rosenberg, Aaron | en_US
dc.contributor.author | Magrin-Chagnolleau, Ivan | en_US
dc.contributor.author | Parthasarathy, S. | en_US
dc.contributor.org | Digital Signal Processing (http://dsp.rice.edu/) | en_US
dc.date.accessioned | 2007-10-31T01:03:05Z | en_US
dc.date.available | 2007-10-31T01:03:05Z | en_US
dc.date.issued | 1998-01-15 | en_US
dc.date.modified | 2004-11-04 | en_US
dc.date.note | 2004-01-14 | en_US
dc.date.submitted | 1998-01-15 | en_US
dc.description | Conference Paper | en_US
dc.description.abstract | Experiments have been carried out to assess the feasibility of detecting target speaker segments in multi-speaker broadcast databases. The experimental database consists of NBC Nightly News broadcasts. The target speaker is the news anchor, Tom Brokaw. Gaussian mixture models are constructed from labelled training data for the target speaker, along with background models for other speakers, commercials, and music. Four labelled 30-min. broadcasts are used for testing. Mel-frequency cepstral features, augmented by delta cepstral features, are calculated over 20 msec. windows shifted every 10 msec. through a broadcast. Likelihood ratio scores are calculated for each test frame and averaged over blocks of frames of a specified duration. The block scores are input to a detection routine which returns estimates of target segment boundaries. The range of best results obtained over the test broadcasts is 82% to 100% detection of target segments, with segment frame accuracy ranging from 86% to 95%. Between 0 and 2 false alarm segments are detected in each 30-min. broadcast. | en_US
dc.identifier.citation | A. Rosenberg, I. Magrin-Chagnolleau and S. Parthasarathy, "Speaker Detection in Broadcast Speech Databases," 1998. | en_US
dc.identifier.uri | https://hdl.handle.net/1911/20304 | en_US
dc.language.iso | eng | en_US
dc.subject | Temporary | en_US
dc.subject.keyword | Temporary | en_US
dc.subject.other | Signal Processing Applications | en_US
dc.title | Speaker Detection in Broadcast Speech Databases | en_US
dc.type | Conference paper | en_US
dc.type.dcmi | Text | en_US
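
The block scoring and segment detection steps summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes per-frame log-likelihoods under the target and background models are already available, that blocks do not overlap, and that detection is a simple threshold on the block score; the abstract specifies none of these details. The function names `block_scores` and `detect_segments` are hypothetical.

```python
import numpy as np


def block_scores(target_ll, background_ll, block_len):
    """Average per-frame log-likelihood ratios over non-overlapping blocks.

    target_ll, background_ll: per-frame log-likelihoods (assumed inputs).
    block_len: frames per block; with a 10 msec. frame shift,
               100 frames correspond to roughly 1 second.
    """
    llr = np.asarray(target_ll, dtype=float) - np.asarray(background_ll, dtype=float)
    n_blocks = len(llr) // block_len  # a trailing partial block is dropped
    return llr[: n_blocks * block_len].reshape(n_blocks, block_len).mean(axis=1)


def detect_segments(scores, threshold=0.0):
    """Return (start, end) block indices, end exclusive, where score > threshold.

    Plain thresholding is an assumption made for illustration only.
    """
    above = np.concatenate(([False], np.asarray(scores) > threshold, [False]))
    # Indices where the thresholded signal changes: rises and falls alternate.
    edges = np.flatnonzero(above[1:] != above[:-1])
    return [(int(s), int(e)) for s, e in zip(edges[0::2], edges[1::2])]


# Toy data: 8 frames, target model favored in frames 2 to 5.
target = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
background = [0.0] * 8
scores = block_scores(target, background, block_len=2)
segments = detect_segments(scores, threshold=0.5)
```

Multiplying the returned block indices by `block_len` and by the frame shift converts them to approximate segment times in the broadcast.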
Files
Original bundle
Name: Ros1998Non5SpeakerDe.PDF
Size: 146.06 KB
Format: Adobe Portable Document Format
Name: Ros1998Non5SpeakerDe.PS
Size: 295.39 KB
Format: Postscript Files