Extracting Semantically-Coherent Keyphrases from Speech

Journal title: Journal of the Canadian Acoustics Association
Pages: 130-131; # of pages: 2
Abstract: Browsing through large volumes of spoken audio is known to be a challenging task for end users. One way to facilitate this task is to provide keyphrases extracted from the audio, allowing users to quickly get the gist of the audio document or of sections of it.

Previous methods for extracting keyphrases from spoken audio have applied text-based summarization techniques to automatic speech transcriptions. The method of Désilets et al. (2000) was found to produce accurate keyphrases for transcriptions with word error rates (WERs) on the order of 25%, but performance was less than ideal for transcripts with WERs on the order of 60%. With such transcripts, a large proportion of the extracted keyphrases included serious transcription errors.

In this paper, we extend those previous methods by taking advantage of the fact that mistranscribed keyphrases tend to have low semantic coherence with correctly transcribed ones. We measure semantic coherence by computing the Pointwise Mutual Information (PMI) of phrases in a large, terabyte-scale corpus, and use that measure to filter semantic outliers from the list of extracted keyphrases. We evaluated the effectiveness of the technique and found that it removes half of the mistranscribed keyphrases, while removing at most 15% of the correctly transcribed keyphrases.
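The filtering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how PMI scores are aggregated or thresholded, so the averaging over the other candidates, the `floor` value used for phrase pairs that never co-occur, the `threshold`, and all counts below are assumptions made for illustration only. PMI itself is the standard log2(p(x, y) / (p(x) p(y))).

```python
import math


def pmi(count_xy, count_x, count_y, total, floor=0.0):
    """Standard PMI, log2(p(x,y) / (p(x) * p(y))), estimated from raw counts.

    `floor` is an assumed fallback score for pairs that never co-occur.
    """
    if count_xy == 0:
        return floor
    return math.log2(count_xy * total / (count_x * count_y))


def mean_coherence(phrase, phrases, counts, cooccur, total):
    """Average PMI between `phrase` and every other candidate keyphrase."""
    others = [q for q in phrases if q != phrase]
    if not others:
        return float("inf")  # nothing to compare against, so never an outlier
    scores = [
        pmi(cooccur.get(frozenset((phrase, q)), 0), counts[phrase], counts[q], total)
        for q in others
    ]
    return sum(scores) / len(scores)


def filter_outliers(phrases, counts, cooccur, total, threshold):
    """Split candidates into (kept, removed) by mean coherence (assumed rule)."""
    kept, removed = [], []
    for p in phrases:
        score = mean_coherence(p, phrases, counts, cooccur, total)
        (kept if score >= threshold else removed).append(p)
    return kept, removed


# Hypothetical corpus statistics: made-up numbers, not data from the paper.
TOTAL = 1_000_000
phrases = ["speech recognition", "acoustic model", "word error rate",
           "wreck a nice beach"]
counts = {"speech recognition": 1000, "acoustic model": 800,
          "word error rate": 500, "wreck a nice beach": 600}
cooccur = {
    frozenset(("speech recognition", "acoustic model")): 200,
    frozenset(("speech recognition", "word error rate")): 150,
    frozenset(("acoustic model", "word error rate")): 100,
    frozenset(("speech recognition", "wreck a nice beach")): 1,
}

kept, removed = filter_outliers(phrases, counts, cooccur, TOTAL, threshold=2.0)
print("kept:", kept)
print("removed:", removed)
```

With these toy counts, the mistranscription-like phrase scores far below the others because it rarely co-occurs with them in the (hypothetical) corpus, so it is the one filtered out. In practice the threshold would need tuning against held-out transcripts.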
Publication date
Affiliation: National Research Council Canada; NRC Institute for Information Technology
Peer reviewed: No
NRC number: 47387
NPARC number: 5765099
Record identifier: 1d95b252-4d49-4408-90ac-7b35acd957ec
Record created: 2009-03-29
Record modified: 2016-05-09