Audio-Aware Spoken Multiple-Choice Question Answering With Pre-Trained Language Models

By: Chia-Chih Kuo; Kuan-Yu Chen; Shang-Bao Luo

Spoken multiple-choice question answering (SMCQA) requires a machine to select the correct choice for a question by referring to a passage, where the passage, the question, and the choices are all given as speech. Although the audio may contain useful cues for SMCQA, usually only the auto-transcribed text is used in model development. Thanks to large-scale pre-trained language representation models, such as bidirectional encoder representations from Transformers (BERT), systems that use only auto-transcribed text can still achieve a certain level of performance. However, previous studies have shown that acoustic-level statistics can offset text inaccuracies caused by automatic speech recognition systems or representation inadequacies in word embedding generators, thereby making the SMCQA system more robust. Along this line of research, this study proposes an audio-aware SMCQA framework. Two different mechanisms are introduced to distill useful cues from speech, and a BERT-based SMCQA framework is then presented. In other words, the proposed framework not only inherits the advantages of the contextualized language representations learned by BERT but also integrates the complementary acoustic-level information distilled from audio with the text-level information. A series of experiments demonstrates remarkable improvements in accuracy over selected baselines and state-of-the-art systems on a published Chinese SMCQA dataset.
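
The abstract does not specify the two audio-distillation mechanisms or how the audio and text information are combined, so the following is only a minimal illustrative sketch of the general idea of combining a text-level vector with an acoustic-level vector for each choice and then selecting the highest-scoring choice. Everything in it is an assumption made for illustration: the function names (`fuse`, `score_choices`, `pick_answer`), the use of simple concatenation as the fusion step, the linear scoring weights, and the toy vectors. It is not the authors' method, and in a real system the text vectors would come from a model such as BERT and the audio vectors from an acoustic encoder.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_vec: Sequence[float], audio_vec: Sequence[float]) -> List[float]:
    """Hypothetical fusion step: plain concatenation of the two vectors."""
    return list(text_vec) + list(audio_vec)


def score_choices(
    text_vecs: Sequence[Sequence[float]],
    audio_vecs: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Score each choice with a linear layer over its fused vector.

    text_vecs[i] and audio_vecs[i] stand in for the text-level and
    acoustic-level representations of choice i (assumed, not from the paper).
    Returns a probability distribution over the choices.
    """
    logits = []
    for t, a in zip(text_vecs, audio_vecs):
        fused = fuse(t, a)
        logits.append(sum(w * x for w, x in zip(weights, fused)))
    return softmax(logits)


def pick_answer(probs: Sequence[float]) -> int:
    """Return the index of the most probable choice."""
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Toy data: four choices, 2-dim text vectors, 1-dim audio vectors.
    text_vecs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.1]]
    audio_vecs = [[0.0], [2.0], [0.0], [0.0]]
    weights = [1.0, 1.0, 1.0]  # hypothetical, untrained weights
    probs = score_choices(text_vecs, audio_vecs, weights)
    print(pick_answer(probs), probs)
```

In this toy setup the audio vector is what separates the second choice from the others, which mirrors the abstract's point that acoustic-level information can complement the text-level representation.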
