Please use this identifier to cite or link to this item:
http://hdl.handle.net/10119/18719
|
Title: | Increasing speech intelligibility and naturalness in noise based on concepts of modulation spectrum and modulation transfer function |
Authors: | Ngo, Thuanvan; Kubo, Rieko; Akagi, Masato |
Keywords: | Modulation transfer function; modulation spectrum; intelligibility |
Issue date: | 2021-10-01 |
Publisher: | Elsevier |
Journal title: | Speech Communication |
Volume: | 135 |
Start page: | 11 |
End page: | 24 |
DOI: | 10.1016/j.specom.2021.09.004 |
Abstract: | This study focuses on identifying effective features for controlling speech to increase speech intelligibility under adverse conditions. Previous approaches either cancel noise throughout speech presentation or preprocess speech by controlling its intensity and/or spectra. Among them, a method based on modulation transfer function theory, inverting the environmental effects to anticipate attenuation of speech modulation spectrum, shows excellent potential due to its systematic and explicit derivation of intelligibility enhancement against environmental smears. However, strictly following the inverse modulation transfer function is dangerous and inefficient as important speech features can be damaged, and it costs lots of energy to boost all smeared regions. This study takes a different approach: analyzing the relations of smeared modulation spectra by the environments for intelligibility to extract effective modifying features. First, we conduct listening tests for intelligibility in noise with different types of enhanced speech. Next, we extract acoustic and modulation frequency components in the smeared modulation spectra by noise showing high correlation with intelligibility scores. Finally, we examine the intelligibility benefits of modifying these components by performing listening tests. The results show that these components effectively increase intelligibility by at most 18%, which demonstrates that our concept is valid. |
Rights: | Copyright (C)2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0). [http://creativecommons.org/licenses/by-nc-nd/4.0/] NOTICE: This is the author's version of a work accepted for publication by Elsevier. Thuanvan Ngo, Rieko Kubo, Masato Akagi, Speech Communication 135, 2021, 11-24, https://doi.org/10.1016/j.specom.2021.09.004 |
URI: | http://hdl.handle.net/10119/18719 |
Resource type: | author |
Appears in collections: | b10-1. Journal Articles
|
Files in this item:
File | Description | Size | Format |
M-AKAGI-I-1115.pdf | | 3082Kb | Adobe PDF |
|
All items stored in this system are protected by copyright.
|