Result: Hybridization of Acoustic and Visual Features of Polish Sibilants Produced by Children for Computer Speech Diagnosis.
Further information
Speech disorders are significant barriers to a child's balanced development. Many children in Poland are affected by lisps (sigmatism), the incorrect articulation of sibilants. Because speech therapy diagnostics is complex and multifaceted, developing computer-assisted methods is crucial. This paper assesses the usefulness of hybrid feature vectors extracted from multimodal (video and audio) data for assessing the place of articulation of the sibilants /s/ and /ʂ/. We used acoustic features and, new in this field, visual parameters describing the texture and shape of selected articulators. Statistical tests on the hybrid feature vectors revealed differences between sibilant realizations corresponding to different articulation patterns. For /s/, 35 variables differentiated dental from interdental pronunciation, 24 of which were visual (textural and shape). For /ʂ/, we found 49 statistically significant variables whose distributions differed between speaker groups (alveolar, dental, and postalveolar articulation), and the dominant feature type was noise-band acoustic. Our study suggests that hybridizing the acoustic description with video processing provides richer diagnostic information.
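The per-variable screening described in the abstract can be illustrated with a minimal sketch. The abstract does not name the statistical tests used, so the two-sided permutation test below is a stand-in chosen for illustration, not the authors' method. The feature names, the dictionary layout, the 0.05 threshold, and all numeric values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference of group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of speakers
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


def significant_features(features_a, features_b, alpha=0.05):
    """Return names of features whose values differ between two speaker groups.

    features_a / features_b map a feature name to the list of values
    observed in each group (a hypothetical layout for a hybrid vector).
    """
    selected = []
    for name in features_a:
        p = permutation_p_value(features_a[name], features_b[name])
        if p < alpha:
            selected.append(name)
    return selected


# Toy values invented purely for illustration (not data from the study).
dental = {
    "acoustic_noise_band": [0.9, 1.1, 1.0, 1.2, 0.8, 1.0],
    "visual_lip_shape": [5.0, 5.2, 4.9, 5.1, 5.3, 5.0],
}
interdental = {
    "acoustic_noise_band": [1.0, 0.9, 1.1, 1.1, 1.2, 0.8],
    "visual_lip_shape": [7.0, 7.1, 6.8, 7.3, 6.9, 7.2],
}

print(significant_features(dental, interdental))
```

In this toy setup only the visual feature separates the two groups. A three-group comparison such as the one reported for /ʂ/ would need a multi-group test, and screening dozens of variables would normally call for a multiple-comparison correction; neither is shown here.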