Result: Speaker dependent voice recognition with word-tense association and part-of-speech tagging

Title:
Speaker dependent voice recognition with word-tense association and part-of-speech tagging
Added Details:
Coyle, Eric J., advisor
Embry-Riddle Aeronautical University. Department of Mechanical Engineering.
Call Numbers:
TK7895.S65 E76 2014eb
Physical Description:
1 online resource (xi, 70 leaves) : illustrations (chiefly color)
Availability:
Open access content.
Note:
Also available in print.
"Daytona Beach, Florida, December 2014."
Includes bibliographical references (leaves 57-61).
Voice recognition ; Natural language processing ; Relation to existing research ; Applications -- Noise filtering. Basics of digital signal processing and filtration ; Butterworth filter ; Data and results -- Speech segmentation. Short time energy & zero-crossing rate with data & results -- Speaker recognition. Data collection process ; Neural networks ; Data & results -- Language modelling. Relation to previous sections ; Part-of-speech tagging ; Word-tense tagging -- Conclusions & recommendations -- References -- Appendix A. Internal review board application & paragraph used for audio recitation -- Appendix B. Speaker recognition and audio feature system graphic user interface.
Other Numbers:
FER oai:commons.erau.edu:edt-1271
1014343578
Contributing Source:
From OAIster®, provided by the OCLC Cooperative.
Accession Number:
edsoai.on1014343578
Database:
OAIster

Further information

This thesis deals with speaker recognition and natural language processing. The most common speaker recognition systems are text-dependent and identify the speaker after a key word or phrase is uttered. This thesis presents text-independent speaker recognition systems that combine research on noise filtering, speech segmentation, feature extraction, speaker verification and, finally, partial language modelling. Filtering is accomplished using 4th-order Butterworth band-pass filters to dampen ambient noise outside the normal speech frequencies of 300 Hz to 3000 Hz. Speech segmentation uses Hamming windows to segment the speech, after which speech detection is performed by calculating the short-time energy and zero-crossing rate over a particular time period and separating voiced from unvoiced segments using a threshold. Audio data collected from different people is run consecutively through a speaker training and recognition algorithm, which uses neural networks to create a training group and a target group for the recognition process. The output of the segmentation module is then processed by the neural network to recognize the speaker. Though not implemented here due to database and computational requirements, the last module suggests a new model for part-of-speech tagging that combines Artificial Neural Networks (ANN) and Hidden Markov Models (HMM) in a series configuration to achieve higher accuracy. This differs from existing research by diverging from the usual approach of creating a single hybrid ANN and HMM model.
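
The segmentation step described in the abstract (Hamming windows, then short-time energy and zero-crossing rate compared against a threshold) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the frame length and hop size, the two threshold values, and the decision rule (a frame counts as voiced when its energy is high and its zero-crossing rate is low) are all assumptions chosen for illustration.

```python
import math


def hamming(n):
    """Return a Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frames(signal, frame_len, hop):
    """Yield consecutive frames of frame_len samples, advancing by hop samples."""
    for start in range(0, len(signal) - frame_len + 1, hop):
        yield signal[start:start + frame_len]


def short_time_energy(frame, window):
    """Mean squared amplitude of the windowed frame."""
    return sum((s * w) ** 2 for s, w in zip(frame, window)) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_voiced(signal, frame_len=200, hop=100,
                  energy_thresh=0.01, zcr_thresh=0.3):
    """Return one boolean per frame: True where the frame looks voiced.

    Assumed rule (not taken from the thesis): voiced speech has high
    short-time energy and a low zero-crossing rate. Both thresholds are
    illustrative and would need tuning on real recordings.
    """
    window = hamming(frame_len)
    flags = []
    for frame in frames(signal, frame_len, hop):
        ste = short_time_energy(frame, window)
        zcr = zero_crossing_rate(frame)
        flags.append(ste > energy_thresh and zcr < zcr_thresh)
    return flags
```

With a sampling rate of 8000 Hz, the default frame length of 200 samples corresponds to 25 ms frames with 50% overlap; a signal made of silence followed by a low-frequency tone would be flagged as unvoiced in the silent frames and voiced in the tone frames.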