Disfluency processing for cascaded speech translation involving English and Indian languages.
Disfluencies are common in spontaneous speech and can significantly reduce the accuracy of automatic speech translation when the spoken text is passed to translation unchanged. We address this issue with two approaches in our cascaded speech translation system. First, we identify and process the disfluencies in the spoken text before feeding the transcript to machine translation. Second, we train the machine translation system to be aware of disfluencies, so that it can accurately translate both fluent and disfluent texts. With the first approach, using a state-of-the-art disfluency identification system to preprocess disfluencies, we observe improvements of up to +2.39 BLEU points (or +0.90 COMET points) on our speech translation test set for translation from English to Bangla, Gujarati, Hindi, Kannada, Malayalam, Marathi, Tamil, and Telugu. For the second approach, we employ a synthetic disfluent-corpus creation algorithm to augment existing machine translation parallel corpora involving English and 11 Indian languages. A machine translation system trained on the augmented corpora can handle the disfluencies inherent in spoken text and produce accurate translations. On the same speech translation test set, this approach yields improvements of up to +1.90 BLEU points (or +0.60 COMET points) for translation from English to Bangla, Gujarati, Hindi, Kannada, Malayalam, Marathi, Tamil, and Telugu. [ABSTRACT FROM AUTHOR]
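The abstract does not specify how the synthetic disfluent corpora are created. As an illustration only, the following minimal Python sketch shows one common way such augmentation can be done: fillers and word repetitions are injected into the English source side of a parallel corpus while the target side is left fluent. The filler list, the probabilities, and the function names `add_disfluencies` and `augment` are all assumptions made for this example, not the authors' algorithm.

```python
import random

# Hypothetical filler inventory; the source does not list the fillers it uses.
FILLERS = ["uh", "um", "er", "hmm"]


def add_disfluencies(sentence, rng, p_filler=0.15, p_repeat=0.10):
    """Return a disfluent variant of `sentence`.

    Before each word, a filler is inserted with probability `p_filler`;
    after each word, the word is repeated once with probability `p_repeat`.
    Both probabilities are illustrative assumptions.
    """
    out = []
    for word in sentence.split():
        if rng.random() < p_filler:
            out.append(rng.choice(FILLERS))
        out.append(word)
        if rng.random() < p_repeat:
            out.append(word)
    return " ".join(out)


def augment(pairs, seed=0):
    """Keep every fluent (source, target) pair and add a disfluent-source copy.

    The target side is never modified, so a model trained on the result
    sees both fluent and disfluent inputs mapped to fluent translations.
    """
    rng = random.Random(seed)
    augmented = []
    for src, tgt in pairs:
        augmented.append((src, tgt))
        noisy = add_disfluencies(src, rng)
        if noisy != src:  # skip copies where no disfluency was injected
            augmented.append((noisy, tgt))
    return augmented
```

Calling `augment` on a list of (English, target-language) sentence pairs returns the original pairs plus, where a disfluency was injected, a copy with a disfluent source and the unchanged target. A realistic pipeline would probably also model other disfluency types, such as false starts and corrections, which this sketch omits.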