ISBN-13: 9783847372905 / English / Paperback / 2012 / 168 pp.
This thesis proposes a new approach to improving the performance of Tamil speech recognition using language models. Its main contribution is the development of language models that capture the co-occurrence patterns of partially free word order languages such as Tamil. The models use sub-word units (phonemes, syllables and morphemes) as their basic components. In addition, the thesis explains how different language models can be applied at different levels of error correction. Language models built on different sub-word units, sub-word unit features and contexts, designed to capture the characteristics of Tamil, are used to improve the error-correction rate of a Tamil speech recognition system. Because these models are based on sub-word units rather than words, they capture linguistic co-occurrences of the language that are independent of vocabulary. As a result, the error correction described in this work proceeds step by step over sub-word units and is itself vocabulary independent.
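To make the idea of vocabulary-independent, sub-word error correction concrete, here is a minimal sketch of a bigram model over sub-word units that ranks competing recognizer hypotheses. It is not the thesis's actual model: the class `SubwordBigramModel`, the function `pick_best`, the add-one smoothing, and the romanized toy syllable sequences are all assumptions chosen for illustration.

```python
import math
from collections import Counter


class SubwordBigramModel:
    """Add-one smoothed bigram model over arbitrary sub-word units.

    Hypothetical illustration: the units may be phonemes, syllables or
    morphemes; the model never looks at whole words.
    """

    def __init__(self, sequences):
        self.context_counts = Counter()  # how often each unit appears as a history
        self.bigram_counts = Counter()   # how often each (previous, current) pair occurs
        self.units = {"<s>", "</s>"}     # inventory of units, including boundary marks
        for seq in sequences:
            padded = ["<s>"] + list(seq) + ["</s>"]
            self.units.update(padded)
            for prev, cur in zip(padded, padded[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def log_prob(self, seq):
        """Return the smoothed log-probability of a sequence of units."""
        padded = ["<s>"] + list(seq) + ["</s>"]
        size = len(self.units)
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + size
            total += math.log(numerator / denominator)
        return total


def pick_best(model, candidates):
    """Choose the candidate unit sequence the model scores highest."""
    return max(candidates, key=model.log_prob)


# Toy training data: romanized syllable sequences (assumed, not from the thesis).
training = [
    ["va", "nak", "kam"],
    ["va", "nak", "kam"],
    ["nan", "ri"],
]
model = SubwordBigramModel(training)

# Two competing hypotheses for the same utterance, differing in syllable order.
candidates = [
    ["va", "kam", "nak"],
    ["va", "nak", "kam"],
]
best = pick_best(model, candidates)
```

Because the model scores sequences of units rather than dictionary entries, the same code works unchanged if the syllables are replaced by phonemes or morphemes, which is the sense in which such a correction step can be called vocabulary independent.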