Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems.
Positional, contextual and phonological features are proposed for representing the linguistic and production constraints of the...
Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the pros...
"Emotion Recognition Using Speech Features" provides coverage of emotion-specific features present in speech. The author also discusses suitable models for capturing emotion-specific information for distinguishing different emotions. The content of this book is important for designing and developing natural and sophisticated speech systems. In this Brief, Drs. Rao and Koolagudi lead a discussion of how emotion-specific information is embedded in speech and how to acquire emotion-specific knowledge using appropriate statistical models. Additionally, the authors provide information about...
"Emotion Recognition Using Speech Features" provides coverage of emotion-specific features present in speech. The author also discusses suitable model...
In this brief, the authors discuss recently explored spectral (sub-segmental and pitch synchronous) and prosodic (global and local features at word and syllable levels in different parts of the utterance) features for discerning emotions in a robust manner. The authors also delve into the complementary evidences obtained from excitation source, vocal tract system and prosodic features for the purpose of enhancing emotion recognition performance. Features based on speaking rate characteristics are explored with the help of multi-stage and hybrid models for further improving emotion recognition...
In this brief, the authors discuss recently explored spectral (sub-segmental and pitch synchronous) and prosodic (global and local features at word an...
Thisbook focuses on speech processing in the presence of low-bit rate coding and varying background environments. The methods presented in the book exploit the speech events which are robust in noisy environments. Accurate estimation of these crucial events will be useful for carrying out various speech tasks such as speech recognition, speaker recognition and speech rate modification in mobile environments. The authors provide insights into designing and developing robust methods to process the speech in mobile environments. Covering temporal and spectral enhancement methods to minimize the...
Thisbook focuses on speech processing in the presence of low-bit rate coding and varying background environments. The methods presented in the book ex...
This book discusses speaker recognition methods to deal with realistic variable noisy environments. The text covers authentication systems for; robust noisy background environments, functions in real time and incorporated in mobile devices. The book focuses on different approaches to enhance the accuracy of speaker recognition in presence of varying background environments. The authors examine: (a) Feature compensation using multiple background models, (b) Feature mapping using data-driven stochastic models, (c) Design of super vector- based GMM-SVM framework for robust speaker recognition,...
This book discusses speaker recognition methods to deal with realistic variable noisy environments. The text covers authentication systems for; robust...
K. Sreenivasa Rao Subrayal Medapati Reddy Sudhamay Maity
This book discusses the impact of spectral features extracted from frame level, glottal closure regions, and pitch-synchronous analysis on the performance of language identification systems. In addition to spectral features, the authors explore prosodic features such as intonation, rhythm, and stress features for discriminating the languages. They present how the proposed spectral and prosodic features capture the language specific information from two complementary aspects, showing how the development of language identification (LID) system using the combination of spectral and prosodic...
This book discusses the impact of spectral features extracted from frame level, glottal closure regions, and pitch-synchronous analysis on the perform...
This book discusses the contribution of excitation source information in discriminating language. The authors focus on the excitation source component of speech for enhancement of language identification (LID) performance. Language specific features are extracted using two different modes: (i) Implicit processing of linear prediction (LP) residual and (ii) Explicit parameterization of linear prediction residual. The book discusses how in implicit processing approach, excitation source features are derived from LP residual, Hilbert envelope (magnitude) of LP residual and Phase of LP residual;...
This book discusses the contribution of excitation source information in discriminating language. The authors focus on the excitation source component...
This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for SR...
This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation ...