While human listeners have little problems in dealing with the strong variation in spoken language, the same cannot be said about automatic speech recognition (ASR). This work compares recognition performance of man and machine with the aim of learning from the distinct errors between these two. Based on the differences, the signal processing mechanisms are analyzed that are suitable to increase the robustness of ASR. The comparison focuses on the influence of intrinsic variation of speech, i.e., changes in speaking rate, effort and style, as well as dialect and accent. The outcome of the...
While human listeners have little problems in dealing with the strong variation in spoken language, the same cannot be said about automatic speech rec...