ISBN-13: 9783639243598 / English / Paperback / 2010 / 132 pp.
This work deals with automatic dialogue act (DA) recognition in Czech and French. Its first main contribution is to propose and compare several approaches that recognize dialogue acts from three types of information: lexical, prosodic, and word-position. These approaches are tested on the Czech Railways dialogue act corpus. The experimental results confirm that each type of feature brings relevant and somewhat complementary information. The proposed methods that take word positions into account are especially interesting, as they capture global information about the structure of a sentence. One of the main issues in automatic dialogue act recognition is the design of a fast and cheap method for labeling new corpora. The second main contribution is to apply a general semi-supervised training approach based on the Expectation-Maximization algorithm to the task of labeling a new corpus with predefined DAs. We further propose to filter out incorrectly labeled examples using two confidence measures. Experimental results show that the proposed method is an interesting approach for creating new dialogue act corpora at low cost.
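To make the semi-supervised idea concrete, here is a minimal sketch of EM-style labeling with a confidence filter. It is not the book's implementation: the multinomial naive Bayes classifier, the single confidence measure (maximum posterior probability), the 0.8 threshold, and the function names `train_nb`, `posterior`, and `em_label` are all assumptions chosen for illustration, and the sentences are toy data.

```python
import math
from collections import defaultdict

def train_nb(examples, labels, vocab, alpha=1.0):
    """M-step: fit multinomial naive Bayes from (tokens, {label: weight}) pairs."""
    class_w = {c: alpha for c in labels}
    word_w = {c: defaultdict(float) for c in labels}
    total_w = {c: 0.0 for c in labels}
    for tokens, dist in examples:
        for c, w in dist.items():
            class_w[c] += w
            for t in tokens:
                word_w[c][t] += w
                total_w[c] += w
    z = sum(class_w.values())
    log_prior = {c: math.log(class_w[c] / z) for c in labels}
    size = len(vocab)
    log_lik = {
        c: {t: math.log((word_w[c][t] + alpha) / (total_w[c] + alpha * size))
            for t in vocab}
        for c in labels
    }
    return log_prior, log_lik

def posterior(tokens, model, labels):
    """E-step for one utterance: P(label | tokens) under the current model."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(log_lik[c][t] for t in tokens if t in log_lik[c])
        for c in labels
    }
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}

def em_label(labeled, unlabeled, labels, threshold=0.8, iterations=5):
    """Label `unlabeled` utterances; returns one (label, confidence) per utterance."""
    vocab = {t for toks, _ in labeled for t in toks}
    vocab |= {t for toks in unlabeled for t in toks}
    seed = [(toks, {lab: 1.0}) for toks, lab in labeled]
    model = train_nb(seed, labels, vocab)
    for _ in range(iterations):
        kept = []
        for toks in unlabeled:
            post = posterior(toks, model, labels)
            # Assumed confidence measure: keep only confidently labeled examples.
            if max(post.values()) >= threshold:
                kept.append((toks, post))
        # Re-estimate from hand-labeled data plus the kept soft-labeled examples.
        model = train_nb(seed + kept, labels, vocab)
    results = []
    for toks in unlabeled:
        post = posterior(toks, model, labels)
        best = max(post, key=post.get)
        results.append((best, post[best]))
    return results

if __name__ == "__main__":
    labels = ["question", "statement"]
    labeled = [
        ("when does the train leave".split(), "question"),
        ("the train leaves at noon".split(), "statement"),
    ]
    unlabeled = ["when does it leave".split(), "it leaves at noon".split()]
    for toks, (lab, conf) in zip(unlabeled, em_label(labeled, unlabeled, labels)):
        print(" ".join(toks), "->", lab, round(conf, 3))
```

Raising `threshold` keeps fewer automatically labeled utterances in each iteration, which trades coverage of the new corpus for cleaner labels.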