The increased use of video documents for multimedia-based applications has created a demand for strong video database support, including efficient methods for browsing, retrieving and classifying video data. Most solutions rely on visual information only, ignoring the rich source of the accompanying audio signal and texts. The semantic gap between mental and computational world is not yet surmounted by current algorithms and systems. Hence, there is a need for much research effort. Speech is the significant information that has a close connection to video contents. The closed caption text...
The increased use of video documents for multimedia-based applications has created a demand for strong video database support, including efficient met...