Bringing together experts in multimodal signal processing, this book provides a detailed introduction to the area, with a focus on the analysis, recognition and interpretation of human communication. The technology described has powerful applications. For instance, automatic analysis of the outputs of cameras and microphones in a meeting can make sense of what is happening – who spoke, what they said, whether there was an active discussion and who was dominant in it. These analyses are layered to move from basic interpretations of the signals to richer semantic information. The book covers the necessary analyses in a tutorial manner, going from basic ideas to recent research results. It includes chapters on advanced speech processing and computer vision technologies, language understanding, interaction modeling and abstraction, as well as meeting support technology. This guide connects fundamental research with a wide range of prototype applications to support and analyze group interactions in meetings.