This course offers an integrative exploration of how speech, gesture, and sign jointly shape meaning, cognition, and interaction across human and artificial agents (including language models, virtual avatars, and robots). The course equips participants with conceptual frameworks and practical tools to analyze multimodal communication, from spoken language and co-speech gesture to sign languages and other embodied signals, and to understand how these modalities interface with cognitive processes in real-world contexts.
Through a combination of theory, empirical research, and hands-on analysis, you will learn frameworks for describing multimodal structure and function, as well as methods for capturing and interpreting multimodal data in discourse. You will gain practical experience with publicly available tools such as MULTIDATA (an online, AI-based pipeline for studying multimodal communication), ELAN (a multimodal annotation tool), and Whisper (an AI-based transcription tool), as well as with state-of-the-art kinematic methods. Emphasis is placed on applying these tools and methods in research, education, and professional contexts.
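To give a concrete flavor of this kind of tool use, the sketch below shows AI-based transcription with the open-source Whisper Python package. It is a minimal illustration, not the course's prescribed workflow: the model size and the file name `audio.wav` are placeholders, and pipelines such as MULTIDATA may wrap this step differently.

```python
# Minimal sketch: transcribing a recording with the open-source Whisper package.
# Assumes `pip install openai-whisper`; audio.wav is a placeholder file name.
import whisper

# Load a small pretrained model; larger models ("medium", "large") trade speed for accuracy.
model = whisper.load_model("base")

# Transcribe the recording; word_timestamps=True requests per-word timing,
# which helps when aligning speech with gesture annotations later.
result = model.transcribe("audio.wav", word_timestamps=True)

print(result["text"])  # full transcript
for segment in result["segments"]:
    print(f'{segment["start"]:.2f}-{segment["end"]:.2f}: {segment["text"]}')
```

The segment-level timestamps printed here are the kind of timing information that can then be imported into an annotation tool such as ELAN for alignment with gesture tiers.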
Learning objectives
- Understand how speech, gesture, and sign jointly contribute to meaning, cognition, and interaction in human and artificial systems.
- Use theoretical frameworks to analyze the structure and function of multimodal communication in discourse.
- Collect, annotate, and interpret multimodal data using publicly available tools for studying multimodal communication (a minimal sketch of this workflow follows below).
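As a taste of the last objective, the sketch below reads annotations exported from ELAN using the third-party pympi-ling library. This is an illustrative assumption rather than course material: the file name `session.eaf` and the tier name `"Gesture"` are hypothetical placeholders.

```python
# Minimal sketch: inspecting ELAN annotations (.eaf) with the pympi-ling library.
# Assumes `pip install pympi-ling`; session.eaf and the "Gesture" tier are placeholders.
from pympi import Elan

eaf = Elan.Eaf("session.eaf")

# List the annotation tiers defined in the file (e.g. speech, gesture, gaze).
print(eaf.get_tier_names())

# Each annotation carries a start time, an end time (in milliseconds), and a value.
for ann in eaf.get_annotation_data_for_tier("Gesture"):
    start, end, value = ann[0], ann[1], ann[2]
    print(f"{start}-{end} ms: {value}")
```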