Modeling Human Communication Dynamics

NIPS Workshop

Friday, December 10th, 2010

Whistler, British Columbia, Canada

Location: Westin hotel, Alpine A


Face-to-face communication is a highly interactive process in which the participants mutually exchange and interpret verbal and nonverbal messages. Both the interpersonal dynamics and the dynamic interactions among an individual's perceptual, cognitive, and motor processes are swift and complex. How people accomplish these feats of coordination is a question of great scientific interest. Models of human communication dynamics also have much potential practical value, for applications including the understanding of communications problems such as autism and the creation of socially intelligent robots able to recognize, predict, and analyze verbal and nonverbal behaviors in real-time interaction with humans.

Modeling human communication dynamics brings exciting new problems and challenges to the NIPS community. The first goal of this workshop is to raise awareness in the machine learning community of these problems, including some application needs, the special properties of these input streams, and the modeling challenges. The second goal is to exchange information about methods, techniques, and algorithms suitable for modeling human communication dynamics. After the workshop, depending on interest, we may arrange to publish full-paper versions of selected submissions, possibly as a volume in the JMLR Workshop and Conference Proceedings series.


We therefore invite submissions of short high-quality papers describing research on Human Communication Dynamics and related topics.  Suitable themes include, but are not limited to:



7:30    Welcome

7:35    Invited talk: Dan Bohus, Microsoft Research

8:15    Invited talk: Marian Stewart Bartlett, University of California, San Diego

8:55    break

9:15    Teaser talks for posters

9:40    Invited talk: Jeff Bilmes, University of Washington

10:20   General discussion


10:30-15:30   lunch/poster preview


15:30   Invited talk: Noah Goodman, Stanford University

16:10   Poster session (12 posters, see below)

17:10   break

17:30   Invited talk: Justine Cassell, Carnegie Mellon University

18:10   General discussion

18:30   Adjourn


Invited Speakers

Dan Bohus, Microsoft Research

Title: Models for Multiparty Turn Taking in Situated Dialog

In this talk I will describe recent work on a computational framework for managing the turn-taking process in situated, multiparty dialog. The approach harnesses component models that leverage audio-visual evidence to track the multiparty conversational dynamics, to make floor control decisions, and to render these decisions into a set of coordinated verbal and non-verbal behaviors. I will review experiments which demonstrate how the approach enables an embodied conversational agent to participate in and shape the flow of multiparty dialog. Finally, I will discuss lessons learned from these experiments and highlight future challenges in this space.


Marian Stewart Bartlett, University of California, San Diego

Title: Modeling Natural Facial Behavior

This talk reviews recent research in my lab modeling natural facial expression with automated systems. Automated systems enable new research into expression dynamics that was previously infeasible with manual coding, or which would have required application of electrodes to the face, which can influence facial behavior. The talk first describes projects on measurement of dynamic coupling of facial behavior to measure spontaneous mimicry, as well as detection of deception. We show that facial mimicry correlates with the ability to detect when a person is lying. This had long been hypothesized by embodied theories of cognition but never previously shown. These findings were made possible by the use of novel computer vision techniques that allowed us to obtain rich quantitative information about facial dynamics. The talk next describes development of interventions for children with autism. The interventions employ computer vision systems to train facial expression production, provide practice in facial mimicry, and give immediate feedback on the child’s facial expressions. Finally, if time permits, I will review our work on children’s facial behavior during problem solving. Clustering techniques are employed to demonstrate differences in expression dynamics between older and younger children during problem solving.


Jeff Bilmes, University of Washington

Title: Social Speech Recognition

Humans are social beings and interact with and influence each other in subtle and complex ways, one of the most important of those being spoken discourse. While conversations between individuals might start out clumsy and as a result cognitively taxing, over time, they often become easier to ascertain because of an alignment process that eventually occurs. This process allows participants to more efficiently meet their communicative goals. Most large vocabulary speech recognition systems operate on speech corpora that originated from conversations between individuals. In the implementation of such systems, however, each side of a conversation is excised and processed independently from its original context. This eliminates any benefit that might be available from the cooperative nature of discourse.  In this work, a new approach is suggested for speech recognition where multiple sides of a conversation in a dialog or meeting are processed and decoded jointly rather than independently. We introduce a practical initial implementation of this approach that demonstrates both language model perplexity and speech recognition word error rate improvements in conversational telephone speech.


Noah Goodman, Stanford University

Title: Communicative inference as social cognition

I will describe an approach to capturing inferential aspects of human communication within a probabilistic framework for social cognition. I will apply this framework to a set of empirical results on pragmatics, natural pedagogy, and word learning.


Justine Cassell, Carnegie Mellon University

Title: Regulative or Constitutive Behaviors: Culture and Identity in Human Interaction

The phrase 'human communication dynamics' implies that the interplay between two or more people is essential to human communication. All too often, however, our analyses of human communication rely on snapshots from on-high of a single person's behavior, from which we infer communicative intention and dynamics among interlocutors. Taking examples from my recent research on face-to-face and online communication, I will try to highlight the dynamic and contingent nature of human communication and the ways in which we can take the interlocutor's perspective in order to better understand the dynamics of human communication.


Accepted Papers:



Organizers:

Louis-Philippe Morency (University of Southern California)
Daniel Gatica-Perez (IDIAP)
Nigel Ward (UTEP)

Previous conferences:

Predictive Models of Human Communication Dynamics, Los Angeles, August 4-6, 2010. Co-organized by Louis-Philippe Morency and Nigel Ward



PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning