A Multimodal Predictive Agent Model for Human Interaction Generation

Published on Jun 1, 2020 in CVPR (Computer Vision and Pattern Recognition)
· DOI: 10.1109/CVPRW50498.2020.00519
Murchana Baruah (Estimated H-index: 1), Bonny Banerjee (Estimated H-index: 11)
(U of M: University of Memphis)
References (19)
Monitoring using sensors is ubiquitous in our environment. In this paper, a state estimation model is proposed for continuous activity monitoring from multimodal and heterogeneous sensor data. Each sensor is modeled as an independent agent in the predictive coding framework. It can sample its environment, communicate with other agents, and adapt its internal model to its environment in an unsupervised manner. Using controlled experiments, we show that limitations of each sensor, such as inference...
3 Citations
#1 Hsu-kuang Chiu (Stanford University), H-Index: 6
#2 Ehsan Adeli (UNC: University of North Carolina at Chapel Hill), H-Index: 22
Last. Juan Carlos Niebles (Stanford University), H-Index: 31
(5 authors in total)
Forecasting human dynamics is a very interesting but challenging task with several prospective applications in robotics, health-care, among others. Researchers have recently developed methods for human pose forecasting; but unfortunately, they often introduce a number of simplification assumptions. For instance, previous work either focuses only on short-term or long-term predictions, while sacrificing one or the other. Furthermore, they use the activity labels as part of the training process an...
44 Citations
Sep 8, 2018 in ECCV (European Conference on Computer Vision)
#1 Liang-Yan Gui (CMU: Carnegie Mellon University), H-Index: 6
#2 Yu-Xiong Wang (CMU: Carnegie Mellon University), H-Index: 19
Last. Jose M. F. Moura (CMU: Carnegie Mellon University), H-Index: 81
(4 authors in total)
We explore an approach to forecasting human motion in a few milliseconds given an input 3D skeleton sequence based on a recurrent encoder-decoder framework. Current approaches suffer from the problem of prediction discontinuities and may fail to predict human-like motion in longer time horizons due to error accumulation. We address these critical issues by incorporating local geometric structure constraints and regularizing predictions with plausible temporal smoothness and continuity from a glo...
101 Citations
Jun 1, 2018 in CVPR (Computer Vision and Pattern Recognition)
#1 Chen Li (NUS: National University of Singapore), H-Index: 4
#2 Zhen Zhang (NUS: National University of Singapore), H-Index: 56
Last. Gim Hee Lee (NUS: National University of Singapore), H-Index: 30
(4 authors in total)
Human motion modeling is a classic problem in computer vision and graphics. Challenges in modeling human motion include high dimensional prediction as well as extremely complicated dynamics. We present a novel approach to human motion modeling based on convolutional neural networks (CNN). The hierarchical structure of CNN makes it capable of capturing both spatial and temporal correlations effectively. In our proposed approach, a convolutional long-term encoder is used to encode the whole given m...
103 Citations
May 21, 2018 in ICRA (International Conference on Robotics and Automation)
#1 Anirudh Vemula (CMU: Carnegie Mellon University), H-Index: 8
#2 Katharina Muelling (CMU: Carnegie Mellon University), H-Index: 11
Last. Jean Oh (CMU: Carnegie Mellon University), H-Index: 17
Robots that navigate through human crowds need to be able to plan safe, efficient, and human predictable trajectories. This is a particularly challenging problem as it requires the robot to predict future human trajectories within a crowd where everyone implicitly cooperates with each other to avoid collisions. Previous approaches to human trajectory prediction have modeled the interactions between humans as a function of proximity. However, that is not necessarily true as some people in our imm...
272 Citations
May 21, 2018 in ICRA (International Conference on Robotics and Automation)
#1 Judith Bütepage (KTH: Royal Institute of Technology), H-Index: 9
#2 Hedvig Kjellström (KTH: Royal Institute of Technology), H-Index: 22
Last. Danica Kragic (KTH: Royal Institute of Technology), H-Index: 66
Fluent and safe interactions of humans and robots require both partners to anticipate the others' actions. The bottleneck of most methods is the lack of an accurate model of natural human motion. In this work, we present a conditional variational autoencoder that is trained to predict a window of future human motion given a window of past frames. Using skeletal data obtained from RGB depth images, we show how this unsupervised approach can be used for online motion prediction for up to 1660 ms. ...
38 Citations
#1 Shamima Najnin (U of M: University of Memphis), H-Index: 7
#2 Bonny Banerjee (U of M: University of Memphis), H-Index: 11
Predictive coding has been hypothesized as a universal principle guiding the operation in different brain areas. In this paper, a predictive coding framework for a developmental agent with perception (audio), action (vocalization), and learning capabilities is proposed. The agent learns concurrently to plan optimally and the associations between sensory and motor parameters, by minimizing the sensory prediction error in an unsupervised manner. The proposed agent is solely driven by sens...
11 Citations
#1 Partha Ghosh, H-Index: 6
#2 Jie Song (ETH Zurich), H-Index: 10
Last. Otmar Hilliges (ETH Zurich), H-Index: 38
(4 authors in total)
We propose a new architecture for the learning of predictive spatio-temporal motion models from data alone. Our approach, dubbed the Dropout Autoencoder LSTM (DAELSTM), is capable of synthesizing natural looking motion sequences over long-time horizons without catastrophic drift or motion degradation. The model consists of two components, a 3-layer recurrent neural network to model temporal aspects and a novel autoencoder that is trained to implicitly recover the spatial structure of the human ...
102 Citations
#1 Jia Han (SUS: Shanghai University of Sport), H-Index: 11
#2 Gordon Waddington (UC: University of Canberra), H-Index: 30
Last. Yu Liu (SUS: Shanghai University of Sport), H-Index: 21
(5 authors in total)
To control movement, the brain has to integrate proprioceptive information from a variety of mechanoreceptors. The role of proprioception in daily activities, exercise, and sports has been extensively investigated, using different techniques, yet the proprioceptive mechanisms underlying human movement control are still unclear. In the current work we have reviewed understanding of proprioception and the three testing methods: threshold to detection of passive motion, joint position repr...
176 Citations
#1 Jayanta K. Dutta (U of M: University of Memphis), H-Index: 7
#2 Bonny Banerjee, H-Index: 11
Last. Chandan K. Reddy (WSU: Wayne State University), H-Index: 28
Outlier detection has been an active area of research for a few decades. We propose a new definition of outlier that is useful for high-dimensional data. According to this definition, given a dictionary of atoms learned using the sparse coding objective, the outlierness of a data point depends jointly on two factors: the frequency of each atom in reconstructing all data points (or its negative log activity ratio, NLAR) and the strength by which it is used in reconstructing the current point. A R...
12 Citations
Cited By (2)
Round-the-clock monitoring of human behavior and emotions is required in many healthcare applications. Such monitoring is very expensive but can be automated using machine learning (ML) and sensor technologies. Unfortunately, the lack of infrastructure for collection and sharing of such data is a bottleneck for ML research applied to healthcare. Our goal is to circumvent this bottleneck by simulating a human body in a virtual environment. This will allow generation of potentially infinite amounts of shareabl...
1 Citation