FuturePose: real-time human motion forecasting using an RGB camera

We propose a novel mixed reality martial arts training system using deep-learning-based real-time human pose forecasting.

Our training system is based on 3D pose estimation using a residual neural network with input from an RGB camera, which captures the motion of a trainer. A student wearing a head-mounted display can see a virtual model of the trainer together with the trainer's forecasted future pose. The pose forecasting is based on recurrent networks. To improve learning of the motion's temporal features, we use a special lattice optical flow method to estimate joint movement. We visualize the real-time human motion with a generated human model, while the forecasted pose is shown as a red skeleton model. In our experiments, we evaluated the performance of our system when predicting 15 frames ahead in a 30-fps video (0.5 s forecasting). The accuracies were acceptable: they equal or even outperform some methods that use depth IR cameras or fabric technologies. User studies showed that our system helps beginners understand martial arts and that it is comfortable to use, since the motions are captured by an RGB camera.
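To make the forecasting task concrete, the following is a minimal sketch of what "predicting 15 frames ahead in a 30-fps video" means numerically. It is not the recurrent network or the lattice optical flow method described above. It is a naive constant-velocity baseline, and the names `horizon_seconds` and `forecast_constant_velocity` are hypothetical, introduced only for illustration.

```python
from typing import List, Tuple

# A joint is an (x, y, z) position; a pose is a list of joints.
# These type aliases are assumptions made for this sketch.
Joint = Tuple[float, float, float]
Pose = List[Joint]


def horizon_seconds(frames_ahead: int, fps: float) -> float:
    """Convert a forecasting horizon in frames to seconds."""
    return frames_ahead / fps


def forecast_constant_velocity(history: List[Pose], frames_ahead: int) -> Pose:
    """Naive baseline (NOT the method in the abstract): extrapolate each
    joint linearly from its displacement over the last two frames."""
    if len(history) < 2:
        raise ValueError("need at least two past poses to estimate velocity")
    prev, last = history[-2], history[-1]
    forecast: Pose = []
    for (px, py, pz), (lx, ly, lz) in zip(prev, last):
        # Per-frame displacement acts as the joint's velocity estimate.
        forecast.append((
            lx + frames_ahead * (lx - px),
            ly + frames_ahead * (ly - py),
            lz + frames_ahead * (lz - pz),
        ))
    return forecast
```

For example, `horizon_seconds(15, 30)` gives the 0.5 s horizon quoted above. A learned forecaster would replace the linear extrapolation with a model that takes a longer pose history as input; this baseline only shows the input and output shapes of the problem.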


  • Erwin Wu, Hideki Koike: FuturePose - Mixed Reality Martial Arts Training Using Real-Time 3D Human Pose Forecasting With a RGB Camera, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp.1384-1392, 2019. (PDF)
  • Erwin Wu, Hideki Koike: Real-time Human Motion Forecasting using a RGB Camera, The First IEEE Workshop on Human Augmentation and Its Applications (HAA2019), 2019. To be published.