TY - GEN
T1 - An immersive system with multi-modal human-computer interaction
AU - Zhao, Rui
AU - Wang, Kang
AU - Divekar, Rahul
AU - Rouhani, Robert
AU - Su, Hui
AU - Ji, Qiang
PY - 2018
Y1 - 2018
N2 - We introduce an immersive system prototype that integrates face, gesture, and speech recognition techniques to support multi-modal human-computer interaction. Embedded in an indoor room setting, a multi-camera system monitors the user's facial behavior, body gestures, and spatial location in the room. A server fuses the different sensor inputs in a time-sensitive manner so that the system knows who is doing what, and where, in real time. When correlated with speech input, the system can better understand the user's intention for interaction purposes. We evaluate the performance of the core recognition techniques on both benchmark and self-collected datasets and demonstrate the benefit of the system in various use cases.
AB - We introduce an immersive system prototype that integrates face, gesture, and speech recognition techniques to support multi-modal human-computer interaction. Embedded in an indoor room setting, a multi-camera system monitors the user's facial behavior, body gestures, and spatial location in the room. A server fuses the different sensor inputs in a time-sensitive manner so that the system knows who is doing what, and where, in real time. When correlated with speech input, the system can better understand the user's intention for interaction purposes. We evaluate the performance of the core recognition techniques on both benchmark and self-collected datasets and demonstrate the benefit of the system in various use cases.
UR - https://dx.doi.org/10.1109/FG.2018.00083
U2 - 10.1109/fg.2018.00083
DO - 10.1109/fg.2018.00083
M3 - Conference contribution
BT - 13th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2018
ER -