THESIS
2020
xvii, 123 pages : illustrations ; 30 cm
Abstract
Analyzing human behaviors in videos has great value for various applications, such
as education, communication, sports, and surveillance. For example, analyzing students'
engagement in classroom videos can help teachers improve their teaching, and analyzing
speakers' presentation skills in public speech videos can better facilitate presentation
skills training. However, manually digesting and analyzing human behaviors in videos is
very time-consuming, especially when users need to conduct detailed analysis, such
as dynamic behavior comparison and behavior evolution exploration. Therefore, recent
research has proposed automated video analysis techniques to facilitate this process, such
as face detection, emotion recognition, pose estimation, and action recognition. Although
these techniques have demonstrated promising performance in extracting human behaviors,
they are insufficient in real-world settings to support detailed analysis involving varied
analytical tasks. To this end, visual analytics has been applied to effectively analyze huge
information spaces, support data exploration, and facilitate decision-making, which sheds
light on helping users interactively explore and analyze video data.
In this thesis, we propose three novel interactive visual analytics systems that combine automated video analysis techniques with human-centered visualizations to help users
explore and analyze video data. In our first work, we propose EmotionCues, a visual analytics
system that integrates emotion recognition algorithms with visualizations to support
the analysis of classroom videos in terms of both emotion summaries and detailed analysis.
In particular, the system supports the visual analysis of classroom videos at two
different levels of granularity, namely, the overall emotion evolution patterns of all the
people involved and the detailed visualization of an individual's emotions. In the second
work, considering the multi-modality of video data, we propose EmoCo, an interactive
visual analytics system to facilitate the fine-grained analysis of emotion coherence across
face, text, and audio modalities in presentation videos. By developing suitable interactive
visualizations enhanced with new features, the system allows users to conduct
in-depth exploration of emotions at three levels of detail (i.e., the video, sentence, and
word levels). In the third work, we focus on visualizing hand movements in videos and propose
GestureLens, a visual analytics system to help users explore and analyze gesture usage in
presentation videos. It enables users to gain a quick spatial and temporal overview of
gestures, as well as to conduct both content-based and gesture-based explorations. Both
real-world case studies and feedback from collaborating domain experts verify the
effectiveness and usefulness of all the proposed systems.