Contributions into holistic human action recognition.
Toudjeu, Tchangou Ignance.
In this thesis we holistically investigate the interpretation of human actions in both still images and videos. Human action recognition is currently a research problem of great interest in both academia and industry because of its potential applications, which include security surveillance, sports annotation, human-computer interaction, and robotics. Action recognition is the process of labelling actions using sensory observations, where an action can be defined as a sequence of movements produced by a human while executing a task. When it relies on visual observations, this process is challenging and must cope with background clutter, shadows, illumination variations, occlusions, changes in scale, changes in the person performing the action, and viewpoint variations. Although many approaches to developing human action recognition systems have been proposed in the literature, they have focused more on recognition accuracy than on the computational complexity of the recognition process. A human action recognition system that is both effective and efficient, and that can operate in real time, is therefore still needed. Firstly, we review, evaluate and compare the most prominent state-of-the-art feature extraction representations, categorized into handcrafted-feature-based techniques and deep-learning-feature-based techniques. Secondly, we propose holistic approaches in each of the categories. The first holistic approach takes advantage of existing slope patterns in motion history images, which are a simple two-dimensional representation of video, and reduces the running time of action recognition. The second, based on circular derivative local binary patterns, outperforms the LBP-based state-of-the-art techniques and addresses the issue of dimensionality by producing a feature descriptor of minimal dimension with little compromise on recognition accuracy.
The third one introduces a preprocessing step in a proposed 2D convolutional neural network to deal with the same dimensionality issue in a different way, within deep learning techniques. Here the temporal dimension is embedded into motion history images, which are then learned by a two-dimensional convolutional neural network. Thirdly, three datasets (JAFFE, KTH and the Pedestrian Action dataset) were used to validate the proposed human action recognition models. Finally, we show that holistic feature-based techniques can achieve better performance than the state-of-the-art methods.
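To illustrate how a motion history image can embed the temporal dimension of a video into a single two-dimensional array, the following is a minimal sketch of the commonly used update rule based on frame differencing. It is not necessarily the formulation used in the thesis: the function name, the decay duration `tau`, and the threshold `diff_threshold` are assumptions made for illustration only.

```python
import numpy as np


def motion_history_image(frames, tau=None, diff_threshold=30):
    """Collapse a grayscale clip of shape (T, H, W) into one 2D motion history image.

    Hypothetical helper for illustration; tau and diff_threshold are assumed
    parameters, not values taken from the thesis.
    """
    frames = np.asarray(frames, dtype=np.int32)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames with shape (T, H, W)")
    if tau is None:
        tau = frames.shape[0] - 1  # one decay step per frame transition

    mhi = np.zeros(frames.shape[1:], dtype=np.float32)
    for prev, curr in zip(frames[:-1], frames[1:]):
        # A pixel counts as moving when its intensity changes by at least the threshold.
        moving = np.abs(curr - prev) >= diff_threshold
        # Moving pixels are reset to tau; all other pixels decay by one, floored at zero.
        decayed = np.maximum(mhi - 1.0, 0.0)
        mhi = np.where(moving, float(tau), decayed).astype(np.float32)

    # Scale to [0, 1] so that brighter pixels correspond to more recent motion.
    return mhi / float(tau)
```

The resulting array has the same height and width as a single frame, so it can be passed to a two-dimensional convolutional network as an ordinary single-channel image, with recent motion encoded as higher intensity.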