Facial expression recognition and intensity estimation.
Ekundayo, Olufisayo Sunday.
MetadataShow full item record
Facial Expression is one of the profound non-verbal channels through which human emotion state is inferred from the deformation or movement of face components when facial muscles are activated. Facial Expression Recognition (FER) is one of the relevant research fields in Computer Vision (CV) and Human-Computer Interraction (HCI). Its application is not limited to: robotics, game, medical, education, security and marketing. FER consists of a wealth of information. Categorising the information into primary emotion states only limit its performance. This thesis considers investigating an approach that simultaneously predicts the emotional state of facial expression images and the corresponding degree of intensity. The task also extends to resolving FER ambiguous nature and annotation inconsistencies with a label distribution learning method that considers correlation among data. We first proposed a multi-label approach for FER and its intensity estimation using advanced machine learning techniques. According to our findings, this approach has not been considered for emotion and intensity estimation in the field before. The approach used problem transformation to present FER as a multilabel task, such that every facial expression image has unique emotion information alongside the corresponding degree of intensity at which the emotion is displayed. A Convolutional Neural Network (CNN) with a sigmoid function at the final layer is the classifier for the model. The model termed ML-CNN (Multilabel Convolutional Neural Network) successfully achieve concurrent prediction of emotion and intensity estimation. ML-CNN prediction is challenged with overfitting and intraclass and interclass variations. We employ Visual Geometric Graphics-16 (VGG-16) pretrained network to resolve the overfitting challenge and the aggregation of island loss and binary cross-entropy loss to minimise the effect of intraclass and interclass variations. The enhanced ML-CNN model shows promising results and outstanding performance than other standard multilabel algorithms. Finally, we approach data annotation inconsistency and ambiguity in FER data using isomap manifold learning with Graph Convolutional Networks (GCN). The GCN uses the distance along the isomap manifold as the edge weight, which appropriately models the similarity between adjacent nodes for emotion predictions. The proposed method produces a promising result in comparison with the state-of-the-art methods.