Research Article Open Access

Multimodal Face Expression Recognition Using Parametric Exponential Linear Unit-Long Short-Term Memory

Kampa Ratna Babu 1, Akula Suneetha 2 and Kampa Kanthi Kumar 3
  • 1 Department of Computer Engineering, M.B.T.S. Government Polytechnic, Guntur, India
  • 2 Department of Computer Science and Engineering, KKR and KSR Institute of Technology and Sciences, Guntur, India
  • 3 Department of Electronics and Communication Engineering, Tirumala Engineering College, Narasaraopet, India

Abstract

Multimodal facial expression recognition combines information from the audio and video modalities to improve accuracy and robustness. By integrating different data sources, multimodal systems capture different aspects of human expression. However, accurately recognizing facial expressions across audio and video modalities is challenging because the same expression is represented differently in each modality. In this research, Parametric Exponential Linear Unit-Long Short-Term Memory (PELU-LSTM) is proposed to accurately recognize multimodal facial expressions. The SAVEE dataset, which contains audio and video recordings, is used to evaluate the performance of the proposed method. In pre-processing, a Wiener filter is applied to the audio to reduce background noise, while a Gaussian Weighting Function (GWF) is employed to aggregate the entire video into a smaller number of frames while limiting information loss. The Mel-Frequency Cepstral Coefficient (MFCC) is used to extract audio features, while the Histogram of Gradient (HOG) and Local Binary Pattern (LBP) are employed to extract the video features. The audio and video features are then concatenated into a single fused feature vector. Finally, PELU-LSTM classifies the emotional expression from the fused features. The proposed technique achieves an accuracy of 99.75%, which compares favorably with existing techniques such as Bi-directional LSTM-Convolution Neural Networks (Bi-LSTM-CNN), attention-based 2D CNN with LSTM, and K-means clustering-based Kernel Canonical Correlation Analysis (KMKCCA).
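For readers unfamiliar with the activation named in the title, the following is a minimal sketch of the Parametric Exponential Linear Unit as it is commonly defined in the literature: f(x) = (a/b)x for x >= 0 and f(x) = a(exp(x/b) - 1) for x < 0, with a, b > 0. The abstract does not state how PELU is wired into the LSTM, so the function name `pelu` and the fixed parameter values below are illustrative assumptions only; in a trained network, a and b would be learned.

```python
import math


def pelu(x, a=1.0, b=1.0):
    """Parametric Exponential Linear Unit (scalar version).

    a and b are positive shape parameters. They are fixed here for
    illustration (an assumption); in practice they are learned.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    if x >= 0:
        # Linear branch with slope a / b.
        return (a / b) * x
    # Exponential branch, saturating at -a for very negative x.
    return a * (math.exp(x / b) - 1.0)


if __name__ == "__main__":
    # With a = b = 1, PELU reduces to the standard ELU.
    for value in (-2.0, -0.5, 0.0, 1.5):
        print(value, pelu(value), pelu(value, a=2.0, b=0.5))
```

The two parameters let the slope of the positive part and the saturation level and decay rate of the negative part vary, which is the property that distinguishes PELU from a fixed ELU.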

Journal of Computer Science
Volume 20 No. 10, 2024, 1339-1348

DOI: https://doi.org/10.3844/jcssp.2024.1339.1348

Submitted On: 17 May 2024 Published On: 17 August 2024

How to Cite: Babu, K. R., Suneetha, A. & Kumar, K. K. (2024). Multimodal Face Expression Recognition Using Parametric Exponential Linear Unit-Long Short-Term Memory. Journal of Computer Science, 20(10), 1339-1348. https://doi.org/10.3844/jcssp.2024.1339.1348

Keywords

  • Gaussian Weighting Function
  • Histogram of Gradient
  • Long Short-Term Memory
  • Parametric Exponential Linear Unit
  • Wiener Filter