Head pose healthiness prediction using a novel image quality based stacked autoencoder

Nejkovic V., ÖZTÜRK M. M., Petrovic N.

Digital Signal Processing: A Review Journal, vol.130, 2022 (SCI-Expanded)

  • Publication Type: Article / Article
  • Volume: 130
  • Publication Date: 2022
  • DOI Number: 10.1016/j.dsp.2022.103696
  • Journal Name: Digital Signal Processing: A Review Journal
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC
  • Keywords: Head pose healthiness prediction, Image quality, Ontology, Pose estimation, Stacked autoencoder
  • Süleyman Demirel University Affiliated: Yes


© 2022 Elsevier Inc. This paper introduces an approach for determining the head pose healthiness of computer users. The main contributions are: 1) an Image Quality Assessment (IQA) based Stacked Autoencoder (IQASAE), which adjusts the learning rate according to the quality of the images; 2) a Head Pose Healthiness Prediction (HPHP) framework, which combines the proposed IQASAE algorithm with image processing operations; 3) a set of features suitable for face analysis applications; 4) an ontology-driven semantic framework that allows pose estimation results to be used within applications together with healthcare experts' domain knowledge about pose healthiness. The framework was evaluated on both offline (BIWI and AFLW) and online (our own, collected using Arduino) datasets, and compared with several state-of-the-art methods: Multi-Layer Perceptron (MLP), CART, Random Forest, Convolutional Neural Networks (CNN), Temporal Deep Learning Model (TDLM), a hybrid CNN with Support Vector Machine (SVM), Quatnet and Trinet. In the experiments it reaches an accuracy of up to 79.63%, outperforming all of them except Quatnet and Trinet. Nevertheless, IQASAE has three main advantages over the state-of-the-art methods: 1) it does not require feature selection, so processing time is reduced; 2) using the angle between the chin and the mouth reduces the training time of the SAE; 3) using a vector-based feature set to create the training data yielded a significant improvement, especially on offline facial images.
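
The abstract does not state how IQASAE maps image quality to a learning rate, so the following is only an illustrative sketch of the general idea rather than the authors' method. It assumes a hypothetical quality proxy (variance of a discrete Laplacian, squashed into [0, 1)) and a hypothetical linear scaling rule; the names `sharpness_score`, `quality_scaled_lr`, and the `floor` parameter are invented for this example.

```python
import numpy as np


def sharpness_score(image: np.ndarray) -> float:
    """Hypothetical quality proxy (not the paper's IQA metric).

    Computes the variance of a 4-neighbour Laplacian over the image
    interior and squashes it into [0, 1): flat images score 0,
    images with strong local detail score close to 1.
    """
    img = image.astype(np.float64)
    laplacian = (
        img[:-2, 1:-1] + img[2:, 1:-1]      # pixels above and below
        + img[1:-1, :-2] + img[1:-1, 2:]    # pixels left and right
        - 4.0 * img[1:-1, 1:-1]             # centre pixel
    )
    variance = laplacian.var()
    return float(variance / (variance + 1.0))


def quality_scaled_lr(base_lr: float, quality: float, floor: float = 0.1) -> float:
    """Assumed rule: scale base_lr linearly with quality.

    quality = 0 gives floor * base_lr, quality = 1 gives base_lr.
    Values outside [0, 1] are clamped.
    """
    q = min(max(quality, 0.0), 1.0)
    return base_lr * (floor + (1.0 - floor) * q)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    flat = np.full((32, 32), 128, dtype=np.uint8)         # no detail at all
    detailed = rng.integers(0, 256, size=(32, 32))        # strong local variation
    for name, img in (("flat", flat), ("detailed", detailed)):
        q = sharpness_score(img)
        print(name, round(q, 3), quality_scaled_lr(0.01, q))
```

Under this assumed rule, low-quality images contribute smaller weight updates during autoencoder training, while high-quality images are trained on at close to the base rate; a real implementation would substitute the IQA measure and schedule described in the paper.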