An efficient malware detection approach with feature weighting based on Harris Hawks optimization


Alzubi O. A., Alzubi J. A., Al-Zoubi A. M., Hassonah M. A., KÖSE U.

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, vol.25, no.4, pp.2369-2387, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 25 Issue: 4
  • Publication Date: 2022
  • Doi Number: 10.1007/s10586-021-03459-1
  • Journal Name: CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC
  • Page Numbers: pp.2369-2387
  • Keywords: Machine learning, Security, Android malware detection, Harris Hawks optimization, Support vector machine, Feature weighting
  • Süleyman Demirel University Affiliated: Yes

Abstract

This paper introduces and tests a novel machine learning approach for detecting Android malware. The proposed approach combines a Support Vector Machine (SVM) classifier with the Harris Hawks Optimization (HHO) algorithm. More specifically, HHO optimizes the SVM hyperparameters and produces the optimal solution for weighting the features, while the SVM classifies malware using the best model found. The effectiveness of the proposed approach and its ability to improve detection performance are demonstrated through experiments on sampled datasets drawn from CICMalAnal2017. We test the method and its robustness on five sampled datasets, where it achieves the best results on most datasets and measures compared with other approaches. We also illustrate the ability of the proposed approach to measure the significance of each feature. In addition, we provide an in-depth analysis of possible relationships between the weighted features and the type of malware attack. The results show that the proposed approach outperforms other metaheuristic algorithms and state-of-the-art classifiers.
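
To make the idea of jointly searching hyperparameters and feature weights more concrete, the following is a minimal sketch of how a single candidate solution might be decoded. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the solution encoding, the kernel, or the search ranges, so the layout (two hyperparameter slots followed by one weight per feature), the RBF kernel, the names `decode` and `weighted_rbf`, and the ranges `C_RANGE` and `GAMMA_RANGE` are all hypothetical.

```python
import math

# Hypothetical search ranges; the abstract does not state them.
C_RANGE = (0.01, 1000.0)
GAMMA_RANGE = (0.0001, 10.0)


def decode(position, n_features):
    """Map a candidate in [0, 1]^(n_features + 2) to (C, gamma, weights).

    Assumed layout: position[0] -> C, position[1] -> gamma,
    position[2:] -> one weight per feature.
    """
    if len(position) != n_features + 2:
        raise ValueError("position must have n_features + 2 entries")
    # Clip to [0, 1] so out-of-bounds candidates remain valid.
    clipped = [min(1.0, max(0.0, p)) for p in position]
    c = C_RANGE[0] + clipped[0] * (C_RANGE[1] - C_RANGE[0])
    gamma = GAMMA_RANGE[0] + clipped[1] * (GAMMA_RANGE[1] - GAMMA_RANGE[0])
    weights = clipped[2:]
    return c, gamma, weights


def weighted_rbf(x, y, gamma, weights):
    """RBF kernel computed on features scaled by their weights."""
    sq_dist = sum((w * (a - b)) ** 2 for w, a, b in zip(weights, x, y))
    return math.exp(-gamma * sq_dist)


if __name__ == "__main__":
    c, gamma, weights = decode([0.5, 0.1, 1.0, 0.0, 0.3], n_features=3)
    print(c, gamma, weights)
    print(weighted_rbf([1.0, 2.0, 3.0], [1.5, 9.0, 3.0], gamma, weights))
```

In a wrapper scheme of this kind, an optimizer such as HHO would propose many such positions and score each one by the validation performance of an SVM trained with the decoded values. A weight near zero effectively removes a feature from the kernel, which is one way the final weights could be read as a measure of feature significance.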