COMPLEXFUZZY: NOVEL CLUSTERING METHOD FOR SELECTING TRAINING INSTANCES OF CROSS-PROJECT DEFECT PREDICTION


ÖZTÜRK M. M.

COMPUTER SCIENCE-AGH, vol.22, no.1, pp.3-37, 2021 (Journal Indexed in ESCI)

  • Publication Type: Article / Full Article
  • Volume: 22 Issue: 1
  • Publication Date: 2021
  • DOI Number: 10.7494/csci.2021.22.1.3743
  • Journal Name: COMPUTER SCIENCE-AGH
  • Page Numbers: pp.3-37

Abstract

Over the last decade, researchers have investigated to what extent cross-project defect prediction (CPDP) offers advantages over traditional defect prediction settings. In these works, the training and testing data are not taken from the same project; instead, dissimilar projects are employed. Selecting proper training data plays an important role in the success of CPDP. In this study, a novel clustering method called complexFuzzy is presented for selecting the training data of CPDP. The method reveals the most defective instances that the experimental predictors exploit in order to complete the training. To that end, a fuzzy-based membership is constructed on the data sets. Hence, overfitting (which is a crucial problem in CPDP training) is alleviated. The performance of complexFuzzy is compared to that of 5 counterparts on 29 data sets using 4 classifiers. According to the obtained results, complexFuzzy is superior to the other clustering methods in CPDP performance.
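
The abstract does not give the details of complexFuzzy, so the following is only a generic illustration of what a fuzzy membership over instances can look like. It is not the author's method. It uses the standard fuzzy c-means membership formula and assumes the cluster centers are already known. The function names, the fuzzifier m, the threshold, and the notion of a "defective" cluster index are all hypothetical choices made for this sketch.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster.

    Assumption: `centers` are given; m > 1 is the usual fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]

def select_training_instances(instances, centers, defective_index,
                              threshold=0.6, m=2.0):
    """Keep instances whose membership in a chosen cluster reaches threshold.

    Hypothetical selection rule for illustration only.
    """
    return [
        x for x in instances
        if fuzzy_memberships(x, centers, m)[defective_index] >= threshold
    ]

if __name__ == "__main__":
    # Toy two-feature instances and two assumed cluster centers.
    centers = [(0.0, 0.0), (10.0, 10.0)]
    instances = [(1.0, 1.0), (9.0, 9.0), (5.0, 5.0)]
    print(select_training_instances(instances, centers, defective_index=1))
```

In this toy setup, instances close to the second center receive a high membership in it and are kept as candidate training data, while instances near the first center or halfway between the two are filtered out by the threshold.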