Improving Ranking-Oriented Defect Prediction Using a Cost-Sensitive Ranking SVM
Wuhan University, CHN.
City University of Hong Kong, HKG.
Hong Kong Polytechnic University, HKG.
Hong Kong Polytechnic University, HKG.
2020 (English). In: IEEE Transactions on Reliability, ISSN 0018-9529, E-ISSN 1558-1721, Vol. 69, no. 1, p. 139-153, article id 8809898. Article in journal (Refereed). Published.
Abstract [en]

Context: Ranking-oriented defect prediction (RODP) ranks software modules by their predicted number of defects so that limited testing resources can be allocated accordingly. Most RODP methods overlook two issues. First, misranking a module with more defects means that fewer testing resources are allocated to it, which makes it difficult to find all of its defects and results in much higher costs than misranking a module with fewer defects. Second, the numbers of defects in software modules are highly imbalanced in defect datasets. Cost-sensitive learning is an effective technique for handling both the cost issue and the data imbalance problem in software defect prediction. However, its effectiveness has not been investigated in RODP models.
Aims: In this article, we propose a cost-sensitive ranking support vector machine (SVM) algorithm, CSRankSVM, to improve the performance of RODP models.
Method: CSRankSVM modifies the loss function of the ranking SVM algorithm by adding two penalty parameters to address both the cost issue and the data imbalance problem. Additionally, the loss function of CSRankSVM is optimized using a genetic algorithm.
Results: The experimental results for 11 project datasets with 41 releases show that CSRankSVM achieves 1.12%-15.68% higher average fault percentile average (FPA) values than the five existing RODP methods (i.e., decision tree regression, linear regression, Bayesian ridge regression, ranking SVM, and learning-to-rank (LTR)) and 1.08%-15.74% higher average FPA values than the four data imbalance learning methods (i.e., random undersampling and the synthetic minority oversampling technique, two data resampling methods; RankBoost, an ensemble learning method; and IRSVM, a cost-sensitive ranking SVM method for information retrieval).
Conclusion: CSRankSVM is capable of handling the cost issue and the data imbalance problem in RODP methods and achieves better performance. Therefore, CSRankSVM is recommended as an effective method for RODP. © 1963-2012 IEEE.
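
The fault percentile average (FPA) reported in the abstract can be illustrated with a short sketch. It assumes the commonly used definition of FPA: rank the K modules by predicted defect count, then average, over m = 1..K, the share of all actual defects contained in the top m modules. The function name and the sample numbers are hypothetical and are not taken from the article.

```python
def fault_percentile_average(predicted, actual):
    """Return the FPA of a ranking.

    predicted: predicted defect counts (or any ranking scores), one per module.
    actual:    actual defect counts, in the same module order.
    Assumed definition: the mean, over m = 1..K, of the fraction of all
    actual defects found in the m top-ranked modules.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    total = sum(actual)
    if total == 0:
        raise ValueError("FPA is undefined when there are no defects")

    # Rank modules from highest to lowest predicted defect count.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)

    found = 0          # defects covered by the modules inspected so far
    accumulated = 0.0  # running sum of the top-m defect fractions
    for i in order:
        found += actual[i]
        accumulated += found / total
    return accumulated / len(order)


# Hypothetical data: four modules with 0, 5, 1 and 4 actual defects.
actual_defects = [0, 5, 1, 4]
perfect_ranking = fault_percentile_average([0, 5, 1, 4], actual_defects)
reversed_ranking = fault_percentile_average([3, 0, 2, 1], actual_defects)
```

Under this definition a ranking that places heavily defective modules first scores higher, which is why misranking a module with many defects is more costly than misranking one with few.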

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2020. Vol. 69, no. 1, p. 139-153, article id 8809898
Keywords [en]
Cost-sensitive learning, data imbalance, ranking-oriented defect prediction (RODP), Decision trees, Defects, Forecasting, Genetic algorithms, Learning systems, Regression analysis, Software testing, Support vector machines, Trees (mathematics), Decision tree regression, Defect prediction, Fault percentile averages, Random under samplings, Ranking support vector machines (SVM), Software defect prediction, Synthetic minority over-sampling techniques, Learning to rank
National Category
Software Engineering; Computer Sciences
Identifiers
URN: urn:nbn:se:bth-19344
DOI: 10.1109/TR.2019.2931559
ISI: 000526289100010
Scopus ID: 2-s2.0-85081690027
OAI: oai:DiVA.org:bth-19344
DiVA, id: diva2:1420490
Available from: 2020-03-31. Created: 2020-03-31. Last updated: 2020-04-30. Bibliographically approved.

Authority records

Bennin, Kwabena Ebo
