A Drift Propensity Detection Technique to Improve the Performance for Cross-Version Software Defect Prediction
City University of Hong Kong, HKG.
City University of Hong Kong, HKG.
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering. ORCID iD: 0000-0001-9140-9271
City University of Hong Kong, HKG.
2020 (English). In: Proceedings - 2020 IEEE 44th Annual Computers, Software, and Applications Conference, COMPSAC 2020 / [ed] Chan W.K., Claycomb B., Takakura H., Yang J.-J., Teranishi Y., Towey D., Segura S., Shahriar H., Reisman S., Ahamed S.I., Institute of Electrical and Electronics Engineers Inc., 2020, p. 882-891, article id 9202527. Conference paper, Published paper (Refereed)
Abstract [en]

In cross-version defect prediction (CVDP), historical data is derived from the prior version of the same project to predict defects of the current version. Recent studies in CVDP focus on subset selection to deal with changes in the data distributions. No prior study has focused on training data arriving in streaming fashion across versions, where significant differences between versions make the prediction unreliable. We refer to this situation as Drift Propensity (DP). By identifying DP, necessary steps can be taken (e.g., updating or retraining the model) to improve the prediction performance. In this paper, we investigate chronological defect datasets and identify DP in them. The no-memory data management technique is employed to manage the data distributions, and a DP detection technique is proposed. The idea behind the proposed DP detection technique is to monitor the algorithm's error-rate. The DP detector triggers DP, warning, and control flags so that the necessary steps can be taken. The proposed technique is significantly superior in identifying the distribution differences (p-value < 0.05). The DPs identified in the data distributions achieve large effect sizes (Hedges' g ≥ 0.80) in the pair-wise comparisons. We observe that if the error-rate increases exponentially, it causes DP, resulting in deterioration of prediction performance. We therefore recommend that researchers and practitioners address DP in chronological datasets. Given its potential effects on the datasets, prediction models could be enhanced to get the best results in CVDP. © 2020 IEEE.
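The abstract does not give the detector's exact rules, so the following is only an illustrative sketch of what monitoring an algorithm's error-rate with warning and drift flags can look like. It borrows the thresholds of the well-known Drift Detection Method (DDM), not the paper's own technique. The class name `ErrorRateMonitor`, its parameters, and the reading of the "control" flag as "error-rate within normal bounds" are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ErrorRateMonitor:
    """Hypothetical DDM-style monitor; not the detector from the paper."""
    warning_level: float = 2.0   # assumed: warn at p_min + 2 * s_min
    drift_level: float = 3.0     # assumed: flag drift at p_min + 3 * s_min
    min_samples: int = 30        # assumed warm-up before any flag is raised
    n: int = 0
    errors: int = 0
    p_min: float = math.inf
    s_min: float = math.inf

    def reset(self) -> None:
        # Forget all statistics, e.g. after the model has been retrained.
        self.n, self.errors = 0, 0
        self.p_min, self.s_min = math.inf, math.inf

    def update(self, mistake: bool) -> str:
        """Feed one prediction outcome; return 'control', 'warning' or 'drift'."""
        self.n += 1
        self.errors += int(mistake)
        p = self.errors / self.n                 # running error-rate
        s = math.sqrt(p * (1.0 - p) / self.n)    # its standard deviation
        if self.n < self.min_samples:
            return "control"
        if p + s < self.p_min + self.s_min:      # remember the best state seen
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "control"


# Synthetic outcomes (not data from the paper): a stable 20% error-rate,
# followed by a stretch in which every prediction is wrong.
monitor = ErrorRateMonitor()
stable_flags = [monitor.update(i % 5 == 0) for i in range(200)]
shifted_flags = [monitor.update(True) for _ in range(50)]
```

In this sketch a "warning" flag would suggest starting to buffer recent data, and a "drift" flag would suggest updating or retraining the model, which mirrors the kind of follow-up steps the abstract mentions.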

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2020. p. 882-891, article id 9202527
Keywords [en]
cross-version defect prediction, drift propensity, software defect prediction, streaming data, two-window-based data distributions, Application programs, Defects, Deterioration, Forecasting, Information management, Data distribution, Data management techniques, Defect prediction, Pair-wise comparison, Potential effects, Prediction performance, Subset selection, Predictive analytics
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-20668
DOI: 10.1109/COMPSAC48688.2020.0-154
ISI: 000629086600115
Scopus ID: 2-s2.0-85094120485
ISBN: 9781728173030 (print)
OAI: oai:DiVA.org:bth-20668
DiVA, id: diva2:1498870
Conference
44th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2020, Virtual, Madrid, Spain; 13 July 2020 through 17 July 2020
Available from: 2020-11-05 Created: 2020-11-05 Last updated: 2021-04-30 Bibliographically approved

Authority records

Bennin, Kwabena Ebo
