Self-learning algorithms applied in Continuous Integration system
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering.
2018 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Context: Continuous Integration (CI) is a software development practice in which developers integrate their code into a shared repository, after which an automated system verifies the code and runs automated test cases to find integration errors. This research uses Ericsson's CI system. The tests performed in CI are regression tests. Based on their time scope, the regression test suites are categorized into hourly and daily test suites. The hourly test is performed on all commits made during the day, whereas the daily test is performed at night on the latest build that passed the hourly test. Both test suites are static, and the hourly test suite is a subset of the daily test suite. Because the daily test runs at the end of the day, its results are available only the next day, which delays feedback to developers about integration errors. To mitigate this problem, this research investigates whether a learning model can be created and integrated into the CI system so that it produces a dynamic hourly test suite for faster feedback.
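As a rough illustration of what a dynamic hourly test suite could look like, the sketch below picks test cases from the daily suite in order of predicted failure probability until an hourly time budget is used up. This is only one possible selection rule and is not taken from the thesis; the test case names, durations, probabilities, and the function `select_hourly_suite` are all hypothetical.

```python
def select_hourly_suite(durations, fail_probability, budget_minutes):
    """Greedily pick test cases most likely to fail that still fit the budget.

    durations: dict of test case name -> run time in minutes (the daily suite)
    fail_probability: dict of test case name -> predicted failure probability
    budget_minutes: total run time allowed for the hourly suite
    """
    # Rank the daily suite from most to least likely to fail.
    ranked = sorted(durations, key=lambda name: fail_probability[name], reverse=True)
    selected, used = [], 0
    for name in ranked:
        # Skip any test case that would exceed the remaining budget.
        if used + durations[name] <= budget_minutes:
            selected.append(name)
            used += durations[name]
    return selected


# Hypothetical daily suite and model output, for illustration only.
daily_durations = {"tc_login": 10, "tc_sync": 25, "tc_export": 15, "tc_billing": 20}
predicted = {"tc_login": 0.05, "tc_sync": 0.60, "tc_export": 0.30, "tc_billing": 0.40}

hourly_suite = select_hourly_suite(daily_durations, predicted, budget_minutes=45)
print(hourly_suite)  # ['tc_sync', 'tc_billing']
```

The selected suite remains a subset of the daily suite, as in the static setup, but its contents would change whenever the predictions change.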

Objectives: This research aims to find a suitable machine learning algorithm for the CI system and to investigate the feasibility of creating self-learning test machinery. This is done by examining the CI system and determining what type of data is required to create a learning model for prioritizing test cases. Once the necessary data is obtained, the selected algorithms are evaluated to find a suitable learning algorithm for the self-learning test machinery. Finally, the research investigates whether the resulting learning model can be integrated into the CI workflow.

Methods: An experiment is conducted to evaluate the learning algorithms. The data is provided by Ericsson AB, Gothenburg, and consists of daily test information and test case results. The algorithms evaluated are Naïve Bayes, support vector machines, and decision trees. The evaluation uses leave-one-out cross-validation, and the performance of each learning algorithm is measured by its prediction accuracy. The resulting accuracies are then compared to find a suitable machine learning algorithm for the CI system.
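The evaluation procedure can be sketched in a few lines. The example below is a minimal, standard-library-only illustration of leave-one-out cross-validation with prediction accuracy as the metric, using a small categorical Naïve Bayes classifier with add-one smoothing as the model under test. It is not the thesis's implementation: the feature columns, the toy records, and the function names are assumptions made for illustration, and the Ericsson dataset is not reproduced here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Fit a categorical Naive Bayes model and return a predict(row) function."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, feature index) -> value counts
    feature_values = defaultdict(set)    # feature index -> values seen in training
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
            feature_values[i].add(value)

    def predict(row):
        best_label, best_score = None, -math.inf
        for label, count in class_counts.items():
            score = math.log(count / len(labels))  # log prior
            for i, value in enumerate(row):
                seen = value_counts[(label, i)][value]
                # Add-one smoothing; the extra +1 leaves room for unseen values.
                score += math.log((seen + 1) / (count + len(feature_values[i]) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    return predict


def leave_one_out_accuracy(train, rows, labels):
    """Hold out each record once, train on the rest, and average the hits."""
    correct = 0
    for i in range(len(rows)):
        predict = train(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        if predict(rows[i]) == labels[i]:
            correct += 1
    return correct / len(rows)


# Hypothetical records: (changed area, failed in previous daily run) -> result.
rows = [("net", "yes"), ("net", "no"), ("ui", "yes"), ("ui", "no")] * 2
labels = ["fail" if failed_before == "yes" else "pass" for _, failed_before in rows]

accuracy = leave_one_out_accuracy(train_naive_bayes, rows, labels)
print(accuracy)  # 1.0 on this deliberately easy toy data
```

Because `leave_one_out_accuracy` only needs a training function that returns a predictor, other classifiers such as a support vector machine or a decision tree could be passed in the same way and their accuracies compared.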

Results: The experiment results show that support vector machines outperformed the Naïve Bayes and decision tree algorithms. However, due to challenges in the current CI system, it is not feasible to integrate the created learning model into the CI. The primary challenge is that a test case failure cannot be mapped to its respective commit (it is not possible to find which commit caused the test case to fail), because the daily test is performed on the latest build, which combines all commits made that day. Another challenge is low data storage, which has led to problems such as the curse of dimensionality and class imbalance.
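To see why class imbalance matters when prediction accuracy is the metric, consider the small sketch below. It uses a made-up label distribution (95 passing and 5 failing test results, not figures from the thesis) and a trivial baseline that always predicts the majority class.

```python
from collections import Counter

# Hypothetical, heavily imbalanced test results.
labels = ["pass"] * 95 + ["fail"] * 5

# Baseline "model": always predict the most common class.
majority_class = Counter(labels).most_common(1)[0][0]
predictions = [majority_class] * len(labels)

# Overall accuracy looks high even though nothing was learned.
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)

# Share of real failures that the baseline actually flags.
failures_found = sum(p == "fail" and t == "fail" for p, t in zip(predictions, labels))
fail_recall = failures_found / labels.count("fail")

print(accuracy)     # 0.95
print(fail_recall)  # 0.0
```

On data like this, a high accuracy alone does not show that failing test cases are being detected, which is one reason imbalance is a concern for test case result classification.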

Conclusions: This research identifies a suitable learning algorithm for creating self-learning test machinery, as well as the challenges of integrating the model into the CI. The experiment results show that support vector machines achieve higher prediction accuracy in test case result classification than Naïve Bayes and decision trees.

Place, publisher, year, edition, pages
2018. p. 30
Keywords [en]
Machine learning, Classification, Continuous Integration, Test case prioritization, Regression Testing
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:bth-16675
OAI: oai:DiVA.org:bth-16675
DiVA, id: diva2:1229394
External cooperation
Ericsson AB, Gothenburg, SE
Subject / course
DV2572 Master's Thesis in Computer Science
Educational program
DVADA Master Qualification Plan in Computer Science
Presentation
2018-01-23, J1620, Blekinge Tekniska Högskola, Karlskrona, 13:00 (English)
Supervisors
Examiners
Available from: 2018-07-02. Created: 2018-06-29. Last updated: 2018-07-02. Bibliographically approved.

Open Access in DiVA

BTH2018Tummala (566 kB), 538 downloads
File information
File name: FULLTEXT02.pdf. File size: 566 kB. Checksum: SHA-512
f266de5b59c2ac0f22d5346f9b88948484a32bc323203da8116a0cb71992c6ce0134fa670a40c57dd54f27a34e11483ad0b7b844e4058e58eefbb58d3fdfa528
Type: fulltext. Mimetype: application/pdf
