Information Visualization and Machine Learning Applied on Static Code Analysis
Blekinge Institute of Technology, School of Engineering, Department of Systems and Software Engineering.
2008 (English). Independent thesis, advanced level (degree of Master (Two Years)). Student thesis.
Abstract [en]

Software engineers may never see perfect source code in their lifetime, but they are seeing much better analysis tools for finding defects in software. The approaches used in static code analysis have evolved from simple code crawling to the use of statistical and probabilistic frameworks. This work presents a new technique that incorporates machine learning and information visualization into static code analysis. The technique learns patterns in a program’s source code using a normalized compression distance and applies them to classify code fragments as faulty or correct. Since the classification is frequently imperfect, the training process plays an essential role. A visualization element is used in the hope that it lets the user better understand the inner state of the classifier, making the learning process transparent. An experimental evaluation is carried out to assess the efficacy of an implementation of the technique, the Code Distance Visualizer. The outcome of the evaluation indicates that the proposed technique is reasonably effective in learning to differentiate between faulty and correct code fragments. The visualization element enables the user to discern when the tool's output is correct and when it is not, and to take corrective action (further training or retraining) interactively until the desired level of performance is reached.
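
The normalized compression distance mentioned in the abstract is commonly defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of s. The following is a minimal sketch of how such a distance could drive a faulty/correct classifier. It is not the Code Distance Visualizer's implementation: the choice of zlib as the compressor, the nearest-neighbour rule, and the names `compressed_size`, `ncd`, and `classify` are all assumptions made for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    # Assumption: zlib at maximum compression stands in for the compressor C.
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    # NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(fragment: str, labelled: list[tuple[str, str]]) -> str:
    # Hypothetical nearest-neighbour rule: return the label of the
    # training fragment with the smallest NCD to the query fragment.
    target = fragment.encode("utf-8")
    nearest = min(labelled, key=lambda pair: ncd(target, pair[0].encode("utf-8")))
    return nearest[1]


# Illustrative training set; the fragments and labels are made up.
training = [
    ("char *p = malloc(n); p[0] = 1;", "faulty"),
    ("char *p = malloc(n); if (p != NULL) { p[0] = 1; }", "correct"),
]
label = classify("char *p = malloc(n); p[0] = 1;", training)
```

With real compressors the distance is only approximate, and short fragments are dominated by compressor overhead, so a practical tool would likely need larger fragments and more training examples than this sketch uses.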

Place, publisher, year, edition, pages
2008. p. 43
Keywords [en]
Software validation, static analyzer, normalized compression distance, source code visualization
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-3033
Local ID: oai:bth.se:arkivexA45EAD63A287CE18C1257476002D87FC
OAI: oai:DiVA.org:bth-3033
DiVA, id: diva2:830330
Uppsok
Technology
Available from: 2015-04-22. Created: 2008-06-28. Last updated: 2018-01-11. Bibliographically approved.

Open Access in DiVA

fulltext (1137 kB), 218 downloads
File information
File name: FULLTEXT01.pdf
File size: 1137 kB
Checksum SHA-512: b7508d55b248d4b0fb7aea9b402210b988e7f1b0a4fe2ba3f9aec1bb08de5295487629521be32f8cc948d4d069882a68b6e9749cb91dee024d7f064a72b79d13
Type: fulltext
Mimetype: application/pdf
