Comparing state-of-the-art machine learning malware detection methods on Windows
2021 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Background. Malware has been a major issue for years. Traditional signature-scanning methods for detecting it are outdated and can be bypassed by most advanced malware. With machine learning, patterns in malware behavior and structure can be learned in order to detect the more advanced threats that are active today.

Objectives. This thesis surveys existing research to identify state-of-the-art machine learning methods for detecting malware. A dataset collection method is also drawn from that research for use in an experiment. Three selected methods are re-implemented and compared experimentally to determine which performs best. All three algorithms are trained and tested on the same dataset.

Methods. A literature review using the snowballing technique was conducted to find the state-of-the-art detection methods. The malware was collected from the malware database VirusShare, and the total number of samples was 14924. The algorithms were re-implemented, trained, tested, and compared by accuracy and by their counts of true positives, true negatives, false positives, and false negatives.
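
As an illustration only, and not the thesis's own code, the comparison metrics named above can be computed from binary labels roughly as in the following Python sketch. The function names and the label convention (1 = malware, 0 = benign) are assumptions made for this example.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, TN, FP, FN for binary labels (assumed: 1 = malware, 0 = benign)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1  # malware correctly flagged
        elif truth == 0 and pred == 0:
            counts["tn"] += 1  # benign correctly passed
        elif truth == 0 and pred == 1:
            counts["fp"] += 1  # benign wrongly flagged
        else:
            counts["fn"] += 1  # malware missed
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Accuracy = (TP + TN) / total number of samples."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples to score")
    return (counts["tp"] + counts["tn"]) / total
```

Reporting the four counts alongside accuracy matters for malware detection because a missed malicious file (false negative) and a wrongly flagged benign file (false positive) carry different costs.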

Results. The literature review showed that the best-performing methods in the available research are image detection, N-Gram combined with metadata, and Function Call Graphs. However, a new method called Running Window Entropy was also proposed; little research exists on it, yet it can still achieve decent accuracy. The methods selected for comparison were image detection, N-Gram, and Running Window Entropy, which achieved accuracies of 94.64%, 96.45%, and 93.71% respectively.
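
The abstract does not describe how Running Window Entropy is computed. One common reading of the term is Shannon entropy measured over a window that slides across a file's bytes, and the Python sketch below illustrates that reading only. The window size, step, and function names are assumptions for this example, not values taken from the thesis.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(window: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (range 0.0 to 8.0)."""
    if not window:
        return 0.0
    n = len(window)
    counts = Counter(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def running_window_entropy(data: bytes, window_size: int = 256, step: int = 64) -> List[float]:
    """Entropy of each window of `window_size` bytes, advancing by `step` bytes.

    window_size and step are illustrative defaults (assumptions).
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    if len(data) < window_size:
        # Input shorter than one window: score it as a single window.
        return [shannon_entropy(data)] if data else []
    return [
        shannon_entropy(data[i:i + window_size])
        for i in range(0, len(data) - window_size + 1, step)
    ]
```

Under this reading, the resulting entropy series could serve as a feature vector for a classifier: packed or encrypted regions tend to score close to 8 bits per byte, while padding and plain text score much lower.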

Conclusions. On this dataset, N-Gram had the best performance of the three methods. For the other two methods, the results indicate that either can be applicable depending on the use case.

Place, publisher, year, edition, pages
2021.
Keywords [en]
Malware, Machine Learning, Static Analysis
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-21950
OAI: oai:DiVA.org:bth-21950
DiVA, id: diva2:1577650
Subject / course
DV2572 Master's Thesis in Computer Science
Educational program
DVACO Master's program in computer science 120,0 hp
Available from: 2021-07-05. Created: 2021-07-02. Last updated: 2021-07-05. Bibliographically approved.

Open Access in DiVA

Comparing state-of-the-art machine learning malware detection methods on Windows (1805 kB), 2479 downloads
File information
File name: FULLTEXT02.pdf
File size: 1805 kB
Checksum (SHA-512): f4f9b3072510f65ee0b12e2117dc5c2d0cb36bb7f57bfb815d5f6d7d524968d65e1ac122499eab5a7bd8f45c465ef0f968bc8d44258e6b6f18848ff25b8dc91b
Type: fulltext
Mimetype: application/pdf
