Detection of Spyware by Mining Executable Files
Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation.
2009 (English). Independent thesis, advanced level (degree of Master). Student thesis (degree project)
Abstract [en]

Malicious programs are a serious threat to the confidentiality, integrity, and availability of a system. Considerable research has been devoted to detecting them, yielding two main approaches: signature-based detection and heuristic-based detection. These approaches perform well against known malicious programs but cannot catch new ones. Researchers have therefore looked for new detection methods; applying data mining and machine learning is one of them and has shown good results compared with other approaches. A newer category of malicious programs, called spyware, has gained momentum. Spyware is especially dangerous to the confidentiality of a user's private data, since it may collect the data and send it to a third party. Traditional techniques have not performed well in detecting spyware, so new detection methods are needed. Data mining and machine learning have shown promising results in detecting other malicious programs but have not yet been used to detect spyware. We therefore decided to employ data mining for spyware detection. We used a data set of 137 files, of which 119 are benign and 18 are spyware. A theoretical taxonomy of spyware is created, but the experiment uses only two classes, Benign and Spyware. We developed an application, Binary Feature Extractor, which extracts features, called n-grams, of different sizes using a common feature-based approach and a frequency-based approach. The number of features was reduced, and the remaining features were used to create an ARFF file, which serves as input to WEKA for applying machine learning algorithms. The algorithms used in the experiment are J48, Random Forest, JRip, SMO, and Naive Bayes. Classifier performance is evaluated using 10-fold cross-validation and the area under the ROC curve. We performed experiments with three n-gram sizes: 4, 5, and 6.
The results show that the common feature-based extraction approach produced better results than the others. We achieved an overall accuracy of 90.5 % with an n-gram size of 6 using the J48 classifier. The maximum area under the ROC curve was 83.3 %, achieved with Random Forest.
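
The feature-extraction and ARFF steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's Binary Feature Extractor: the function names, the hex encoding of n-grams, the binary present/absent attributes, and the `min_files` cut-off are all assumptions made for illustration, since the abstract does not specify how common features are selected or how the features are reduced.

```python
from collections import Counter
from typing import List, Tuple


def byte_ngrams(data: bytes, n: int) -> Counter:
    """Count every overlapping n-byte window in `data`, keyed by its hex string."""
    return Counter(data[i:i + n].hex() for i in range(len(data) - n + 1))


def common_features(files: List[bytes], n: int, min_files: int) -> List[str]:
    """Keep n-grams found in at least `min_files` distinct files.

    Assumption: this document-frequency cut-off is a stand-in for the
    selection rule, which the abstract does not define.
    """
    doc_freq: Counter = Counter()
    for data in files:
        # set(...) so that each file contributes at most once per n-gram
        doc_freq.update(set(byte_ngrams(data, n)))
    return sorted(g for g, count in doc_freq.items() if count >= min_files)


def to_arff(relation: str, features: List[str],
            samples: List[Tuple[bytes, str]], n: int) -> str:
    """Render samples as ARFF text with one {0,1} attribute per selected n-gram."""
    lines = [f"@RELATION {relation}", ""]
    lines += [f"@ATTRIBUTE ng_{g} {{0,1}}" for g in features]
    lines += ["@ATTRIBUTE class {Benign,Spyware}", "", "@DATA"]
    for data, label in samples:
        present = byte_ngrams(data, n)
        row = ["1" if g in present else "0" for g in features]
        lines.append(",".join(row + [label]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy byte strings standing in for executable files (hypothetical data).
    samples = [(b"abcde", "Benign"), (b"xabcd", "Benign"), (b"zzzzz", "Spyware")]
    selected = common_features([data for data, _ in samples], n=4, min_files=2)
    print(to_arff("spyware_ngrams", selected, samples, n=4))
```

The resulting text can be saved with an `.arff` extension and opened in WEKA, where the classifiers and 10-fold cross-validation named in the abstract are applied.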

Place, publisher, year, edition, pages
2009. p. 52
Keywords [en]
Spyware Detection, Data Mining, Machine Learning, Feature Extraction, WEKA, ARFF
National subject category
Computer Sciences; Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:bth-3095
Local ID: oai:bth.se:arkivex18D18DDACF0ED590C12575D80035524A
OAI: oai:DiVA.org:bth-3095
DiVA, id: diva2:830393
Uppsök
Physics/Chemistry/Mathematics
Supervisors
Note
+46709325761, +46762782550
Available from: 2015-04-22. Created: 2009-06-17. Last updated: 2018-01-11. Bibliographically approved.

Open Access in DiVA

fulltext (387 kB), 547 downloads
File information
File name: FULLTEXT01.pdf. File size: 387 kB. Checksum (SHA-512):
6fa08199b679460c1eaae379cf9367adf9f21691eefb0cd52653fa99b2d1d5f27caf6f741ee48e7e79dceab364f22c630364c5fe03c943e28151c3feacf6031f
Type: fulltext. Mimetype: application/pdf
