The Normalised Compression Distance as a File Fragment Classifier
2010 (English). In: Digital Investigation. The International Journal of Digital Forensics and Incident Response, ISSN 1742-2876, E-ISSN 1873-202X, Vol. 7, no Suppl 1, p. S24-S31. Article in journal (Refereed), Published
Abstract [en]

We have applied the generalised and universal distance measure NCD (Normalised Compression Distance) to the problem of determining the type of file fragments. To enable later comparison of the results, the algorithm was applied to fragments of a publicly available corpus of files. The NCD algorithm, in conjunction with k-nearest-neighbour (k ranging from one to ten) as the classification algorithm, was applied to a random selection of circa 3000 512-byte file fragments from 28 different file types. This procedure was then repeated ten times. While the overall accuracy of the n-valued classification only improved on the prior probability of approximately 3.5% to reach circa 32%–36%, the classifier reached accuracies of circa 70% for the most successful file types. A prototype of a file fragment classifier was then developed and evaluated on a new set of data (from the same corpus). Circa 3000 fragments were selected at random, and the experiment was repeated five times. This prototype classifier remained successful at classifying individual file types, with accuracies ranging from only slightly lower than 70% for the best class down to accuracies similar to those in the prior experiment.
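
The abstract names the method but gives no implementation detail, so the following is only a minimal sketch of the general idea, not the paper's code. It uses the standard NCD definition, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is a compressed length. Using zlib as the compressor is an assumption (the abstract does not say which compressor was used), and the function names and toy fragments are hypothetical.

```python
import random
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    # C(x): length of the compressed input. zlib is an assumed stand-in
    # for whichever compressor the paper actually used.
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    # Standard NCD: (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def knn_classify(fragment: bytes, labelled, k: int = 1) -> str:
    # labelled: iterable of (fragment_bytes, file_type) pairs.
    # Rank the labelled fragments by NCD to the query and take a
    # majority vote among the k nearest.
    ranked = sorted(labelled, key=lambda item: ncd(fragment, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 512-byte toy fragments, not taken from the paper's corpus.
rng = random.Random(0)
text_a = (b"the quick brown fox jumps over the lazy dog. " * 12)[:512]
query = (b"the lazy dog jumps over the quick brown fox. " * 12)[:512]
noise = bytes(rng.getrandbits(8) for _ in range(512))

training = [(text_a, "text"), (noise, "random")]
predicted = knn_classify(query, training, k=1)
```

With real data, `training` would hold labelled 512-byte fragments of each file type, and `k` would be varied over the range the abstract mentions (one to ten).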

Place, publisher, year, edition, pages
Elsevier, 2010. Vol. 7, no Suppl 1, p. S24-S31
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-7671
DOI: 10.1016/j.diin.2010.05.004
ISI: 000281010700004
Local ID: oai:bth.se:forskinfoB3816A81E8404C23C1257806003EA7EF
OAI: oai:DiVA.org:bth-7671
DiVA, id: diva2:835315
Available from: 2012-09-18. Created: 2010-12-27. Last updated: 2018-01-11. Bibliographically approved.

Open Access in DiVA

fulltext (136 kB), 431 downloads
File information
File name: FULLTEXT01.pdf
File size: 136 kB
Checksum (SHA-512): da927c199c9d023eac60db8ec33ce453ff85b42b4ad65b33da1efeafbff41234d1ada49bee0fa18693930b31afca3762b4da3744894a567c275428076430d93f
Type: fulltext
Mimetype: application/pdf
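
A downloaded copy of the full text can be checked against the SHA-512 checksum listed above. The sketch below is one way to do that with Python's standard `hashlib` module. The helper names are hypothetical, and the file path is supplied by the caller.

```python
import hashlib

# Checksum copied from the file information in this record.
EXPECTED_SHA512 = (
    "da927c199c9d023eac60db8ec33ce453ff85b42b4ad65b33da1efeafbff41234"
    "d1ada49bee0fa18693930b31afca3762b4da3744894a567c275428076430d93f"
)


def sha512_of_file(path: str, chunk_size: int = 65536) -> str:
    # Hash the file in chunks so large files need not fit in memory.
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path: str) -> bool:
    # True when the file's SHA-512 equals the checksum in this record.
    return sha512_of_file(path) == EXPECTED_SHA512
```

For example, `matches_record("FULLTEXT01.pdf")` should return `True` only if the local file is byte-for-byte identical to the one the record describes.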

Authority records

Axelsson, Stefan
