Using Normalized Compression Distance for Classifying File Fragments
| Document type: | Conference Papers |
|---|---|
| Peer reviewed: | Yes |
| Author(s): | Stefan Axelsson |
| Title: | Using Normalized Compression Distance for Classifying File Fragments |
| Conference name: | 5th International Conference on Availability, Reliability and Security |
| Year: | 2010 |
| Pagination: | 641-646 |
| ISBN: | 978-0-7695-3965-2 |
| Publisher: | IEEE |
| City: | Cracow |
| URI/DOI: | 10.1109/ARES.2010.100 |
| ISI number: | 000278197800098 |
| Organization: | Blekinge Institute of Technology |
| Department: | School of Computing (Sektionen för datavetenskap och kommunikation), S-371 79 Karlskrona, +46 455 38 50 00, http://www.bth.se/com |
| Language: | English |
| Abstract: | We have applied the generalised and universal distance measure NCD (Normalised Compression Distance) to the problem of determining the types of file fragments via example. A corpus of files that can be redistributed to other researchers in the field was developed, and the NCD algorithm, using k-nearest-neighbour as the classification algorithm, was applied to a random selection of file fragments. The experiment covered circa 2000 fragments from 17 different file types. While the overall accuracy of the n-valued classification only improved on the prior probability of the class from approximately 6% to circa 50%, the classifier reached accuracies of 85%-100% for the most successful file types. |
| Subject: | Software Engineering\General |
| Keywords: | FOR-DMIN, APP-IDEN |
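
The approach summarised in the abstract can be illustrated with a minimal sketch. NCD is commonly defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of s. The abstract does not state which compressor, fragment size, or value of k was used, so the choice of `zlib`, the default `k=1`, and the helper names `compressed_size`, `ncd`, and `classify` below are assumptions made for illustration only, not the paper's implementation.

```python
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    # C(s): length of the compressed representation.
    # zlib is an assumed stand-in; the abstract names no compressor.
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    # NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(fragment: bytes, labelled: list[tuple[bytes, str]], k: int = 1) -> str:
    # k-nearest-neighbour: rank the labelled example fragments by NCD
    # to the unknown fragment, then take a majority vote among the k closest.
    nearest = sorted(labelled, key=lambda item: ncd(fragment, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy examples; real use would draw fragments from a file corpus.
examples = [
    (b"the quick brown fox jumps over the lazy dog " * 10, "text"),
    (bytes(range(256)) * 2, "binary"),
]
unknown = b"the lazy dog jumps over the quick brown fox " * 10
predicted = classify(unknown, examples, k=1)
```

In this sketch a fragment is assigned the label of the example it compresses best together with, which is the intuition behind using a compression distance for classification by example.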












