Accurate Adware Detection using Opcode Sequence Extraction

Document type: Conference Papers
Peer reviewed: Yes
Author(s): Raja Khurram Shahzad, Niklas Lavesson, Henric Johnson
Title: Accurate Adware Detection using Opcode Sequence Extraction
Conference name: Sixth International Conference on Availability, Reliability and Security
Year: 2011
Pagination: 189-195
ISBN: 978-0-7695-4485-4/11
Publisher: IEEE Press
City: Vienna
URI/DOI: 10.1109/ARES.2011.35
Organization: Blekinge Institute of Technology
Department: School of Computing (Sektionen för datavetenskap och kommunikation)
School of Computing S-371 79 Karlskrona
+46 455 38 50 00
Language: English
Abstract: Adware represents a possible threat to the security and privacy of computer users. Traditional signature-based and heuristic-based methods have not proven successful at detecting this type of software. This paper presents an adware detection approach based on the application of data mining to disassembled code. The main contributions of the paper are a large publicly available adware data set, an accurate adware detection algorithm, and an extensive empirical evaluation of several candidate machine learning techniques that can be used in conjunction with the algorithm. We extracted sequences of opcodes from adware and benign software and then applied feature selection, using different configurations, to obtain 63 data sets. Six data mining algorithms were evaluated on these data sets in order to find an efficient and accurate detector. Our experimental results show that the proposed approach can accurately detect both novel and known adware instances, even though the binary difference between adware and legitimate software is usually small.
Subject: Computer Science\Artificial Intelligence
Computer Science\Electronic security
Computer Science\General
Keywords: Data Mining, Adware Detection, Binary Classification, Static Analysis, Disassembly, Instruction Sequences
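
Illustrative sketch: The abstract describes extracting opcode sequences from disassembled programs and applying feature selection before classification. The Python below is a minimal sketch of that general idea, not the authors' implementation. The function names, the sequence length n, the use of document frequency as the selection criterion, and the toy opcode lists are all assumptions made for illustration; this record does not state which sequence sizes or selection measures the paper used.

```python
from collections import Counter


def opcode_ngrams(opcodes, n):
    """Return every contiguous opcode sequence of length n as a tuple."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def select_features(programs, n, top_k):
    """Keep the top_k n-grams ranked by document frequency.

    Document frequency (how many programs contain the n-gram) is an
    assumed stand-in for whatever selection measure the paper applies.
    """
    doc_freq = Counter()
    for opcodes in programs:
        # Count each n-gram once per program.
        doc_freq.update(set(opcode_ngrams(opcodes, n)))
    # Sort by frequency (descending), then alphabetically so ties are deterministic.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def to_vector(opcodes, features, n):
    """Encode one program as a binary presence vector over the selected n-grams."""
    present = set(opcode_ngrams(opcodes, n))
    return [1 if gram in present else 0 for gram in features]


# Hypothetical opcode listings, as a disassembler might produce them.
programs = [
    ["push", "mov", "call", "pop"],
    ["push", "mov", "add", "ret"],
]
features = select_features(programs, n=2, top_k=3)
vectors = [to_vector(p, features, n=2) for p in programs]
```

The resulting vectors could then be passed, together with adware/benign labels, to any standard classifier; which learners perform best is the empirical question the paper evaluates.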