Informed Software Installation through License Agreement Categorization

Document type: Conference Papers
Peer reviewed: Yes
Author(s): Anton Borg, Martin Boldt, Niklas Lavesson
Title: Informed Software Installation through License Agreement Categorization
Conference name: Information Security for South Africa
Year: 2011
ISBN: 978-1-4577-1482-5
Publisher: IEEE Press
City: Johannesburg
Organization: Blekinge Institute of Technology
Department: School of Computing (Sektionen för datavetenskap och kommunikation)
School of Computing S-371 79 Karlskrona
+46 455 38 50 00
http://www.bth.se/com
Authors e-mail: anton.borg@bth.se, martin.boldt@bth.se, niklas.lavesson@bth.se
Language: English
Abstract: Spyware detection can be achieved by using machine learning techniques that identify patterns in the End User License Agreements (EULAs) presented by application installers. However, existing solutions have required manual input from the user, with varying degrees of accuracy. We have implemented an automatic prototype for extraction and classification and used it to generate a large data set of EULAs. This data set is used to compare four different machine learning algorithms when classifying EULAs. Furthermore, the effect of feature selection is investigated, and for the two best-performing algorithms we investigate whether parameter tuning improves performance. Our conclusion is that feature selection and parameter tuning are of limited use in this context, yielding only small performance gains. However, both the Bagging and the Random Forest algorithms show promising results, with Bagging reaching an AUC measure of 0.997 and a False Negative Rate of 0.062. This shows the applicability of License Agreement Categorization for realizing informed software installation.
Subject: Computer Science\Artificial Intelligence
Computer Science\Electronic security
Computer Science\General
Keywords: Parameter tuning, EULA analysis, Spyware, Automated detection