Inlämning av Examensarbete / Submission of Thesis

Mikael Hellborg Lapajne; Daniel Slat. MSE-2010:18, pp. 39. COM/School of Computing, 2010.

The work

Författare / Author: Mikael Hellborg Lapajne, Daniel Slat
mikelapajne@gmail.com, daniel.slat@gmail.com
Titel / Title: Random Forests for CUDA GPUs
Abstrakt / Abstract:

Context. Machine learning is a complex and resource-consuming process that requires a lot of computing power. With the constant growth of information, the need for efficient, high-performance algorithms is increasing. Today's commodity graphics cards are parallel multiprocessors with high computing capacity at an attractive price, and they are usually pre-installed in new PCs. These graphics cards provide an additional resource that can be used in machine learning applications. The Random Forest learning algorithm, which has been shown to be competitive within machine learning, has good potential for a performance increase through parallelization.
Objectives. In this study we implement and review a revised Random Forest algorithm for GPU execution using CUDA.
Methods. Previous work in the area has been reviewed by studying articles from several sources, including Compendex, Inspec, IEEE Xplore, ACM Digital Library and Springer Link. Additional information on GPU architecture and implementation-specific details has been obtained mainly from documentation available from Nvidia and from the Nvidia developer forums.
The implemented algorithm has been benchmarked and compared with two state-of-the-art CPU implementations of the Random Forest algorithm, with regard to both the time consumed for training and classification and the classification accuracy.
Results. Benchmark measurements for the three algorithms have been gathered, showing their performance on two publicly available data sets.
Conclusion. We conclude that our implementation is able to outperform its competitors under the right conditions. We also conclude that this holds only for certain data sets, depending on their size. Moreover, we conclude that there is potential for further improvement of the algorithm, both in performance and in adaptation to a wider range of real-world applications.

Ämnesord / Subject: Datavetenskap - Computer Science\Software Engineering

Nyckelord / Keywords: CUDA, Random forests, Parallel computing, Graphics processing units

Publication info

Dokument id / Document id:
Program / Programme: Civilingenjör i datateknik, programvaruteknik
Registreringsdatum / Date of registration: 06/15/2010
Uppsatstyp / Type of thesis: Masterarbete/Master's Thesis (120 credits)

Context

Handledare / Supervisor: Håkan Grahn
hakan.grahn@bth.se
Examinator / Examiner: Tony Gorschek
Organisation / Organisation: Blekinge Institute of Technology
Institution / School: COM/School of Computing

+46 455 38 50 00
Anmärkningar / Comments:

Mikael: +46768539263,
Daniel: +46703040693

Files & Access

Bifogad uppsats fil(er) / Files attached: random.forests.for.cuda.gpus.pdf (1326 kB)