Hannes Tribus MSE-2010-16, pp. 55. COM/School of Computing, 2010.
Delivering fault free code is the clear goal of each devel- oper, however the best method to achieve this aim is still an open question. Despite that several approaches have been proposed in literature there exists no overall best way. One possible solution proposed recently is to combine static source code analysis with the discipline of machine learn- ing. An approach in this direction has been defined within this work, implemented as a prototype and validated subse- quently. It shows a possible translation of a piece of source code into a machine learning algorithm’s input and further- more its suitability for the task of fault detection. In the context of the present work two prototypes have been de- veloped to show the feasibility of the presented idea. The output they generated on open source projects has been collected and used to train and rank various machine learn- ing classifiers in terms of accuracy, false positive and false negative rates. The best among them have subsequently been validated again on an open source project. Out of the first study at least 6 classifiers including “MultiLayerPer- ceptron”, “Ibk” and “ADABoost” on a “BFTree” could convince. All except the latter, which failed completely, could be validated in the second study. Despite that the it is only a prototype, it shows the suitability of some machine learning algorithms for static source code analysis.