Sandra Olandersson MCS-2003:12, pp. 62. Inst. för programvaruteknik och datavetenskap/Dept. of Software Engineering and Computer Science, 2003.
The classification of protein sequences is a subfield of Bioinformatics that attracts substantial interest today. Machine Learning algorithms are believed to be able to improve the performance of the classification phase.
This thesis considers the application of different Machine Learning algorithms to the problem of classifying a data set of short-chain dehydrogenase/reductase (SDR) proteins. The classification concerns both the division of the proteins into the two main families, Classic and Extended, and their further division into subfamilies. The results of the different algorithms are compared in order to select the most appropriate algorithm for this particular classification problem.