Emil Gedda & Kalle Lindqvist, pp. 54. COM/School of Computing, 2011.
Context. Custom solutions to optical character recognition problems can reach higher recognition rates than generic solutions because they are able to exploit the limitations of the problem domain. Such solutions can be generated with genetic algorithms. This thesis evaluates two different chromosome encodings on an optical character recognition problem with a limited problem domain.
Objectives. The main objective of this study is to compare two different chromosome encodings used in a genetic algorithm that generates neural networks for an optical character recognition problem, evaluating their impact both on the evolution of the networks and on the networks produced.
Methods. A systematic literature review was conducted to find genetic chromosome encodings previously used on similar problems. One well-documented chromosome encoding was found. We implemented this chromosome encoding, called binary, as well as a modified version called weighted binary, which was intended to reduce the risk of bad mutations. Both chromosome encodings were evaluated on an optical character recognition problem with a limited problem domain. The experiment was run with two different population sizes, ten and fifty. A baseline for what to consider a good solution to the problem was obtained by implementing a template matching classifier on the same dataset. Template matching was chosen since it is used in existing solutions to the same problem.
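As an illustration of the kind of baseline described above, the following is a minimal sketch of a template matching classifier. It is not the implementation used in the thesis: the pixel-agreement score, the 3x3 toy images, and the names match_score, classify, recognition_rate and TEMPLATES are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: characters are binary images stored as rows of 0/1 pixels,
# and every image has the same dimensions as every template.
Image = Sequence[Sequence[int]]


def match_score(image: Image, template: Image) -> float:
    """Return the fraction of pixels at which image and template agree."""
    total = 0
    agree = 0
    for image_row, template_row in zip(image, template):
        for pixel, template_pixel in zip(image_row, template_row):
            total += 1
            if pixel == template_pixel:
                agree += 1
    return agree / total


def classify(image: Image, templates: Dict[str, Image]) -> str:
    """Return the label of the template with the highest match score."""
    return max(templates, key=lambda label: match_score(image, templates[label]))


def recognition_rate(samples: List[Tuple[Image, str]],
                     templates: Dict[str, Image]) -> float:
    """Return the fraction of labeled samples that are classified correctly."""
    correct = sum(1 for image, label in samples
                  if classify(image, templates) == label)
    return correct / len(samples)


# Hypothetical 3x3 templates, far smaller than real character images.
TEMPLATES: Dict[str, Image] = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
}

# An "I" with one noisy pixel in the bottom-right corner.
noisy_i: Image = [[0, 1, 0],
                  [0, 1, 0],
                  [0, 1, 1]]

if __name__ == "__main__":
    print(classify(noisy_i, TEMPLATES))
    print(recognition_rate([(noisy_i, "I"), (TEMPLATES["L"], "L")], TEMPLATES))
```

A recognition rate computed this way on the shared dataset gives a reference point against which the rates of the evolved networks can be compared.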
Results. Both encodings reached good results compared to the baseline. The weighted binary encoding reduced the problem of bad mutations that occurred with the binary encoding. However, it also had a negative impact on the ability to find the best networks. With a small population, the weighted binary encoding was more prone to inbreeding than the binary encoding. The best network generated using the binary encoding had a 99.65% recognition rate, while the best network generated using the weighted binary encoding had a 99.55% recognition rate.
Conclusions. We conclude that it is possible to generate many good solutions for an optical character recognition problem with a limited problem domain. Even though the design of the chromosome encoding can reduce the risk of bad mutations in a genetic algorithm generating neural networks for optical character recognition, doing so may be more harmful than not doing so.