Daniel Åkerud; Henrik Rendlo. MSE-2004:21, pp. 67. TEK/avd. för programvaruteknik, 2004.
This thesis addresses questions related to the processing of naturally occurring text, known as natural language processing (NLP). The subject is approached from a software engineering perspective, and the problem description is formulated accordingly. The thesis is divided into two major parts. The first part is a literature study covering fundamental concepts and algorithms. We discuss both serial and parallel architectures, and conclude that different scenarios call for different architectures. The second part is an empirical evaluation of an NLP framework or toolkit, selected from a few candidates, conducted in order to illustrate the theoretical part of the thesis. We argue that component-based development in a portable language could increase reusability in the NLP community, where reuse is currently low. The recent emergence of the initiatives we identified, together with the great potential of many applications in this area, suggests a bright future for NLP.