David Eriksson MSE-2002:17, pp. 25. Inst. för programvaruteknik och datavetenskap/Dept. of Software Engineering and Computer Science, 2002.
Decompilation, or reverse compilation, takes a computer program and produces high-level code that
works like the original source code. This makes it easier to understand a computer program when source
code is not available. However, there are very few tools for decompilation available today. This report
describes the design and implementation of Desquirr, a decompilation plug-in for Interactive Disassembler
Pro. Desquirr has an object-oriented design and performs basic decompilation of programs running on
Intel x86 processors.
The low-level analysis uses knowledge about specialized compiler constructs, called idioms, to perform
a more accurate decompilation. Desquirr implements data flow analysis, meaning the conversion from
primitive machine code instructions into code in a high-level language. The major part of the data flow
analysis is Register Copy Propagation, which builds high-level expressions from primitive instructions.
Control flow analysis, meaning to restore high-level language constructs such as if/else and for loops, is
A high-level representation of a piece of machine code contains the same information as an assembly
language representation of the same machine code, but in a format that is easier to comprehend. Symbols
such as "*" and "+" are used in high-level language expressions, compared to instructions such as "mul"
and "add" in assembly language. Two small test cases that compare decompiled code with assembly
language show promising results in reducing the amount of information needed to comprehend a program.
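
To illustrate how Register Copy Propagation can turn instructions such as "mul" and "add" into an expression using "*" and "+", the following is a minimal sketch in Python. It is not Desquirr's actual implementation: the two-operand instruction form (op, dst, src), the register set, and the names propagate, OPERATORS and REGISTERS are assumptions made for this example only, and the instructions do not follow exact x86 semantics.

```python
# Simplified sketch of register copy propagation (illustrative only).
# Assumption: instructions use a hypothetical (op, dst, src) form,
# and arithmetic instructions always write to a register.

OPERATORS = {"add": "+", "sub": "-", "mul": "*"}
REGISTERS = {"ax", "bx", "cx", "dx"}


def propagate(instructions):
    """Fold register definitions into their uses; return high-level statements."""
    regs = {}        # register name -> expression it currently holds
    statements = []  # statements emitted for writes to non-register destinations
    for op, dst, src in instructions:
        # If the source is a register with a known expression, substitute it.
        value = regs.get(src, src)
        if op == "mov":
            if dst in REGISTERS:
                regs[dst] = value                       # remember, emit nothing
            else:
                statements.append(f"{dst} = {value}")   # store to a variable
        elif op in OPERATORS:
            left = regs.get(dst, dst)
            regs[dst] = f"({left} {OPERATORS[op]} {value})"
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return statements


program = [
    ("mov", "ax", "a"),
    ("mul", "ax", "b"),
    ("mov", "bx", "c"),
    ("add", "ax", "bx"),
    ("mov", "result", "ax"),
]
print(propagate(program))  # ['result = ((a * b) + c)']
```

In this sketch, five primitive instructions collapse into a single assignment, which is the kind of reduction in information that the test cases above compare.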