Creation of a Prototype to Combine and Evaluate Current Intrinsic Plagiarism Detection Algorithms

The goal of this master's thesis is to implement, combine, and evaluate current approaches to intrinsic plagiarism detection, using algorithms developed in the thesis. In addition, a web-based prototype is to be created that allows text documents to be checked interactively for possible plagiarism.

Specifically, this work considers six different approaches to detecting plagiarism, which differ, among other things, in their selection and use of stylometric features. Each approach should be combinable with the others. The evaluation is based on the PAN 2011 test corpus, which includes over 4000 English documents.