Large-Scale Recognition of References within a Corpus of Scholarly Literature
Citation links, stored in a citation index or bibliographic database, constitute valuable data for many information retrieval purposes.
While manually capturing citation links is costly, extracting them automatically from unstructured digital text documents is challenging.
This thesis presents an approach to automatic reference recognition that employs methods leveraging bibliographic metadata. The suitability of the approach is discussed, and a prototype implementation is presented.
The evaluation of the prototype shows that the links of a large-scale, corpus-restricted citation network can be identified with high effectiveness and efficiency.