Design and Evaluation of a Personalized RAG System for Enterprise Knowledge

Thesis Type Bachelor
Thesis Status Currently running
Student Christopf Haidegger

Large Language Models (LLMs) are increasingly used to address everyday information needs in professional environments. However, their deployment is often restricted by data-protection requirements, and their responses are frequently of limited use because the models have no access to domain-specific internal knowledge.

This thesis develops a Retrieval-Augmented Generation (RAG) chatbot that addresses both challenges by integrating company-internal data into the response generation process. It compares multiple retrieval strategies to determine which deliver the most relevant document passages and how much they improve LLM answer quality relative to a non-RAG setup. All LLM components run on locally deployed models to retain full control over the data.

The prototype is evaluated with internal data from ORF Landesstudio Tirol. User feedback is collected for both RAG-enabled and non-RAG configurations, covering response quality, the time required to obtain the desired information, and overall correctness. This feedback is analyzed to determine which retrieval strategies perform best under realistic working conditions and which factors drive their performance. The results are intended to provide a basis for deciding whether retrieval augmentation is beneficial and, if so, which methods are most appropriate for comparable professional settings. Finally, the system is to be integrated into a fully featured chatbot application that gives ORF editors a personalized, efficient daily tool and supports newsroom workflows with fast, reliable access to internal knowledge in a user-friendly environment.
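To illustrate the difference between a RAG-enabled and a non-RAG configuration, the following is a minimal sketch in Python. It is not the thesis implementation: the lexical cosine retriever, the function names (`retrieve`, `build_prompt`), the prompt wording, and the example passages are all assumptions made for illustration, and the call to a locally deployed model is left out.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the query (toy lexical retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Build the LLM prompt; an empty context yields the non-RAG baseline."""
    if not context:
        return f"Question: {query}\nAnswer:"
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Use only the context below.\n"
        f"Context:\n{joined}\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical passages for illustration only, not data from the thesis.
PASSAGES = [
    "The studio booking form is available on the intranet under Services.",
    "Archive footage requests must be submitted two days in advance.",
    "The cafeteria opens at eight in the morning.",
]

query = "How do I request archive footage?"
rag_prompt = build_prompt(query, retrieve(query, PASSAGES, k=1))
baseline_prompt = build_prompt(query, [])
```

Under these assumptions, swapping `retrieve` for another strategy (for example, an embedding-based retriever) while keeping `build_prompt` fixed is one way the compared configurations could differ only in the retrieval step.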