Towards Domain-Grounded Retrieval: Graph Based Prompting for Bug Reports
2025 (English)
Independent thesis Basic level (professional degree), 12 credits / 18 HE credits
Student thesis
Abstract [en]
Background: Retrieval-Augmented Generation (RAG) systems combine retrieval techniques with language models to produce contextually relevant responses. However, these systems often lack domain-specific grounding and interpretability, which are crucial in fields like software engineering and telecommunications.
Objectives: This thesis explores the application of a GraphRAG approach for root cause identification in a software system. It investigates whether integrating a Knowledge Graph (KG) with a Large Language Model (LLM) can enhance the relevance, accuracy, and interpretability of generated responses. Additionally, it examines whether domain experts find value in combining LLMs with an interactive, structured representation of the system under analysis.
Methods: The proposed GraphRAG system is compared to a hybrid black-box RAG system, which combines vector-based retrieval with symbolic methods. While hybrid RAG approaches may enhance retrieval precision, they often lack transparency and domain alignment. In contrast, the proposed GraphRAG system leverages structured knowledge from a Neo4j-based graph to ground retrieval and improve interpretability. In the evaluation, domain experts assessed the outputs of both systems for accuracy, relevance, and interpretability.
Results: The GraphRAG system (System B) outperformed the hybrid vector-based baseline (System A) in tasks such as faulty component detection and identifying recurring issues. Expert feedback emphasized that System B provided better interpretability and stronger domain alignment. However, limitations were noted in cases where inputs were weak or task framing was ambiguous.
Conclusions: Expert feedback confirms the value of structured knowledge integration, although task context and input quality can influence system performance. This work contributes to generative systems for knowledge-intensive domains by demonstrating that structured retrieval can enhance their outputs.
Place, publisher, year, edition, pages
2025, p. 52
Keywords [en]
Large Language Models, Bug Reports, Retrieval-Augmented Generation, Domain-Grounded Retrieval
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:bth-28162
OAI: oai:DiVA.org:bth-28162
DiVA, id: diva2:1974105
External cooperation
Ericsson
Subject / course
DV1583 Degree Project for Bachelor of Science in Engineering (Computer Science)
Educational program
Bachelor of Science in Engineering: Computer Security
Supervisors
Examiners
2025-07-04 2025-06-21 2026-03-30 Bibliographically approved