Large Language Models for Software Development: Evaluating the Feasibility of Local Large Language Models for Code Generation

Buscaglia Uchaneishvili, Jordi

Title:	Large Language Models for Software Development: Evaluating the Feasibility of Local Large Language Models for Code Generation
Language:	English
Authors:	Buscaglia Uchaneishvili, Jordi
Keywords:	Retrieval-Augmented Generation; Large Language Model; User Interface; Continuous Bag of Words; Continuous Skip-gram Model
Issue Date:	3-Dec-2025
Abstract:	The ability of Large Language Models (LLMs) to translate natural language into code has improved significantly over recent model generations. Existing tools such as GitHub Copilot show that this capability can be extremely helpful, but their functionality is accessible only through external providers. This can pose a problem for companies with strict data protection or compliance requirements. Local LLMs are a possible alternative: models that run locally and generate code using a company's internal code repositories can be particularly beneficial for such companies.
The aim of this bachelor thesis is therefore to investigate the feasibility of using local LLMs for code generation. It investigates the extent to which Retrieval-Augmented Generation (RAG) can be used to retrieve the necessary code from a company's internal repository as context, from which the LLM then generates code. It also explores how RAG can enhance the usability and relevance of LLM-generated outputs. Additionally, the thesis investigates the extent to which LLMs can be fine-tuned on examples from private code repositories, here the code and documentation of Filament, Philips’ internal repository for User Interface (UI) components. Finally, it examines how the combination of RAG and fine-tuning on a company's private repository can be leveraged to maximize performance on code generation tasks.
URI:	https://hdl.handle.net/20.500.12738/18438
Institute:	Department Fahrzeugtechnik und Flugzeugbau (former, dissolved October 2025); Fakultät Technik und Informatik (former, dissolved October 2025)
Type:	Thesis
Thesis type:	Bachelor Thesis
Advisor:	Wilke, Robin
Referee:	Islam, Sami
Appears in Collections:	Theses