YSU_DSB_thesis
annpetrosiann/YSU_DSB_thesis
This repo contains the full pipeline for my Master's thesis at Yerevan State University (YSU), developed as part of the Data Science for Business master's program. The goal of the project is to build an end-to-end Retrieval-Augmented Generation (RAG) system over Armenian banks’ financial PDFs, combining semantic search, LLMs, and fine-tuned embeddings.
Summary
A Master's thesis project implementing a full Retrieval-Augmented Generation (RAG) pipeline for querying Armenian bank financial documents. It includes PDF processing, text chunking, embedding generation with optional fine-tuning, semantic search via FAISS, and response generation using a local LLM via Ollama. The system is evaluated with standard NLP metrics and is designed as an end-to-end, modular application.
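To make the chunking and retrieval stages concrete, here is a minimal, dependency-free sketch. It is not the code from this repo: the function names (`chunk_text`, `embed`, `retrieve`, `build_prompt`), the chunk sizes, and the sample sentences are all illustrative assumptions. A bag-of-words vector stands in for the real embedding model, and a brute-force cosine search stands in for the FAISS index. The resulting prompt is what would then be passed to the local LLM served by Ollama.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Brute-force top-k search (assumption: a FAISS index would replace this)."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query, retrieved):
    """Assemble the retrieved chunks and the question into one LLM prompt."""
    context = "\n".join(f"- {chunk}" for _, chunk in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    # Invented sample sentences, not taken from any real bank report.
    sample_chunks = [
        "net profit rose in 2023",
        "the bank opened a new branch",
        "loan portfolio grew",
    ]
    question = "net profit 2023"
    print(build_prompt(question, retrieve(question, sample_chunks, k=1)))
```

Overlapping chunks are a common choice because a sentence split across a chunk boundary still appears whole in at least one chunk. In a full pipeline, swapping `embed` for a sentence-embedding model and `retrieve` for a vector index would change the quality of the results but not the overall flow shown here.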