Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports
This project explores the use of natural language processing techniques, specifically Large Language Models (LLMs), for fundamental stock analysis by leveraging qualitative data in corporate financial reports and disclosures. It addresses the challenge of information overload faced by retail inve...
| Main Author: | Ting, Jun Jing |
|---|---|
| Format: | Final Year Project / Dissertation / Thesis |
| Published: | 2025 |
| Subjects: | T Technology (General) |
| Online Access: | http://eprints.utar.edu.my/7241/1/fyp_CS_2025_TJJ.pdf http://eprints.utar.edu.my/7241/ |
| _version_ | 1854094495457476608 |
|---|---|
| author | Ting, Jun Jing |
| author_facet | Ting, Jun Jing |
| author_sort | Ting, Jun Jing |
| building | UTAR Library |
| collection | Institutional Repository |
| content_provider | Universiti Tunku Abdul Rahman |
| content_source | UTAR Institutional Repository |
| continent | Asia |
| country | Malaysia |
| description | This project explores the use of natural language processing techniques, specifically Large Language Models (LLMs), for fundamental stock analysis by leveraging qualitative data in corporate financial reports and disclosures. It addresses the challenge of information overload faced by retail investors by automating the collection, processing, and interpretation of fundamental data. The system employs a multi-agent architecture integrating web scraping of financial reports, LLM-based report processing of lengthy documents, and embedding the resulting processed data into a vector database to enable semantic search and efficient information retrieval. Using vector embeddings and retrieval-augmented generation, the system acts as a “virtual analyst” that retrieves relevant information and synthesizes coherent responses to complex investor queries about a company’s fundamentals. The results demonstrate that this LLM-driven approach efficiently distills key insights from enormous unstructured texts, thereby making qualitative analysis more accessible and bridging the gap in analytical capability for retail investors. The project provided a functional proof of concept and highlighted opportunities for further improvements, including expanding data sources, improving summary accuracy, and strengthening the system’s real-time information integration capabilities. |
| format | Final Year Project / Dissertation / Thesis |
| id | my-utar-eprints.7241 |
| institution | Universiti Tunku Abdul Rahman |
| publishDate | 2025 |
| record_format | eprints |
| spelling | my-utar-eprints.7241 2025-12-29T09:57:58Z Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports Ting, Jun Jing T Technology (General) This project explores the use of natural language processing techniques, specifically Large Language Models (LLMs), for fundamental stock analysis by leveraging qualitative data in corporate financial reports and disclosures. It addresses the challenge of information overload faced by retail investors by automating the collection, processing, and interpretation of fundamental data. The system employs a multi-agent architecture integrating web scraping of financial reports, LLM-based report processing of lengthy documents, and embedding the resulting processed data into a vector database to enable semantic search and efficient information retrieval. Using vector embeddings and retrieval-augmented generation, the system acts as a “virtual analyst” that retrieves relevant information and synthesizes coherent responses to complex investor queries about a company’s fundamentals. The results demonstrate that this LLM-driven approach efficiently distills key insights from enormous unstructured texts, thereby making qualitative analysis more accessible and bridging the gap in analytical capability for retail investors. The project provided a functional proof of concept and highlighted opportunities for further improvements, including expanding data sources, improving summary accuracy, and strengthening the system’s real-time information integration capabilities. 2025-06 Final Year Project / Dissertation / Thesis NonPeerReviewed application/pdf http://eprints.utar.edu.my/7241/1/fyp_CS_2025_TJJ.pdf Ting, Jun Jing (2025) Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports. Final Year Project, UTAR. http://eprints.utar.edu.my/7241/ |
| spellingShingle | T Technology (General) Ting, Jun Jing Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports |
| title | Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports |
| title_full | Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports |
| title_fullStr | Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports |
| title_full_unstemmed | Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports |
| title_short | Fundamental stock analysis with LLMs and qualitative data: development of a vector database for company reports |
| title_sort | fundamental stock analysis with llms and qualitative data: development of a vector database for company reports |
| topic | T Technology (General) |
| url | http://eprints.utar.edu.my/7241/1/fyp_CS_2025_TJJ.pdf http://eprints.utar.edu.my/7241/ |
| url_provider | http://eprints.utar.edu.my |
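
The description field above outlines a pipeline in which processed report text is embedded into a vector database, searched semantically, and passed to an LLM as retrieved context. The sketch below illustrates only that retrieval step and is not taken from the thesis. Everything in it is an assumption made for illustration: the names `embed`, `cosine`, `ToyVectorStore`, and `build_prompt` are hypothetical, the word-count "embedding" stands in for a learned embedding model, the in-memory list stands in for a real vector database, and the sample report sentences are invented.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        # Store each text chunk alongside its embedding.
        self._items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank stored chunks by similarity to the query and keep the top k.
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, store: ToyVectorStore, k: int = 2) -> str:
    """Assemble a retrieval-augmented prompt; the LLM call itself is omitted."""
    context = "\n".join(f"- {c}" for c in store.search(question, k))
    return f"Answer using only the context below.\n{context}\nQuestion: {question}"


# Invented sample sentences, standing in for processed report chunks.
store = ToyVectorStore()
store.add("Revenue grew on stronger export demand for palm oil products.")
store.add("The board declared a final dividend for the financial year.")
store.add("Management flagged rising raw material costs as a key risk.")

question = "What risks did management flag?"
top_chunk = store.search(question, k=1)[0]
prompt = build_prompt(question, store, k=1)
```

With a real embedding model the match would rest on meaning rather than shared words; here the query reaches the third chunk only because both contain the word "management", which is the main limitation of the toy embedding.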
