BUILDING A VIETNAMESE MATH CHATBOT BASED ON RAG AND LLM: SYSTEM DESIGN, IMPLEMENTATION AND EXPERIMENTAL EVALUATION

Authors

  • Pham Van Khanh, Institute of Information Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
  • Pham Vu Anh Tuan, OPT student, Department of Computer Science, DePauw University, Indiana, USA

DOI:

https://doi.org/10.18173/2354-1059.2025-0053

Keywords:

Math Chatbot, generative artificial intelligence, RAG, large language models, Milvus, vLLM, LangGraph, mathematics education

Abstract

In recent years, large language models (LLMs) and Retrieval-Augmented Generation (RAG) techniques have opened up new opportunities for developing intelligent learning assistant systems. Nevertheless, applying LLMs directly to Vietnamese Mathematics still has several limitations, including hallucination, the lack of a domain knowledge base, failure to adhere to the Vietnamese education curriculum, and difficulty in processing problems that involve complex images. This paper presents the design, implementation, and experimental evaluation of a Vietnamese Mathematics Chatbot system based on the RAG architecture combined with an LLM. The system comprises: (i) a pipeline for collecting and standardizing Mathematics data from textbooks, exam questions, and reference materials; (ii) a Milvus vector database that stores embeddings generated by the BGE-m3 model; (iii) a multi-task pipeline coordinated by LangGraph; (iv) an inference component that uses the Qwen3-VL-8B model served via vLLM; and (v) a WebUI user interface that supports multimodal queries (text + image). Experimental results show that the system delivers detailed solutions, maintains the conversation flow, and significantly reduces hallucination compared to a standalone LLM. The system also shows potential for application in teaching and learning Mathematics, especially in situations requiring accurate knowledge retrieval and step-by-step explanations. These results suggest directions for developing specialized learning assistants within the context of Vietnamese education.
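
The retrieve-then-generate flow that the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: a toy bag-of-words vector stands in for the BGE-m3 embeddings, an in-memory list stands in for the Milvus collection, and the final prompt would be passed to the LLM (Qwen3-VL-8B via vLLM in the paper) rather than answered here. The corpus and the names `embed`, `retrieve`, and `build_prompt` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus standing in for the textbook and exam collection.
CORPUS = [
    "The quadratic formula gives the roots of ax^2 + bx + c = 0.",
    "The area of a circle with radius r is pi times r squared.",
    "The Pythagorean theorem states that a^2 + b^2 = c^2 in a right triangle.",
]


def embed(text):
    """Toy bag-of-words vector (assumption: replaces a dense embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, corpus, k=1):
    """Return the k passages most similar to the query (stand-in for a vector search)."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble a grounded prompt; a real system would send this to the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


question = "What is the area of a circle?"
top_passages = retrieve(question, CORPUS, k=1)
prompt = build_prompt(question, top_passages)
print(prompt)
```

Grounding the prompt in retrieved passages is what lets a RAG system constrain the model to curriculum material instead of relying on its parametric memory alone.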

Published

30-12-2025

How to Cite

Van Khanh, P., & Vu Anh Tuan, P. (2025). BUILDING A VIETNAMESE MATH CHATBOT BASED ON RAG AND LLM: SYSTEM DESIGN, IMPLEMENTATION AND EXPERIMENTAL EVALUATION. Journal of Science Natural Science, 70(4), 43-56. https://doi.org/10.18173/2354-1059.2025-0053