ADAPTATION OF BIG DATA TO LOCAL INFORMATION LANGUAGE MODELS: DEVELOPMENT OF THE BIGTOR CHATBOT SYSTEM

Authors

  • Farid Rustam Adgozalov, Master's student, Azerbaijan State Oil and Industry University, Baku, Azerbaijan

Keywords:

Large Language Models, Fine-tuning, Synthetic Data, Specialized Chatbots, Cultural Preservation

Abstract

This paper presents the development of BigTor, a domain-specific chatbot designed to address cultural, administrative, and social information gaps in Azerbaijan. To overcome the limitations of general-purpose models in low-resource languages, the DeepSeek-R1-Distill-Llama-8B model was selected as the base model. The system was fine-tuned on a high-quality synthetic dataset using Parameter-Efficient Fine-Tuning (PEFT) methods. Training combined LoRA adaptation, 4-bit quantization, and bfloat16 precision to keep computational cost low. Experimental results show that BigTorV1 achieved 92 percent accuracy in the national music domain, significantly outperforming the baseline model.
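
To make the efficiency claim concrete, the following minimal sketch estimates how few parameters LoRA trains relative to an 8B-parameter base model, and how much weight memory 4-bit storage saves compared with bfloat16. Every number in it is an assumption for illustration: the projection shapes and layer count are those of a Llama-3.1-8B-style model, the rank of 16 is hypothetical, and the set of adapted layers is a guess. None of these values are reported in the abstract.

```python
# Rough arithmetic sketch, not the authors' training configuration.
# All shapes, the layer count, and the LoRA rank are illustrative assumptions.

def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

# Assumed attention projection shapes (in_features, out_features)
# for a Llama-3.1-8B-style model with grouped-query attention.
ATTENTION_PROJECTIONS = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}
NUM_LAYERS = 32               # assumed number of transformer blocks
RANK = 16                     # hypothetical LoRA rank
BASE_PARAMS = 8_000_000_000   # nominal "8B" parameter count

def total_lora_params(rank: int = RANK) -> int:
    """Total adapter parameters when every attention projection gets an adapter."""
    per_layer = sum(
        lora_param_count(d_in, d_out, rank)
        for d_in, d_out in ATTENTION_PROJECTIONS.values()
    )
    return per_layer * NUM_LAYERS

def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Raw weight storage in GB, ignoring quantization metadata and activations."""
    return num_params * bits_per_param / 8 / 1e9

trainable = total_lora_params()
fraction = trainable / BASE_PARAMS

print(f"LoRA trainable parameters: {trainable:,}")
print(f"Share of base model:       {fraction:.4%}")
print(f"Base weights in bfloat16:  {weight_memory_gb(BASE_PARAMS, 16):.1f} GB")
print(f"Base weights in 4-bit:     {weight_memory_gb(BASE_PARAMS, 4):.1f} GB")
```

Under these assumptions the adapters amount to well under one percent of the base model's parameters, and 4-bit storage cuts raw weight memory to a quarter of the bfloat16 figure. This is the general reason the combination is commonly used to fine-tune on a single GPU.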

Published

2026-05-24

How to Cite

Farid Rustam Adgozalov. (2026). ADAPTATION OF BIG DATA TO LOCAL INFORMATION LANGUAGE MODELS: DEVELOPMENT OF THE BIGTOR CHATBOT SYSTEM. Reviews of Modern Science, (13). Retrieved from https://ojs.scipub.de/index.php/RMS/article/view/8735

Section

Technical Sciences