NATURAL LANGUAGE PROCESSING APPROACH TO DETECTING SPONSORSHIP OF VIDEO CONTENT

Authors

  • Askarbek Aslan, Master's student, Kazakh-British Technical University, Faculty of IT and Engineering, Almaty, Kazakhstan. ORCID: 0009-0009-1062-1314

Keywords:

Artificial Intelligence, AI, NLP, Natural Language Processing, ML, Machine learning, data science, data mining, content detection

Abstract

This study examines whether NLP software can automatically process video content in order to analyze the intention behind video messages and how society, the end consumer of those messages, perceives and reacts to them. The main goal is to extend NLP approaches to identifying sponsorship details in video content from the YouTube platform. The theoretical and practical significance of the proposed methods lies in the use of several metrics and models; the worked example describes how a model is trained and used to recognize and read the sponsored block of a video. The example addresses several relevant problems related to sponsorship information. The novelty of the research lies in a way of improving existing basic NLP models, which are designed to solve basic problems but require separate, task-specific approaches for this kind of NLP solution. In particular, two models were trained: a baseline model using the XGBoost algorithm and a customized pre-trained BERT model. The boosting-based model was trained on Term Frequency-Inverse Document Frequency (TF-IDF) vector representations of words, with the TF-IDF fitted on labeled sponsored sentences; this approach proved intuitive for understanding the common words present in the target sentences of the study. The BERT model was trained on data in which the unsponsored class was under-sampled, using a pre-trained BERT tokenizer, for several epochs. The results were shown to be consistent, with a meaningful resulting probability of over 70%.
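The two preprocessing ideas named in the abstract, TF-IDF vectorization of sentences and under-sampling of the unsponsored class, can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: the toy sentences, the labels, the sponsor name "ExampleVPN", and the helper names (tokenize, fit_idf, tfidf_vector, undersample) are all hypothetical, the smoothed IDF formula is one common variant, and the XGBoost and BERT training steps are omitted.

```python
import math
import random
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(sentence):
    # Lowercase, split on whitespace, strip surrounding punctuation.
    tokens = (w.strip(PUNCT).lower() for w in sentence.split())
    return [t for t in tokens if t]

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(doc, idf):
    # Sparse TF-IDF vector (term -> weight), L2-normalized.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def undersample(sentences, labels, seed=0):
    # Randomly drop examples from the larger class until both classes match.
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(pos), len(neg))
    keep = sorted(rng.sample(pos, k) + rng.sample(neg, k))
    return [sentences[i] for i in keep], [labels[i] for i in keep]

# Hypothetical toy data: 1 = sponsored sentence, 0 = unsponsored sentence.
sentences = [
    "This video is sponsored by ExampleVPN",
    "Thanks to ExampleVPN for sponsoring this video",
    "Today we are going to review a new laptop",
    "The battery lasts about ten hours",
    "Let us look at the keyboard next",
    "The screen is bright and sharp",
]
labels = [1, 1, 0, 0, 0, 0]

idf = fit_idf(sentences)
vector = tfidf_vector(sentences[0], idf)
balanced_sentences, balanced_labels = undersample(sentences, labels)
```

In a fuller pipeline, the sparse vectors would be passed to a gradient-boosting classifier and the balanced sentences to a BERT tokenizer; both steps depend on library choices that the abstract does not specify.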

Published

2025-05-19

How to Cite

Askarbek Aslan. (2025). NATURAL LANGUAGE PROCESSING APPROACH TO DETECTING SPONSORSHIP OF VIDEO CONTENT. Research Retrieval and Academic Letters, (9). Retrieved from https://ojs.scipub.de/index.php/RRAL/article/view/6133