Dynamic Topic Modelling of Online Discussions on the Russian War in Ukraine

Ustyianovych, Taras; Falfushynska, Halina; Fedushko, Solomiia; Siemens, Eduard

Please use this identifier to cite or link to this item: http://dx.doi.org/10.25673/112997

Title:	Dynamic Topic Modelling of Online Discussions on the Russian War in Ukraine
Author(s):	Ustyianovych, Taras; Falfushynska, Halina; Fedushko, Solomiia; Siemens, Eduard
Granting Institution:	Hochschule Anhalt
Issue Date:	2023-11-30
Language:	English
Subjects:	Machine Learning Operations (MLOps); Social Media Discussions; Russian War in Ukraine
Abstract:	The availability of robust end-to-end ML processes plays a crucial role in delivering an accurate and reliable system for real-time text data inference. In this paper, we present an approach to building machine learning operations (MLOps) and an observability application for topic modelling of online discussions in social media, here applied to topics and threads related to the Russian war in Ukraine. Splunk Enterprise, with its knowledge discovery, dashboarding, and alerting capabilities, is the main tool and platform used throughout this research. The dataset comprises 30 GB of social media text from the Russian social network VKontakte, covering January 2022 to May 2023. The main inquiries were text mining and topic modelling, which we performed over the observation period using Python frameworks, mainly gensim for text processing and MLflow for experiment management and logging. The Splunk architecture allowed us to ingest and analyse the results and predictions of ML experiments for dynamic topic modelling, and served as an MLOps solution. The designed set of five dashboards played a crucial role in determining the optimal model hyperparameters (number of topics, a priori belief on the document-topic distribution, total number of corpus passes) and in detecting drift, which occurred almost every two to three weeks depending on the phase of the war. Our application assisted us with text analysis, revealing how events on the battlefield influenced social media discussions and which post attributes contributed to high user engagement. With our setup we were able to find out how antiwar hashtags have been used to promote misleading content that actually supports the war against Ukraine. The analysis of the discussions shows that the usage of adjectives decreased over time after the war started, whereas the usage of nouns and verbs increased. Information distortion has been steadily present in the content, leading to bias and misleading data in social media discussions.
URI:	https://opendata.uni-halle.de//handle/1981185920/114954 http://dx.doi.org/10.25673/112997
Open Access:	Open access publication
License:	(CC BY-SA 4.0) Creative Commons Attribution ShareAlike 4.0
Appears in Collections:	International Conference on Applied Innovations in IT (ICAIIT)

Files in This Item:

File	Description	Size	Format
2_7_ICAIIT_Paper_2023(2)_Ustyianovych_31-1.pdf		1.44 MB	Adobe PDF
