Dynamic Topic Modelling of Online Discussions on the Russian War in Ukraine

Ustyianovych, Taras; Falfushynska, Halina; Fedushko, Solomiia; Siemens, Eduard

Please use this identifier to cite or link to this resource: http://dx.doi.org/10.25673/112997

Full metadata record

DC Element	Value	Language
dc.contributor.author	Ustyianovych, Taras	-
dc.contributor.author	Falfushynska, Halina	-
dc.contributor.author	Fedushko, Solomiia	-
dc.contributor.author	Siemens, Eduard	-
dc.date.accessioned	2024-01-10T09:14:40Z	-
dc.date.available	2024-01-10T09:14:40Z	-
dc.date.issued	2023-11-30	-
dc.identifier.uri	https://opendata.uni-halle.de//handle/1981185920/114954	-
dc.identifier.uri	http://dx.doi.org/10.25673/112997	-
dc.description.abstract	The availability of robust end-to-end ML processes plays a crucial role in delivering an accurate and reliable system for real-time text data inference. In this paper, we present an approach to building machine learning operations (MLOps) and an observability application for topic modelling of online discussions in social media, here applied to topics and threads related to the Russian war in Ukraine. Splunk Enterprise, with its knowledge discovery, dashboarding, and alerting capabilities, is the main tool and platform used throughout this research. The dataset comprises 30 GB of social media text data from the Russian social network VKontakte, covering the period from January 2022 to May 2023. The main inquiries included text mining and topic modelling, which we performed over the observation period using Python frameworks, mainly gensim for text processing and MLflow for experiment management and logging. The Splunk architecture allowed us to ingest and analyse the results and predictions of ML experiments for dynamic topic modelling, and served as an MLOps solution. The designed set of five dashboards played a crucial role in determining the optimal model hyperparameters (number of topics, a-priori belief on document-topic distribution, total number of corpus passes) and in detecting drift, which occurred roughly every two to three weeks depending on the phase of the war. Our application assisted us with text analysis, discovering how events on the battlefield influenced social media discussions and which post attributes contributed to high user engagement. With our setup, we were able to find out how antiwar hashtags were used to promote misleading content that actually supported the war against Ukraine. The analysis of the researched discussions shows a trend in which the usage of adjectives decreased over time since the war started, whereas the usage of nouns and verbs increased. Information distortion has been steadily present in the content, leading to bias and misleading data in social media discussions.	-
dc.language.iso	eng	-
dc.rights.uri	https://creativecommons.org/licenses/by-sa/4.0/	-
dc.subject	Machine Learning Operations (MLOps)	-
dc.subject	Social Media Discussions	-
dc.subject	Russian War in Ukraine	-
dc.subject.ddc	006.31	-
dc.title	Dynamic Topic Modelling of Online Discussions on the Russian War in Ukraine	-
local.versionType	publishedVersion	-
local.publisher.universityOrInstitution	Hochschule Anhalt	-
local.openaccess	true	-
dc.identifier.ppn	1873208146	-
cbs.publication.displayform	2023	-
local.bibliographicCitation.year	2023	-
cbs.sru.importDate	2024-01-10T09:13:15Z	-
local.bibliographicCitation	Contained in: Proceedings of the 11th International Conference on Applied Innovations in IT - Köthen, Germany : Edition Hochschule Anhalt, 2023	-
local.accessrights.dnb	free	-
Appears in Collections:	International Conference on Applied Innovations in IT (ICAIIT)
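
The abstract reports that drift was detected every few weeks but does not say how. As an illustration only, and not the authors' method, the following minimal sketch flags drift when the word distribution of one time window diverges from the previous window by more than a threshold, measured with the Jensen-Shannon divergence. The function names, the threshold of 0.1, and the toy posts are all assumptions made for this example.

```python
import math
from collections import Counter


def word_distribution(docs, vocabulary):
    """Relative word frequencies over a fixed vocabulary, with add-one smoothing."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    total = sum(counts[w] + 1 for w in vocabulary)
    return [(counts[w] + 1) / total for w in vocabulary]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical distributions, at most 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def detect_drift(windows, threshold=0.1):
    """Return indices of windows whose word usage diverges from the previous window.

    The threshold is an assumed value for this sketch, not one taken from the paper.
    """
    vocabulary = sorted(
        {word for docs in windows for doc in docs for word in doc.lower().split()}
    )
    dists = [word_distribution(docs, vocabulary) for docs in windows]
    return [
        i
        for i in range(1, len(dists))
        if js_divergence(dists[i - 1], dists[i]) > threshold
    ]


# Made-up posts grouped into three consecutive time windows (hypothetical data).
windows = [
    ["peace talks news", "news about talks"],
    ["talks news peace", "peace news talks about"],
    ["sanctions prices economy", "economy prices sanctions rising"],
]
drifted = detect_drift(windows)
print(drifted)
```

In this toy run only the third window is flagged, because its vocabulary shifts away from the first two. A real pipeline would more likely compare topic distributions produced by the model rather than raw word counts, and would tune the threshold on held-out data.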

Files in This Item:

File	Description	Size	Format
2_7_ICAIIT_Paper_2023(2)_Ustyianovych_31-1.pdf		1.44 MB	Adobe PDF
