Deep learning-based sentiment classification in Amharic using multi-lingual datasets

Senait Gebremichael Tesfagergish; Robertas Damaševičius; Jurgita Kapočiūtė-Dzikienė

doi:10.2298/CSIS230115042T

Title	Deep learning-based sentiment classification in Amharic using multi-lingual datasets
Authors	Gebremichael Tesfagergish, Senait ; Damaševičius, Robertas ; Kapočiūtė-Dzikienė, Jurgita
DOI	10.2298/CSIS230115042T
Is Part of	Computer science and information systems. Novi Sad : ComSIS consortium. 2023, vol. 20, iss. 4, p. 1459-1481. ISSN 1820-0214. eISSN 2406-1018
Keywords [eng]	sentiment analysis ; monolingual vs. cross-lingual approaches ; deep learning ; sentence transformers ; Amharic
Abstract [eng]	The analysis of emotions expressed in natural language text, also known as sentiment analysis, is a key application of natural language processing (NLP). It involves assigning a positive, negative, or sometimes neutral value to opinions expressed in various contexts such as social media, news, and blogs. Despite its importance, sentiment analysis for under-researched languages like Amharic has received little attention in NLP because the resources required to train such methods are scarce. This paper examines various deep learning methods such as CNN, LSTM, FFNN, BiLSTM, and transformers, as well as memory-based methods like cosine similarity, to perform sentiment classification using word and sentence embedding techniques. The research includes training and comparing mono-lingual and cross-lingual models on Amharic social media messages from Twitter. The study concludes that the lack of training data in the target language is not a significant issue, since 1) training data can be machine translated from other languages as a data augmentation technique [33], or 2) cross-lingual models can capture the semantics of the target language even when trained on another language (e.g., English). Finally, the FFNN classifier, which combined the sentence transformer and the cosine similarity method, proved to be the best option for both the 3-class and 2-class sentiment classification tasks, achieving 62.0% and 82.2% accuracy, respectively.
Published	Novi Sad : ComSIS consortium
Type	Journal article
Language	English
Publication date	2023
CC license
