Title Twenty years of machine-learning-based text classification: a systematic review
Authors Palanivinayagam, Ashokkumar ; El-Bayeh, Claude Ziad ; Damaševičius, Robertas
DOI 10.3390/a16050236
Is Part of Algorithms. Basel : MDPI. 2023, vol. 16, iss. 5, art. no. 236, p. 1-28. ISSN 1999-4893
Keywords [eng] machine learning ; natural language processing ; rating summarization ; sentiment analysis ; spam detection ; text classification
Abstract [eng] Machine-learning-based text classification is one of the leading research areas and has a wide range of applications, including spam detection, hate speech identification, review and rating summarization, sentiment analysis, and topic modelling. Machine-learning-based studies in this area differ in the datasets, training methods, performance evaluation, and comparison methods they use. In this paper, we survey 224 papers published between 2003 and 2022 that employed machine learning for text classification. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement is used as the guideline for the systematic review process. The differences in the literature are analyzed in terms of six aspects: datasets, machine learning models, best accuracy, performance evaluation metrics, training and testing splitting methods, and comparisons among machine learning models. Furthermore, we highlight the limitations and research gaps in the literature. Although the works included in the survey perform well at text classification, improvement is still required in many areas. We believe that this survey will be useful for researchers in the field of text classification.
Published Basel : MDPI
Type Journal article
Language English
Publication date 2023