Machine Learning Applications for Detection of Malicious Email Addresses and Fake Usernames

Edvin Saveljev

Title	Machine Learning Applications for Detection of Malicious Email Addresses and Fake Usernames
Translation of Title	Mašininio mokymosi taikymai kenkėjiškų elektroninių pašto adresų ir netikrų vartotojų vardų aptikimui.
Authors	Saveljev, Edvin
Pages	85
Keywords [eng]	machine learning ; feature engineering ; data mining ; fraud prevention ; email address
Abstract [eng]	Identifying malicious users in the digital world is a particularly relevant topic, as global digitization is driving the transition not only of consumers and transactions but also of criminal activities from the physical to the digital environment. Email addresses and usernames play a key role in the emerging digital ecosystem, as they are used for user identification and personalization. Given these trends, a need was identified to find new sources of information about users in order to minimize malicious activity. In this work, the scientific literature was analyzed, the characteristics of malicious users were identified, and feature engineering was performed based on these characteristics. Two data sets were collected for the study, one of email addresses and one of usernames. Classification algorithms were then trained to identify malicious users. Separate models were developed for each selected set of features to assess their significance, and classification models using all attributes were also developed. The results revealed that the classification of email addresses may be more accurate than the classification of usernames. The obtained classification results can be applied as input to other methods of malicious user detection to increase the accuracy of identification.
Dissertation Institution	Kauno technologijos universitetas.
Type	Master thesis
Language	English
Publication date	2021
