Title Hibridinis sukčiavimo SMS žinučių aptikimo metodas
Translation of Title A hybrid method for detecting fraudulent SMS messages.
Authors Rutkauskas, Ignas
Pages 75
Keywords [eng] smishing ; phishing via SMS ; hybrid method ; machine learning ; cybersecurity
Abstract [eng] This master’s thesis examines the detection of fraudulent SMS messages and presents a hybrid classification method adapted for Lithuanian-language text messages. The research is motivated by the rapidly increasing number of SMS fraud cases and the limited availability of automated detection solutions designed specifically for the Lithuanian language. Fraudulent SMS messages pose a significant threat to users’ information security, so detecting them effectively has become an important cybersecurity task. The aim of the thesis is to develop an efficient and practically applicable method that automatically identifies fraudulent SMS messages in Lithuanian. The research included a review of the scientific literature covering the most commonly used SMS fraud detection methods and their advantages and limitations. The main characteristics of fraudulent messages were also examined, including the use of links, the creation of urgency, deceptive keywords, and atypical text structures. A hybrid method combining rule-based analysis with a machine learning model was developed. The machine learning component was a logistic regression classifier using a TF-IDF text representation with character n-grams. The rule-based component identified specific fraud-related indicators and evaluated them with weights. The final classification decision was obtained by combining the outputs of both components. The experimental evaluation compared several machine learning models: Logistic Regression, Naive Bayes, Random Forest, Extra Trees, and Bi-LSTM. Among the individual models, Logistic Regression achieved the best results, with an F1 score of 89.32%. The proposed hybrid method achieved 92% accuracy, 93.75% precision, 90% recall, and an F1 score of 91.84%.
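As a consistency check on the figures reported for the hybrid method, the F1 score can be recomputed from the stated precision and recall. This assumes the abstract uses the standard harmonic-mean definition of F1; the symbols P and R are introduced here only for the check.

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.9375 \times 0.90}{0.9375 + 0.90}
    = \frac{1.6875}{1.8375}
    \approx 0.9184
```

The result matches the reported 91.84% to two decimal places.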
Compared to the method based solely on the machine learning model, the hybrid approach reduced the number of false positive classifications, which makes it more reliable in practical use. The study also examined the impact of classification thresholds on model performance and evaluated the sensitivity of the different methods to false positive and false negative classifications. The results showed that integrating rule-based analysis improves overall classification quality, reduces the number of false alerts, and gives a better balance between detecting fraudulent messages and protecting legitimate messages from incorrect classification. The developed method is adapted to the characteristics of the Lithuanian language and can serve as a basis for practical SMS filtering systems that protect users from fraudulent messages.
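The abstract does not say how the rule-based output and the machine learning output are combined, nor which indicators and weights are used. The following is a minimal sketch under stated assumptions: a weighted average of the two scores, a fixed decision threshold, and three illustrative rules. The names `RULES`, `rule_score`, `hybrid_score`, and `is_fraud`, the weights, the Lithuanian keywords, and the parameters `alpha` and `threshold` are all hypothetical. The classifier probability is passed in as a plain number rather than computed here.

```python
import re

# Hypothetical indicators and weights; the thesis's actual rules are not
# given in the abstract. Each entry is (name, pattern, weight).
RULES = [
    ("link", re.compile(r"https?://|www\.", re.IGNORECASE), 0.4),
    ("urgency", re.compile(r"\b(skubiai|nedelsiant)\b", re.IGNORECASE), 0.3),
    ("keyword", re.compile(r"\b(patvirtinkite|laimėjote)\b", re.IGNORECASE), 0.3),
]


def rule_score(text: str) -> float:
    """Sum the weights of all rules that match, capped at 1.0."""
    total = sum(weight for _, pattern, weight in RULES if pattern.search(text))
    return min(1.0, total)


def hybrid_score(ml_prob: float, text: str, alpha: float = 0.5) -> float:
    """Weighted average of the classifier probability and the rule score.

    alpha is an assumed mixing weight: 1.0 means classifier only,
    0.0 means rules only.
    """
    return alpha * ml_prob + (1.0 - alpha) * rule_score(text)


def is_fraud(ml_prob: float, text: str, alpha: float = 0.5,
             threshold: float = 0.5) -> bool:
    """Flag the message when the combined score reaches the threshold."""
    return hybrid_score(ml_prob, text, alpha) >= threshold


if __name__ == "__main__":
    suspicious = "Skubiai patvirtinkite paskyrą: http://example.com"
    benign = "Labas, susitinkam rytoj 18 val."
    # The classifier probabilities below are made-up inputs for illustration.
    print(is_fraud(0.60, suspicious))
    print(is_fraud(0.55, benign))
```

In this sketch a legitimate message that the classifier alone would narrowly flag (probability 0.55 against a 0.5 threshold) is not flagged once the rule score of zero is averaged in, which illustrates one way a rule component could reduce false positives. Raising or lowering `threshold` trades false positives against false negatives.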
Dissertation Institution Kauno technologijos universitetas.
Type Master’s thesis
Language Lithuanian
Publication date 2026