Abstract [eng] |
The first part of the paper analyzes the methods used in the scientific literature to assess fraud risk and sets out the main factors that determine it. The second part presents the publicly available data collected on loans sold by UAB "Bendras finansavimas" on the secondary market and defines the handling of the data sample and the fraud risk assessment methods used in the study: Random Forests, Support Vector Machines, and Artificial Neural Networks. In the third part, the data are processed using principal component analysis. Random Forest, Support Vector Machine, and Artificial Neural Network models were built on three data samples: the primary dataset (without principal component analysis); a sample of eight extracted principal components; and a sample of eight principal components together with the qualitative variables. The study found that the Random Forest model built on the sample of eight principal components and qualitative variables classified the object of study, the occurrence of fraud, most accurately. The main factors determining fraud are bad credit history, the debt service-to-income ratio, the loan amount, the client's assessment of the risk of fraud, and age. |