Abstract [eng]
Classification is widely used to analyse data of various kinds. When a dataset contains many observations and features, classification problems become harder to solve, so the effective selection of significant features is a relevant task. The aim of this master's thesis is to propose a methodology and software tools that select a smaller subset of the original features without significantly diminishing classification quality measures. When solving practical problems, the time required to run machine learning classification algorithms is an important factor. To speed up the solution of classification problems, genetic algorithm, simulated annealing, and recursive feature elimination methods are combined with a support vector classifier. The results of these implementations are compared with the internal feature selection of random forest using several classification quality metrics. The proposed methodology was implemented in the R, Python, and SAS programming languages.
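As a minimal illustration of one of the combinations named above, the sketch below pairs recursive feature elimination with a linear support vector classifier in Python using scikit-learn. The dataset, the number of retained features, and the selector settings are illustrative assumptions, not the configuration used in the thesis.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Example dataset; the thesis data and feature counts are not reproduced here.
X, y = load_breast_cancer(return_X_y=True)

# A linear-kernel SVC exposes coef_, which RFE uses to rank features.
svc = SVC(kernel="linear")

# Keep the 10 highest-ranked features, removing one feature per elimination step
# (both values are illustrative choices).
selector = RFE(estimator=svc, n_features_to_select=10, step=1)
X_reduced = selector.fit_transform(X, y)

# Compare classification quality on the full and the reduced feature sets.
print("accuracy, all features:     ", cross_val_score(svc, X, y, cv=5).mean())
print("accuracy, selected features:", cross_val_score(svc, X_reduced, y, cv=5).mean())
```

The point of such a comparison is that the reduced feature subset should keep the quality metric close to that of the full set while lowering the cost of training and prediction.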