Abstract [eng] |
Sentiment classification in customer reviews is relevant for companies that want to study their customers' experience and satisfaction. Automatic sentiment classification enables companies to track customer satisfaction in real time, act in advance and gain a competitive advantage. This thesis considers and compares vector embedding methods: Paragraph Vector with Distributed Bag of Words (PV-DBOW), trained with document vectors only and with both document and word vectors, Latent Semantic Indexing, Random Projections and Sent2Vec. The input datasets are customer reviews taken from IMDb, TripAdvisor and Amazon. Machine learning and dictionary-based algorithms were applied for classification: Logistic Regression, Random Forests and Multilayer Perceptron as machine learning methods, and SentimentGI, SentimentHE, SentimentLM, SentimentQDAP and SenticNet4 as dictionary-based methods. The results of the six best models were combined and a random forest classifier was applied to them. The resulting hybrid model achieved higher classification AUC scores than any of the individual methods before combination, on all datasets. |
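The comparison above rests on AUC scores. As a minimal sketch of what that metric measures, the snippet below computes AUC directly from its rank-based definition: the probability that a randomly chosen positive review receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `auc_score`, the labels and the two score lists are hypothetical illustrations, not data or code from the thesis.

```python
def auc_score(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs where the
    positive example has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive review, 0 = negative review.
labels = [1, 0, 1, 1, 0, 0]
model_a = [0.9, 0.4, 0.35, 0.8, 0.2, 0.5]  # scores from one illustrative model
model_b = [0.7, 0.3, 0.6, 0.9, 0.1, 0.2]   # scores from another illustrative model

print(auc_score(labels, model_a))
print(auc_score(labels, model_b))
```

Because AUC depends only on how the scores rank the reviews, it can be compared across models whose outputs are on different scales, which is presumably why it suits a comparison between dictionary-based and machine learning classifiers.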