| Abstract [eng] |
Relevance of the topic. Household consumption expenditure is an important expression of economic behavior, reflecting not only the economic conditions of households but also their living standards, consumption priorities, and social inequalities. Since households differ in their social, demographic, and economic characteristics, substantial variation in consumption structures may be observed across different population groups. Existing studies mainly focus on the effects of individual determinants on consumption and pay less attention to the data-driven identification of social groups and to detailed analysis of consumption structures within such groups. Object of the study – households and their consumption expenditure structure. Aim of the study – to identify different social groups of households based on social, demographic, and economic household characteristics and to evaluate differences in their consumption expenditure structures. Methods – scientific literature review and synthesis; data preparation and feature selection methods; mixed-type data clustering techniques; cluster quality assessment methods; statistical and entropy-based approaches for comparing consumption structures; formulation of conclusions. To achieve the research aim, the study was divided into three main parts: literature review, methodological framework, and empirical analysis. The literature review examined the concept of social groups, the economic significance of household consumption expenditure, determinants of consumption differences, and the applicability of mixed-type data clustering methods for social group identification. The methodological section described data preparation procedures, feature selection techniques, the methodology for social group identification, and methods used for the analysis of consumption expenditure structures. 
The empirical section involved household data analysis, assessment of feature informativeness, optimization of Gower distance matrix weights, application of k-prototypes, k-medoids, and hierarchical clustering methods, as well as the use of statistical and entropy-based approaches to investigate differences in consumption structures. The findings demonstrated that six economically interpretable social groups could be identified from the social, demographic, and economic characteristics in the 2021 Lithuanian Household Budget Survey data by applying the k-medoids (PAM) clustering algorithm to an optimized Gower distance matrix. Removing redundant variables and optimizing feature weights improved clustering quality, increasing the average silhouette coefficient from 0.24 to 0.62. The analysis of household consumption expenditure revealed that the social groups differed in how expenditure was distributed internally. The level of internal uncertainty varied across expenditure categories within groups (NH = 0.069–0.796); however, mutual information values were low in most groups (mostly NMI < 0.18), indicating only weak intercategory relationships. |
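The two quantities at the center of the empirical section can be illustrated with a minimal Python sketch. It assumes that NH denotes Shannon entropy of expenditure shares divided by the logarithm of the number of categories, and that the weighted Gower distance is the weight-averaged per-feature dissimilarity (range-scaled absolute difference for numeric features, simple mismatch for categorical ones). The function names, feature names, ranges, and weights below are hypothetical and are not taken from the study; the weight optimization procedure itself is not reproduced here.

```python
import math


def normalized_entropy(amounts):
    """Shannon entropy of expenditure shares, divided by log(K).

    amounts: non-negative spending per category (K categories).
    Returns a value in [0, 1]: 0 when all spending falls in one
    category, 1 when spending is spread evenly over all K.
    """
    total = sum(amounts)
    k = len(amounts)
    if k < 2 or total <= 0:
        return 0.0
    # Zero-spend categories contribute nothing (p * log p -> 0).
    shares = [a / total for a in amounts if a > 0]
    h = -sum(p * math.log(p) for p in shares)
    return h / math.log(k)


def gower_distance(a, b, numeric_ranges, weights):
    """Weighted Gower distance between two mixed-type records.

    a, b: dicts mapping feature name -> value.
    numeric_ranges: dict mapping numeric feature name -> (max - min)
        observed in the data; features not listed are categorical.
    weights: dict mapping feature name -> non-negative weight.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for name, w in weights.items():
        if name in numeric_ranges:
            r = numeric_ranges[name]
            d = abs(a[name] - b[name]) / r if r > 0 else 0.0
        else:
            d = 0.0 if a[name] == b[name] else 1.0
        weighted_sum += w * d
        weight_total += w
    return weighted_sum / weight_total if weight_total > 0 else 0.0


# Hypothetical households and feature weights, for illustration only.
household_a = {"income": 1000, "size": 2, "tenure": "owner"}
household_b = {"income": 3000, "size": 2, "tenure": "renter"}
ranges = {"income": 4000, "size": 5}
feature_weights = {"income": 1.0, "size": 1.0, "tenure": 2.0}

distance = gower_distance(household_a, household_b, ranges, feature_weights)
even_spending = normalized_entropy([25, 25, 25, 25])
concentrated_spending = normalized_entropy([100, 0, 0, 0])
```

A full pairwise matrix of such distances could then be passed to a k-medoids implementation and scored with a silhouette coefficient computed on the same precomputed distances, which is the general shape of the pipeline the abstract describes.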