Abstract [eng] |
In this master's thesis, an automatic speech recognition system for the Lithuanian speech corpus Liepa is created and investigated using the Kaldi speech recognition toolkit and deep neural networks. The thesis reviews the operation of automatic speech recognition systems, the application of deep neural networks in such systems, the functionality of the Kaldi software package, the Lithuanian speech corpus Liepa, and related research. The structure and methodology of a hybrid automatic speech recognition system consisting of hidden Markov models, Gaussian mixture models, and deep neural networks are presented. The dependence of the accuracy of the deep neural network model on the parameters of the hidden Markov models and Gaussian mixture models is examined. Eighteen different neural network architectures, built from combinations of time delay neural networks, long short-term memory neural networks, and bidirectional long short-term memory neural networks, are tested. The training parameters of the selected neural network architecture are optimized and cross-validated. The obtained results and conclusions are presented. |