Title Sintetinės ir manipuliuotos kalbos atpažinimo garso įrašuose tyrimas
Translation of Title Research on the detection of synthetic and manipulated speech in audio recordings
Authors Butnorius, Modestas
Pages 95
Keywords [eng] spoof detection; speech generation; adversarial attacks; noise reduction; voice activity detection
Abstract [eng] This thesis investigates technologies for generating and detecting spoofed voice recordings. In the experiments, various speech generation methods were examined, and the recordings they produced were evaluated with an existing real-world audio spoofing detection solution. Attention was devoted to adversarial attacks, which have recently become a highly relevant problem in audio spoofing detection. Several adversarial noise generation techniques and several noise reduction strategies were evaluated. In addition, voice activity detection methods for extracting speech segments from audio recordings were analyzed. Finally, various audio classification architectures were investigated, including CNN, RNN, and transformer-based models. Based on the experimental results, a combined audio spoofing detection solution was developed, consisting of three main components: noise reduction, speech segment extraction, and classification. The classification component is an ensemble of ten different artificial intelligence models trained on different audio features, whose predictions are merged into a single final decision. The proposed solution detects spoofed audio recordings effectively even in the presence of adversarial noise, and it achieves the best DCF (detection cost function) result among the solutions presented in a challenge designed specifically for this task.
Dissertation Institution Kauno technologijos universitetas.
Type Master thesis
Language Lithuanian
Publication date 2026