Balso suspaudimo algoritmų realizavimo įterptinėje sistemoje galimybių tyrimas

Ričardas Garmus

Title	Balso suspaudimo algoritmų realizavimo įterptinėje sistemoje galimybių tyrimas
Translation of Title	A feasibility study of the implementation of vocoders in an embedded system.
Authors	Garmus, Ričardas
Pages	45
Keywords [eng]	speech signal processing ; speech coding ; quality estimation ; embedded systems
Abstract [eng]	This master's thesis investigates the feasibility of implementing high-compression voice coding algorithms (CELP, MELP, MELPe, Codec2) in embedded systems and evaluates whether they can run in real time. The work reviews the structure and operating principles of voice compression methods. Using high-compression vocoder libraries, a model was developed in the GNURadio environment in which the algorithms can be tested in real time. The selected Codec2 1200 bps voice compression algorithm was adapted to the Lithuanian language using the dedicated LIEPA speech recording libraries. Subjective evaluations of speech quality and intelligibility were performed to compare the voice compression algorithms on Lithuanian audio recordings. The speed and signal latency of the algorithms were evaluated on three embedded-system processors: the medium-speed ARM Cortex-M4 (180 MHz) and ARM Cortex-M7, and the high-speed ARM Cortex-A8 (1 GHz). In the subjective evaluations, the best speech quality was obtained with the MELPe 1200 bps algorithm (3.62 ± 0.99 points) and the worst with Codec2 1200 bps (2.98 ± 1.12 points). The best speech intelligibility was obtained with the Codec2 3200 bps and MELPe 1200 bps algorithms (3.6 ± 0.93 and 3.58 ± 0.93 points, respectively). Objective PESQ estimates were up to 1 point lower than the subjective scores. The Codec2 1200 bps algorithm trained on Lithuanian audio recordings was rated better in both subjective and objective tests. The results show that all of the evaluated voice compression algorithms can run in real time on each embedded system, except for MELPe, whose signal delay on all systems exceeds the 150 ms limit recommended by ITU-T G.114. On all processors, encoding is fastest with the Codec2 algorithms and slowest with MELPe. Decoding is fastest with the CELP compression method.
Dissertation Institution	Kauno technologijos universitetas.
Type	Master's thesis
Language	Lithuanian
Publication date	2020
