Title Mašininio vertimo ir didžiųjų kalbos modelių našumas verčiant informacinių technologijų srities tekstą iš anglų kalbos į lietuvių kalbą /
Translation of Title Performance of machine translation and large language models in translating information technology text from English to Lithuanian.
Authors Kubiliūtė, Vilija
Pages 154
Keywords [eng] artificial intelligence ; machine translation ; large language models ; BLEU ; MQM
Abstract [eng] The title of the Master's thesis is Performance of Machine Translation and Large Language Models in Translating Information Technology Text from English to Lithuanian. The relevance of the analysis rests on the abundance of academic work on the subject by foreign researchers and the scarcity of such work by Lithuanian researchers (Milisevičiūtė, 2009; Petkevičiūtė, Tamulynas, 2011; Miltakienė, 2021; Valančiauskienė, 2023). The novelty of the thesis lies in its topic and research direction, which focus on the rapid development of artificial intelligence technologies, the changing contemporary translation paradigm, and the changing needs and habits of digital text users. The demand for machine translation (MT) is constantly growing, yet current translation methods and software are still imperfect. Moreover, although large language models (LLMs) are experiencing a breakthrough, there is no research in this area in Lithuania. The object of the thesis is the translation capabilities of the MT system DeepL and the chatbot ChatGPT. The aim of the thesis is to carry out a comparative analysis of translations produced by an MT system and by a chatbot used as a translation tool. The objectives of the thesis are: 1. to analyse and review the scientific literature on machine translation and on the automatic translation functionality of generative artificial intelligence tools; 2. to analyse the translations generated by the machine translation system DeepL and the chatbot ChatGPT using MT quality assessment methods; 3. to evaluate and compare the quality of the translations generated by the different translation tools according to accuracy and fluency criteria. The research leads to the following conclusions: 1.
The review of the scientific literature on the chosen topic reveals that artificial intelligence plays a significant role in today's digital society. With the development of language technologies and the increasing capacity of computers, the quality of the translations generated by automatic translation systems is improving. With the advent of new natural language processing methods, the quality of machine translation has moved considerably closer to that of human translation. The development of native language technologies depends on access to as much relevant linguistic data and as many resources as possible. 2. The analysis of the translation quality assessment results shows that the higher performance of the neural MT system DeepL was confirmed by both assessment methods: its translations were rated better both by the automatic metric BLEU and in the manual Multidimensional Quality Metrics (MQM) evaluation. 3. The analysis of the translation errors made by all three systems revealed that each is prone to certain errors. The MT system DeepL generated the highest number of correct sentences, the chatbot ChatGPT-4 the second highest, and the chatbot ChatGPT-3.5 the lowest. When the systems were assessed against the accuracy criterion, terminology and mistranslation errors predominated. The neural machine translation system DeepL was the best performer in terms of fluency. The results of this particular study suggest that the LLM-based chatbot ChatGPT cannot yet serve as an alternative translation tool to machine translation systems such as DeepL and other similar MT systems.
However, looking ahead, the rapid development of these technologies gives reason to expect that in the near future both MT systems and LLMs will become indispensable tools in the translator's workflow. The thesis is structured as follows: introduction, theoretical review, methodology and empirical parts, conclusions, list of references and information sources, and appendices.
Dissertation Institution Kauno technologijos universitetas.
Type Master's thesis
Language Lithuanian
Publication date 2024