Title Evolution of nucleotide sequences over passing time
Authors Jablonskaitė, Kamilija ; Ruzgas, Tomas
DOI 10.15388/DAMSS.13.2022
ISBN 9786090707944
eISBN 9786090707951
Is Part of DAMSS 2022: 13th conference on data analysis methods for software systems, Druskininkai, Lithuania, December 1–3, 2022 / Lithuanian computer society, Vilnius university Institute of data science and digital technologies, Lithuanian academy of sciences. Vilnius : Vilnius university press, 2022. p. 34-35. ISBN 9786090707944. eISBN 9786090707951
Abstract [eng] All DNA sequences are built from four types of nucleotides, which together hold all the genetic information inherited by an organism. However, DNA can mutate while replicating itself: nucleotides can be lost and/or fragments of the original sequence can be gained, so the initial DNA sequence can differ from its duplicate. Accordingly, the evolution of a DNA sequence over time can be modelled as a discrete-time homogeneous Markov chain, while sequence evolution in space can be described as the addition of new elements to the sequence. Theoretically, evolution in space simulates DNA sequence formation. In the stationary case, the distribution of a random sequence does not depend on the fixed time moment. It is hard to find any data regarding DNA sequence evolution in space; usually, only one sequence can be found. It is possible to reconstruct the transition matrix, or properties of that matrix, from the stationary distribution of the Markov chain describing the evolution over time, yet this problem is ill-posed. In general, the problem has many solutions, and a particular one can be found only by using additional assumptions and regularisation methods. However, a solution can be found more easily from the local balance equation if the evolution of the DNA sequence is reversible and the transition matrix depends on only a relatively small number of unknown parameters. The mathematical model of the described genetic sequence should not contradict already known facts of genetics, and it relies on the following biological assumptions:
• Introns (non-coding DNA fragments) do not directly influence the survival of an individual or of a species in general. This means that the regularities of inanimate nature have more influence on them than natural selection does. The fragments in question are not as important as the fragments of the sequence that hold information, so it is appropriate to search for non-informative sequences in introns.
• Intron evolution over time has a simple structure, and the evolution itself is affected only by local random factors. For example, it is assumed that DNA fragments are neither inserted nor deleted and that nucleotides are simply replaced by one another with a probability that depends only on their nearest neighbours (simple local evolution). While this assumption is intuitive, it is not necessarily true, as it has not been fully proved.
• It is assumed that the evolution of the DNA sequence in question over time is stable. If the process is not stable, any part of the DNA sequence can become informative. The stationary distribution of nucleotides residing in introns is assumed to be non-informative in the case of simple local evolution.
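The Markov chain reasoning in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: it treats a single site with four states and ignores the nearest-neighbour dependence, and the stationary distribution `PI`, the exchange parameters, and all function names are hypothetical values chosen for illustration. The sketch builds a transition matrix from a small set of symmetric parameters so that the local balance equation pi[i] * P[i][j] = pi[j] * P[j][i] holds, checks that local balance implies stationarity, and shows that two different matrices share the same stationary distribution, which is why recovering the matrix from that distribution alone is ill-posed.

```python
NUCLEOTIDES = "ACGT"

# Hypothetical stationary distribution over A, C, G, T (not from the paper).
PI = [0.3, 0.2, 0.2, 0.3]


def exchange_params(base, transition):
    """Symmetric parameters: one rate for A<->G and C<->T, another for the rest."""
    s = [[base] * 4 for _ in range(4)]
    for a, b in (("A", "G"), ("C", "T")):
        i, j = NUCLEOTIDES.index(a), NUCLEOTIDES.index(b)
        s[i][j] = s[j][i] = transition
    return s


def reversible_chain(pi, s):
    """Set P[i][j] = s[i][j] * pi[j] for i != j; the diagonal completes each row."""
    n = len(pi)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                P[i][j] = s[i][j] * pi[j]
        P[i][i] = 1.0 - sum(P[i])  # diagonal is still 0.0 when summed
    return P


def step(dist, P):
    """One time step of the chain: dist -> dist * P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def evolve(dist, P, steps):
    """Apply `steps` time steps to a starting distribution."""
    for _ in range(steps):
        dist = step(dist, P)
    return dist


def satisfies_local_balance(pi, P, tol=1e-12):
    n = len(pi)
    return all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) <= tol
               for i in range(n) for j in range(n))


def is_stationary(pi, P, tol=1e-12):
    return all(abs(a - b) <= tol for a, b in zip(step(pi, P), pi))


# Two different parameter choices give two different matrices
# with the same stationary distribution PI.
P_A = reversible_chain(PI, exchange_params(0.5, 0.9))
P_B = reversible_chain(PI, exchange_params(0.2, 0.4))

if __name__ == "__main__":
    print("local balance holds:", satisfies_local_balance(PI, P_A))
    print("PI stationary for P_A:", is_stationary(PI, P_A))
    print("PI stationary for P_B:", is_stationary(PI, P_B))
    print("matrices differ:", P_A != P_B)
    print("after 200 steps from pure A:", evolve([1.0, 0.0, 0.0, 0.0], P_A, 200))
```

Restricting the matrix to this two-parameter reversible family is one example of the kind of additional assumption the abstract mentions: once the family is fixed, the stationary distribution constrains the matrix up to those few parameters instead of leaving all twelve off-diagonal entries free.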
Published Vilnius : Vilnius university press, 2022
Type Conference paper
Language English
Publication date 2022