Thesis Details

Detection of repetitive sequences in genomes

Ph.D. Thesis
Student: Puterová Janka
Academic Year: 2021/2022
Supervisor: Zendulka Jaroslav, doc. Ing., CSc.
Czech title
Detekce repetitivních sekvencí v genomech

Repetitive sequences can make up a significant part of a genome, in some cases more than 80%, yet scientists have often overlooked them. Today we know that repeats serve various functions in genomes and fall into two main groups: interspersed repeats and tandem repeats. This work aimed to develop bioinformatics tools that detect repetitive sequences either directly from sequencing data or from assembled genomes. The introductory part gives an insight into the topic and an overview of the repeat types occurring in genomes. The work then reviews existing approaches and tools for detecting repeats in assembled sequences. The main contribution in this area is the digIS tool, which detects insertion sequences, the most abundant interspersed repeats in prokaryotes. digIS is based on profile hidden Markov models constructed for the catalytic domains of transposases, which are the most conserved part of insertion sequences and whose secondary structure is retained within a family. Subsequently, the work gives an overview of sequencing technologies and discusses existing methods for detecting repeats directly from sequencing data, without prior genome assembly. A novel approach for the detailed analysis of tandem repeats is presented; it extends the primary analysis of RepeatExplorer, which detects and characterizes repeats directly from sequencing data. The work further discusses applications of repeat detection in biological research, particularly in comparative repeatome studies and in the study of sex chromosome evolution. Finally, the work summarizes the research results in the form of four articles published in international journals, whose full texts are available in the appendices, and closes with a general summary and possible directions for future research.


Keywords
transposons, transposable elements, tandem repeats, satellite DNA, repetitive elements, repeatome, repeat detection, profile hidden Markov models, comparative analysis, sex chromosomes, genome evolution

Degree Programme