3932. Assessment of the quality of de novo assembled genome
Invited abstract in session TD-20: Advancements in AI and Genomics: Bridging Technology and Biology for Future Healthcare Solutions, stream Computational Biology, Bioinformatics and Medicine.
Tuesday, 14:30-16:00
Room: 45 (building: 116)
Authors (first author is the speaker)
1. Aleksandra Swiercz, Institute of Computing Science, Poznan University of Technology
2. Alicja Dzik, Poznan University of Technology
3. Piotr Lukasiak, Institute of Computing Science, Poznan University of Technology
4. Jacek Blazewicz, Institute of Computing Science, Poznan University of Technology
Abstract
The first draft of the human genome was published more than two decades ago. It was incomplete, especially in the highly repetitive centromeric and telomeric regions. With the development of third-generation sequencing technologies, it has become possible to fill most of the gaps in the reference genome (T2T consortium). Long reads can span large portions of the genome, but they come with higher error rates. De novo sequencing projects therefore often combine several sequencing technologies: long reads to obtain long DNA fragments, and short reads to improve the sequence quality.
In the Genomic Map of Poland project, we used a pipeline to construct a diploid human genome de novo from a trio: mother, father and child. The pipeline combined several technologies: short reads, PacBio HiFi, Hi-C and ultra-long Nanopore reads. The resulting chromosome-wide scaffolds were compared to the reference genomes GRCh38 and CHM13. We then assessed the quality of our reference genome assembly by analyzing the consistency of k-mers appearing in the genome sequence and in the short and long reads. We also checked whether k-mers specific to only one parent (mother or father) occur only in the chromosome copy inherited from that parent (each copy of a chromosome carries small sequence differences specific to the individual). The k-mer features we examined indicate the high quality of the assembled genome.
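The two k-mer checks described above can be illustrated with a minimal Python sketch. This is not the project's pipeline: the function names (`read_support`, `parent_specific`, `haplotype_purity`), the use of in-memory Python sets, the use of canonical k-mers, and the restriction of parent-specific k-mers to those also seen in the child's reads are all assumptions made for illustration. At human-genome scale, a dedicated k-mer counting tool would be needed instead of plain sets.

```python
# Illustrative sketch only; names and design choices are assumptions,
# not the method used in the Genomic Map of Poland project.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def kmer_set(sequences, k):
    """Collect the canonical k-mers of all sequences, skipping k-mers with non-ACGT symbols."""
    kmers = set()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                kmers.add(canonical(kmer))
    return kmers


def read_support(assembly, reads, k):
    """Fraction of distinct assembly k-mers that also occur in the reads.

    Assembly k-mers absent from the reads hint at assembly errors.
    """
    asm = kmer_set([assembly], k)
    seen = kmer_set(reads, k)
    return len(asm & seen) / len(asm) if asm else 0.0


def parent_specific(mother_reads, father_reads, child_reads, k):
    """K-mers found in exactly one parent and inherited by the child."""
    m = kmer_set(mother_reads, k)
    f = kmer_set(father_reads, k)
    c = kmer_set(child_reads, k)
    return (m - f) & c, (f - m) & c


def haplotype_purity(haplotype, mat_only, pat_only, k):
    """Count maternal-only and paternal-only k-mers in one haplotype.

    A well-phased haplotype should contain k-mers from one parent only.
    """
    hap = kmer_set([haplotype], k)
    return len(hap & mat_only), len(hap & pat_only)


if __name__ == "__main__":
    mother, father = "ACCAGT", "AGGATC"
    mat_only, pat_only = parent_specific([mother], [father], [mother, father], k=4)
    print(read_support("ACGTAC", ["ACGTA", "GTAC"], k=3))
    print(haplotype_purity(mother, mat_only, pat_only, k=4))
```

In this toy setting, a haplotype that mixes maternal-only and paternal-only k-mers would point to a phasing switch, and a read support well below 1 would point to sequence errors in the assembly.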
Keywords
- Computational Biology, Bioinformatics and Medicine
Status: accepted