A complete telomere-to-telomere (T2T) finished genome has long been a goal of genomic research. By generating deep-coverage ultralong Oxford Nanopore Technology (ONT) and PacBio HiFi reads, we report here a complete genome assembly of maize with each chromosome entirely traversed in a single contig. The 2,178.6 Mb T2T Mo17 genome, with a base accuracy of over 99.99%, unveiled the structural features of all repetitive regions of the genome. There were several super-long simple-sequence-repeat arrays having consecutive thymine–adenine–guanine (TAG) tri-nucleotide repeats up to 235 kb. The assembly of the entire nucleolar organizer region, a 26.8 Mb array with 2,974 45S rDNA copies, revealed the enormously complex patterns of rDNA duplications and transposon insertions. Additionally, complete assemblies of all ten centromeres enabled us to precisely dissect the repeat compositions of both CentC-rich and CentC-poor centromeres. The complete Mo17 genome represents a major step forward in understanding the complexity of the highly recalcitrant repetitive regions of higher plant genomes.

Genome sequencing has been fundamental for the advancement of many aspects of basic biology as well as medical and agricultural applications. Following the decoding of the very first eukaryotic genome, the 12 Mb yeast nuclear genome in 1996 (ref. 1), numerous draft genomes with varied extents of completeness have been reported, including fruit fly 2, Arabidopsis 3, human 4, 5, mouse 6 and the model crop species rice 7 and maize 8. All these pioneering genome sequencing projects were based on classical Sanger sequencing technology. Due to cost and read-length limitations, these earlier reported draft genomes typically had tens of thousands of gaps in the pseudochromosomes. For instance, the first reported maize B73 inbred line genome had more than one hundred thousand gaps, with each tiling bacterial artificial chromosome sequence on average having more than ten gaps 8.

The development of long-read, single-molecule DNA sequencing technologies substantially improved genome assembly quality 9, 10, 11. Using PacBio sequencing, both the improved B73 genome 12 and the Mo17 genome 13 achieved a contig N50 of more than 1 Mb, leaving only a few thousand gaps for this highly complex crop genome. Benefitting from sequencing technology advances, the entire human X chromosome and chromosome 8 (chr8) have been completely assembled from telomere to telomere without gaps 14, 15, and a completely assembled human genome has recently been released 16. The assembly of the entire chr3 and chr9 of the maize line B73-Ab10 has also been reported 17. For the relatively compact Arabidopsis 18 and rice 19, 20, 21 genomes, high-quality gap-free reference genomes have also been achieved, with only several chromosomal ends, including nucleolus organizer regions (NORs), remaining incomplete. Similarly, an essentially completed banana genome of 485 Mb has been reported 22, while the watermelon genome, with a size of 369 Mb and only several dozen 45S rDNA copies, has been completely assembled 23.

Having a genome size very close to that of humans, while containing over 80% repetitive sequences, maize is also known as a model for complex genomes. The inbred lines B73 and Mo17 are the parental lines of one of the best-performing early commercial hybrids and of the most widely used bi-parental genetic mapping population 24, 25. Therefore, the generation of their genome sequences is of great significance 8, 12, 13. Due to the rich history of genetic studies in maize and its exceptional intraspecific genome diversity 26, 27, 28, 29, several additional inbred lines have also been sequenced, including the key Iodent line PH207 (ref. 30), the tropical lines SK 31 and K0326Y 32, European inbred lines F7, EP1, DK105 (ref. 34), the genetic transformation competent line A188 (ref. 35) and the inbred underlying classical genetic studies, W22. More recently, high-quality genome sequences of 25 nested association mapping (NAM) founder lines and a further improved B73 assembly (version 5) have also been reported 37. Yet, with the exception of B73-Ab10 with 53 gaps, all other assembled maize genomes reported still have hundreds or thousands of unfilled gaps. Here we report the complete telomere-to-telomere (T2T) Mo17 genome using a combination of ultralong Oxford Nanopore Technology (ONT) and PacBio HiFi reads, which marks a major step forward for genome assembly and uncovers the recalcitrant structural features of the highly complex maize genome.

T2T assembly of all ten chromosomes of the maize Mo17 genome

We generated a total of 237.7× sequence coverage of raw ultralong ONT data and 69.4× coverage of PacBio HiFi data for assembling the maize Mo17 genome (Fig.
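To give a feel for the scale behind the figures quoted in the text, the following Python sketch converts them into derived quantities: the approximate sequencing yield implied by 237.7× ONT and 69.4× HiFi coverage of a 2,178.6 Mb genome, the mean length per 45S rDNA unit in the 26.8 Mb array, and the number of TAG units in a 235 kb array. It is a back-of-the-envelope illustration, not the authors' method. The function names are hypothetical, 1 Mb is taken as 10^6 bases, and coverage is assumed to be relative to the final assembly size (the study may have used a different genome-size estimate). The mean rDNA unit length also absorbs any transposon insertions within the array, so it is only an average.

```python
# Back-of-the-envelope arithmetic on figures quoted in the text.
# Assumptions (not from the source): 1 Mb = 1e6 bases, and coverage is
# measured against the 2,178.6 Mb assembly size.

GENOME_SIZE_MB = 2178.6   # T2T Mo17 assembly size
ONT_COVERAGE = 237.7      # raw ultralong ONT coverage
HIFI_COVERAGE = 69.4      # PacBio HiFi coverage
NOR_ARRAY_MB = 26.8       # nucleolar organizer region array length
RDNA_COPIES = 2974        # 45S rDNA copies in the array
TAG_ARRAY_KB = 235        # longest TAG tri-nucleotide repeat array


def total_bases_gb(coverage: float, genome_size_mb: float) -> float:
    """Sequenced bases (Gb) implied by a coverage depth and genome size."""
    return coverage * genome_size_mb / 1000.0


def mean_unit_kb(array_mb: float, copies: int) -> float:
    """Mean length (kb) per repeat unit in an array, insertions included."""
    return array_mb * 1000.0 / copies


def repeat_units(array_kb: float, unit_bp: int) -> int:
    """Whole repeat units of length unit_bp that fit in an array."""
    return int(array_kb * 1000 // unit_bp)


if __name__ == "__main__":
    print(f"ONT yield:  {total_bases_gb(ONT_COVERAGE, GENOME_SIZE_MB):.1f} Gb")
    print(f"HiFi yield: {total_bases_gb(HIFI_COVERAGE, GENOME_SIZE_MB):.1f} Gb")
    print(f"Mean 45S rDNA unit: {mean_unit_kb(NOR_ARRAY_MB, RDNA_COPIES):.2f} kb")
    print(f"TAG units in {TAG_ARRAY_KB} kb: {repeat_units(TAG_ARRAY_KB, 3)}")
```

Under these assumptions the ONT and HiFi data sets come to roughly 518 Gb and 151 Gb respectively, the average rDNA unit is about 9 kb, and a 235 kb TAG array holds on the order of 78,000 consecutive tri-nucleotide units.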