Annotation:
Sequence assemblies of most plant genomes remain fragmented, leaving the aim of gapless, telomere-to-telomere (T2T) sequence assemblies unattained. Revealing the causes of sequence gaps in current assemblies helps determine the resources needed to close them. We analysed sequence gaps in the current long-read reference genome sequence of barley cv. Morex (MorexV3) using optical map and raw sequence data, complemented by ChIP-seq data for the centromeric histone variant CENH3. We compared our estimates of the abundance of centromeric, ribosomal DNA, and subtelomeric repeats with their copy numbers in the MorexV3 pseudomolecule sequence. We found that almost all centromeric sequences and 45S ribosomal DNA repeat arrays were absent from the MorexV3 pseudomolecules and that the majority of sequence gaps can be attributed to assembly breakdown in long stretches of satellite repeats. We discuss the prospects of gap closure with ultra-long sequence reads.
Picture description:
Optical maps of barley chromosomes are a vital tool for assembling the barley genome.