Summary
Resource Type
Genome Assembly
Organism
Name
Whole Genome Assembly (v4.0) and Annotation of Zea mays - cv. B73 (EnsemblPlant/Gramene)
Program, Pipeline, Workflow or Method Name
Gene annotation was performed in the laboratory of Doreen Ware (CSHL/USDA)
Program Version
v 4.0
Algorithm
Date Performed
Saturday, February 4, 2017
Data Source
Source Name
: Gramene
Source Version
: RefGen_v4
Source URI
: http://ensembl.gramene.org/Zea_mays/Info/Annotation/

The gene models of maize B73 RefGen_v4 were named following the Maize Genetics Nomenclature standard.

Genes were annotated with the MAKER pipeline (Campbell et al. 2014) using 111,000 transcripts obtained by single-molecule sequencing. These long-read Iso-Seq data (Wang et al. 2016) improved the annotation of alternative splicing, more than doubling the number of alternative transcripts per gene from 1.5 to 3.8 and thereby improving our knowledge of gene structure and transcript variation. This resulted in substantial improvements, including resolved gaps and misassemblies, corrections to strand, consolidation of gene models, and anchoring of unanchored genes.
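As a quick check of the "more than doubling" statement, the fold change implied by the two per-gene averages quoted above can be computed directly. This is only an illustrative calculation; the variable names are hypothetical and the two input values are the rounded figures from the text.

```python
# Average number of alternative transcripts per gene, as quoted in the text.
transcripts_per_gene_before = 1.5
transcripts_per_gene_after = 3.8

# Fold change between the two annotations (after / before).
fold_change = transcripts_per_gene_after / transcripts_per_gene_before

print(f"{fold_change:.2f}-fold increase")
```

The ratio is roughly 2.5, which is consistent with "more than doubling".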

Source : Gramene

 


 

ASSEMBLY AGPv3 5b+ (cultivar B73) :

The complete genome sequence of Zea mays cv. B73 (RefGen_v1) was published in 2009 by the NSF-funded Maize Genome Sequencing Project (Schnable et al., 2009). The high-quality assembly was accomplished by sequencing individual BAC clones along a minimum tiling path anchored to genetic and physical maps (Schnable et al., 2009; Wei et al., 2009).

This version of the assembly (RefGen_v3) incorporates additional contigs assembled from whole-genome shotgun sequencing reads. These contigs were selected because they include portions of full-length cDNAs that were not covered by the BAC-based assembly. The contigs were inserted into gaps based on a synteny-refined genetic map. This genetic map was also used to rearrange some clones.

Genes were annotated using both an evidence-based approach (e.g. using cDNA and EST data) and an ab initio approach (FGENESH), which were combined to give a unique non-overlapping gene set. New and updated gene models are limited to the regions where new contigs were inserted.
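The text does not say how the evidence-based and ab initio predictions were merged, so the following is only a minimal sketch of one way to build a non-overlapping gene set. It assumes that evidence-based models take precedence and that an ab initio model is kept only when it overlaps no model already kept. The `GeneModel` class, the `combine` function, and the example coordinates are all hypothetical, and strand is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneModel:
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    source: str  # "evidence" or "ab_initio" (hypothetical labels)


def overlaps(a: GeneModel, b: GeneModel) -> bool:
    """True if two models share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def sort_key(m: GeneModel):
    return (m.chrom, m.start, m.end)


def combine(evidence, ab_initio):
    """Assumed precedence rule: evidence-based models first, then ab initio.

    A model is kept only if it overlaps none of the models kept so far,
    so the returned set is non-overlapping.
    """
    kept = []
    for group in (evidence, ab_initio):
        for model in sorted(group, key=sort_key):
            if not any(overlaps(model, k) for k in kept):
                kept.append(model)
    return sorted(kept, key=sort_key)


# Made-up coordinates, for illustration only.
evidence = [
    GeneModel("1", 100, 500, "evidence"),
    GeneModel("1", 400, 900, "evidence"),      # overlaps the first model
]
ab_initio = [
    GeneModel("1", 450, 700, "ab_initio"),     # overlaps an evidence model
    GeneModel("1", 1000, 1500, "ab_initio"),   # falls in an uncovered region
    GeneModel("2", 100, 500, "ab_initio"),     # different chromosome
]

result = combine(evidence, ab_initio)
for model in result:
    print(model.chrom, model.start, model.end, model.source)
```

A real pipeline would also weigh strand, exon structure, and evidence quality; the pairwise overlap scan here is quadratic and is meant only to make the idea concrete.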

 

ASSEMBLY AGPv4 (cultivar B73) :

This entirely new assembly of the maize genome (B73 RefGen_v4) was constructed from PacBio Single Molecule Real-Time (SMRT) sequencing at approximately 60-fold coverage and scaffolded with the aid of a high-resolution whole-genome restriction (optical) map. The pseudomolecules of maize B73 RefGen_v4 are assembled nearly end-to-end, representing a 52-fold improvement in average contig size relative to the previous reference (B73 RefGen_v3).

Source : Gramene
 


 
B73 GENOME ASSEMBLY HISTORY

November 2010
B73 RefGen_v2 was released as the default view of the assembly at MaizeGDB. This version was calculated by the Maize Genome Sequencing Consortium and became available via GenBank on December 7th, 2012. The project record is 10769.

April 2013
The next version, B73 RefGen_v3, became the default assembly view of the MaizeGDB Genome Browser in April 2013. RefGen_v3 was not a global re-assembly. Instead, it used Roche/454 reads produced from a whole-genome shotgun (WGS) sequencing library to capture missing gene space within and between the original BACs. The 454 reads were assembled into contigs with ABySS and aligned to the B73 RefGen_v2 assembly to identify new contiguous pieces of DNA sequence that were not already represented in the v2 assembly. In addition, ~65,000 full-length cDNAs (FLcDNAs, from the Maize Full Length cDNA project) were aligned to both the B73 RefGen_v2 contigs and the new contigs. B73 RefGen_v3 was the final product of the Maize Genome Sequencing Consortium.

August 2016
An entirely new assembly of the maize genome (B73 RefGen_v4) was constructed from PacBio Single Molecule Real-Time (SMRT) sequencing at approximately 60-fold coverage and scaffolded with the aid of a high-resolution whole-genome restriction (optical) map. This new assembly was constructed without the assistance of the BAC physical map that had been used to guide the previous v1-v3 assemblies. The pseudomolecules of maize B73 RefGen_v4 were assembled nearly end-to-end, representing a 52-fold improvement in average contig size relative to the previous reference (B73 RefGen_v3). B73 RefGen_v4 was funded by the NSF IOS #1112127 award to Gramene.

 


 
ADDITIONAL GENOME ASSEMBLIES

Initially, the B73 genome was the only reference-quality genome assembly available for maize due to the high cost of sequencing and assembling a large (~2.1 Gb) genome. More recently, as the cost of sequencing and assembling a large genome has dropped, a number of maize research groups have constructed reference-quality genome assemblies for some of the more widely used maize inbred lines.

Source : MaizeGDB

Publication
Jiao Y, Peluso P, Shi J, Liang T, Stitzer MC, Wang B, Campbell MS, Stein JC, Wei X, Chin CS, Guill K, Regulski M, Kumari S, Olson A, Gent J, Schneider KL, Wolfgruber TK, May MR, Springer NM, Antoniou E, McCombie WR, Presting GG, McMullen M, Ross-Ibarra J, Dawe RK, Hastie A, Rank DR, Ware D. Improved maize reference genome with single-molecule technologies. Nature. 2017 Jun 22; 546(7659):524-527.