Summary
Name
Whole Genome Assembly and Annotation (v2.1) of Sorghum bicolor RTx430
Description

Assembly

The Sorghum bicolor RTx430 genome assembly was generated by combining Oxford Nanopore sequences produced on a MinION sequencer with Bionano Genomics Direct Label and Stain (DLS) optical maps, as described by Deschamps et al. (2018) (see also https://www.corteva.com). The final chromosome-scale de novo assembly consists of 29 scaffolds, most of which encompass entire chromosome arms. It has a scaffold N50 of 33.28 Mbp and covers 90% of the expected genome length.
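
For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the total assembly length. The function name scaffold_n50 and the example lengths are illustrative assumptions, not values or code from the RTx430 assembly.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Walk the scaffolds from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running total to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in bp; NOT the RTx430 scaffold lengths.
example_lengths = [50, 40, 30, 20, 10]
n50 = scaffold_n50(example_lengths)
print(n50)
```

In practice the lengths would be read from the assembly FASTA file rather than typed in by hand.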

NCBI accession: https://www.ncbi.nlm.nih.gov/assembly/GCA_003482435.1

Annotation

Genome annotation was carried out as described by Deschamps et al. (2018). First, genome repeats were masked using RepeatMasker and a curated sorghum-specific repeat file from Repbase. The repeat-masked genome was then used as input to two categories of gene predictors. The de novo gene prediction programs Fgenesh (Solovyev et al., 1994), Augustus (Stanke and Waack, 2003), and SNAP (Bromberg and Rost, 2007) were run with default parameters, using monocot, maize, and rice training sets, respectively. The EST, cDNA, and long-read evidence-based gene structure modelers GMAP and PASA, as well as the protein evidence-based gene structure modeler SPALN, were also run. Long-read sequences of the sorghum line BTx623 from NCBI, along with other sorghum ESTs and cDNAs, were used as the evidence set for PASA. EST and cDNA sequences from other, non-sorghum Poales species from NCBI, and monocot transcripts from Phytozome, were used as additional evidence from closely related species for gene prediction with GMAP. UniRef100 plant protein sequences were used as the evidence dataset for gene structure prediction with SPALN. All gene annotation files were run through EVidenceModeler, and the output was used to polish the gene boundaries in PASA. The final PASA annotation file was combined with the tRNA predictions file from tRNAscan-SE to obtain the final structural annotation file, along with FASTA sequences of proteins, CDS, cDNA, and genes. For additional details, see Deschamps et al. (2018).
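
The last step described above, combining the gene annotation file with the tRNA predictions, can be pictured with a small sketch. It assumes both inputs are GFF3 text with no embedded FASTA section and simply concatenates them under a single version pragma. The function name merge_gff3, the feature IDs, and the coordinates are hypothetical; this is not the authors' actual procedure, which is not detailed in this record.

```python
def merge_gff3(*gff3_texts):
    """Concatenate GFF3 documents, keeping a single version pragma."""
    merged = ["##gff-version 3"]
    for text in gff3_texts:
        for line in text.splitlines():
            line = line.rstrip()
            # Skip blank lines and the repeated version pragma.
            if not line or line.startswith("##gff-version"):
                continue
            merged.append(line)
    return "\n".join(merged) + "\n"


# Hypothetical one-feature inputs; IDs and coordinates are made up.
genes = "##gff-version 3\nChr01\tPASA\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
trnas = "##gff-version 3\nChr01\ttRNAscan-SE\ttRNA\t1500\t1572\t.\t-\t.\tID=trna1\n"
combined = merge_gff3(genes, trnas)
print(combined, end="")
```

Plain concatenation keeps each file's parent and child features in their original order. A production pipeline would typically also check for ID collisions and sort the result by sequence and position.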

Source: SorghumBase

Program, Pipeline, Workflow or Method Name
PASA
Program Version
n/a
Algorithm
Date Performed
Monday, February 13, 2023
Data Source
Source Name
: Phytozome
Source Version
: 2.1
Source URI
: https://phytozome-next.jgi.doe.gov/info/SbicolorRTx430_v2_1