Volume 29 (2012) Issue 2 Pages 123-130
To upgrade the genome sequence of J. curcas L., we integrated a de novo assembly of 537 million paired-end reads generated on the Illumina sequencing platform into the current genome assembly, which had been obtained by combining the conventional Sanger method with the Roche/454 sequencing platform. The upgraded genome sequence totals 297,661,187 bp in 39,277 contigs. The average and N50 contig lengths are 7,579 bp and 15,950 bp, respectively, each a fourfold increase over the previous genome assembly. Along with the genome sequence upgrade, the currently available transcriptome data were collected from public databases and assembled into 19,454 tentative consensus sequences. By comparing these tentative consensus sequences of transcripts with the predictions of computer programs, a total of 30,203 complete and partial structures of protein-encoding genes were deduced. The number of genes with complete structures increased about threefold over the previous genome annotation. Using the upgraded genome sequence and the predicted protein-coding genes, we investigated the number and features of tandemly arrayed genes, syntenic relations between Jatropha and other plant genomes, and structural features of transposable elements. Detailed information on the updated J. curcas genome is available at http://www.kazusa.or.jp/jatropha/.
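For readers unfamiliar with the contig statistics quoted above, the following is a minimal sketch of how the average and N50 contig lengths are conventionally computed from a list of contig lengths. It is not the pipeline used for the assembly; the function name `contig_stats` and the toy lengths are illustrative assumptions, and N50 is taken in its usual sense as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length.

```python
from typing import Dict, Iterable, Union


def contig_stats(lengths: Iterable[int]) -> Dict[str, Union[int, float]]:
    """Return count, total length, mean length, and N50 for contig lengths.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Sort from longest to shortest so the longest contigs are accumulated first.
    sorted_lengths = sorted(lengths, reverse=True)
    if not sorted_lengths:
        raise ValueError("at least one contig length is required")

    total = sum(sorted_lengths)
    running = 0
    n50 = 0
    for length in sorted_lengths:
        running += length
        # Stop at the first contig where the cumulative length reaches
        # half of the total; integer arithmetic avoids rounding issues.
        if running * 2 >= total:
            n50 = length
            break

    return {
        "count": len(sorted_lengths),
        "total_bp": total,
        "mean_bp": total / len(sorted_lengths),
        "n50_bp": n50,
    }


# Toy data (assumed values, not real contigs).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
stats = contig_stats(toy_lengths)
print(stats)

# Consistency check on the figures reported in the abstract:
# total length divided by the number of contigs.
reported_mean = 297_661_187 / 39_277
print(round(reported_mean))
```

The last two lines only check that the reported total length and contig count are consistent with the reported average contig length of 7,579 bp. The N50 of 15,950 bp cannot be recomputed from the abstract alone, since it depends on the full distribution of contig lengths.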