Abstract
The DNA Data Bank of Japan (DDBJ) has released the DDBJ Sequence Read Archive (DRA), an archive database of raw next-generation sequencing (NGS) reads.
Analyzing the huge volume of short reads generated by these sequencers requires biologists to have substantial computational hardware and bioinformatics skills. DDBJ has therefore started a cloud-computing-based analytical pipeline service for high-throughput annotation of NGS reads, the DDBJ Read Annotation Pipeline (http://p.ddbj.nig.ac.jp/).
The features of the pipeline are as follows.
1) Application tools for various NGS platforms (Illumina, Roche/454 and Life Technologies) are available.
2) Fundamental statistics such as quality scores and sequencing depth are calculated from output files in a uniform format.
3) The DDBJ supercomputer is operated remotely through the pipeline web server in a scalable cloud computing manner.
In the annotation stage of the pipeline, we introduce various functions for analyzing piled-up sequencing reads (detecting SNPs and indels, counting gene expression levels, and so on) and genome-wide visualization of alignment results.