Proceedings of the Japan Academy, Series B
Online ISSN : 1349-2896
Print ISSN : 0386-2208
ISSN-L : 0386-2208
An outline of an informatical method for identifying the complete set of genes using the DNA sequence of a whole genome
Masashi SUZUKI
JOURNAL FREE ACCESS

1999 Volume 75 Issue 4 Pages 81-86

Abstract

The identification of open reading frames (ORFs) from the DNA sequence of a whole genome involves a statistical process that separates candidates, i.e. sections that start with a formal start codon and end with a formal termination codon, into two groups: authentic ORFs and artifacts. A small number of genes known prior to the study can be used to analyze general informatical characteristics that are expected to be shared by all the ORFs present in the genome. The results can be summarized in the form of scoring systems that measure the relatedness of each candidate to the model ORF. In order to identify the complete set of ORFs, the rate of false negative identification needs to be minimized, so that no important ORF is missed. A number of non-ORF sections can be analyzed by the same systems in order to estimate the rate of false positive identification. This rate can be systematically reduced by combining multiple scoring systems that evaluate different ORF-specific characteristics.
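
The workflow outlined in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not specify the scoring systems, so the scores here are plain numbers supplied by the caller, and all function names are hypothetical. The sketch also assumes that ATG is the only start codon, that TAA, TAG and TGA are the termination codons, and that only the forward strand is scanned. Thresholds are set at the lowest score seen among the known genes, so that none of them is rejected, and the false positive rate is estimated as the fraction of non-ORF sections that pass every scoring system.

```python
START = "ATG"                    # assumption: single start codon
STOPS = {"TAA", "TAG", "TGA"}    # assumption: standard termination codons


def find_candidates(seq, min_codons=2):
    """Return (start, end) spans on the forward strand that begin with a
    start codon and end with the first in-frame termination codon."""
    seq = seq.upper()
    spans = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    spans.append((start, i + 3))
                start = None
    return sorted(spans)


def thresholds_from_known(known_scores):
    """known_scores: one tuple of scores per known gene, one entry per
    scoring system. Taking the minimum per system means every known gene
    passes, which keeps false negatives on the known set at zero."""
    return tuple(min(column) for column in zip(*known_scores))


def false_positive_rate(non_orf_scores, thresholds):
    """Fraction of non-ORF sections that pass every scoring system."""
    passed = sum(
        all(s >= t for s, t in zip(scores, thresholds))
        for scores in non_orf_scores
    )
    return passed / len(non_orf_scores)
```

In this sketch a section must pass all scoring systems to be accepted, so adding a system can only keep the estimated false positive rate the same or lower it. Other ways of combining scores (for example a weighted sum) would be equally compatible with the abstract.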

© The Japan Academy