Genome Informatics
Online ISSN : 2185-842X
Print ISSN : 0919-9454
ISSN-L : 0919-9454
On Selecting Features from Splice Junctions
An Analysis Using Information Theoretic and Machine Learning Approaches
Christina L. Zheng, Virginia R. De Sa, Michael Gribskov, T. Murlidharan Nair

2003 Volume 14 Pages 73-83

Abstract

The computational recognition of precise splice junctions is a challenge in the analysis of newly sequenced genomes, because the distribution of sequence patterns in these regions is not always distinct. Our objective is to understand the sequence signatures at the splice junctions, not simply to create an artificial recognition system. For this purpose we combine a neural network based calliper randomization approach with an information theoretic feature selection approach, in order to identify regions that harbor information content and to extract features relevant to the prediction of splice junctions. The calliper randomization analysis revealed regions important in the internal representation of the network model, and it captured correlated features as well as independently important ones. The feature selection approach, by contrast, captures features that are independently informative. Because the two methods capture features with different properties, a comparative analysis of their results helps to infer the kind of information present in a region.
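As an illustration of what "independently informative" can mean for a sequence position, the sketch below scores each position by the mutual information between the nucleotide at that position and the class label (junction or non-junction). This is an assumption made for illustration only: the abstract does not state which information theoretic measure the authors used, and the function name `positional_mutual_information` and the toy sequences are hypothetical, not data from the paper.

```python
import math
from collections import Counter


def positional_mutual_information(sequences, labels):
    """Return I(X_i; Y) in bits for every position i.

    X_i is the nucleotide at position i, Y is the class label.
    Each position is scored on its own, so correlations between
    positions are deliberately ignored.
    """
    n = len(sequences)
    length = len(sequences[0])
    label_counts = Counter(labels)
    scores = []
    for i in range(length):
        # Joint counts of (nucleotide, label) and marginal nucleotide counts.
        joint = Counter((seq[i], y) for seq, y in zip(sequences, labels))
        base_counts = Counter(seq[i] for seq in sequences)
        mi = 0.0
        for (base, y), count in joint.items():
            p_xy = count / n
            p_x = base_counts[base] / n
            p_y = label_counts[y] / n
            mi += p_xy * math.log2(p_xy / (p_x * p_y))
        scores.append(mi)
    return scores


# Hypothetical toy windows (not real splice-site data): label 1 marks a
# junction, label 0 a non-junction. Only position index 2 differs by class.
sequences = ["AGGTAA", "CGGTGA", "TGGTAC", "AGCTAA", "CGATGA", "TGTTAC"]
labels = [1, 1, 1, 0, 0, 0]

scores = positional_mutual_information(sequences, labels)
for position, score in enumerate(scores):
    print(f"position {position}: {score:.3f} bits")
```

In this toy set, position index 2 is the only one whose nucleotide distribution differs between the two classes, so it receives the highest score. A per-position score of this kind cannot see features that matter only in combination with other positions, which is the kind of information the abstract attributes to the calliper randomization approach.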

© Japanese Society for Bioinformatics