IPSJ Transactions on Bioinformatics
Online ISSN : 1882-6679
ISSN-L : 1882-6679
 
Detecting Fusion Genes in Long-Read Transcriptome Sequencing Data with FUGAREC
Keigo Masuda, Yoshiaki Sota, Hideo Matsuda
JOURNAL FREE ACCESS

2024 Volume 17 Pages 1-9

Abstract

Fusion genes are important targets and biomarkers for cancer therapy, so clinical practice needs methods that detect them accurately. RNA-seq is widely used to detect active fusion genes. Because long-read RNA-seq can sequence mRNA at full length, it is expected to detect fusion genes that short-read RNA-seq cannot. However, long-read RNA-seq has a high basecalling error rate, and gap sequences, that is, stretches of a read that are not aligned to the genome, may occur near the breakpoints of long reads. When gap sequences occur, existing methods cannot identify the correct fusion gene or breakpoint. To address these challenges, we introduce FUGAREC (fusion detection with gap re-alignment and breakpoint clustering), a novel algorithm that combines gap sequence re-alignment with breakpoint clustering. This approach detects fusion genes that were previously undetectable and also significantly reduces false positives. We demonstrate that FUGAREC achieves high fusion gene detection performance on both simulated data and sequencing data from a breast cancer cell line.
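The abstract does not describe how breakpoint clustering is carried out, so the following is only a minimal sketch of the general idea, not FUGAREC's actual procedure. It assumes each read yields one candidate call as a `(gene_a, gene_b, pos_a, pos_b)` tuple, and that calls for the same gene pair whose positions lie close together support the same fusion. The function name `cluster_breakpoints`, the parameters `window` and `min_reads`, and their default values are hypothetical.

```python
from collections import defaultdict
from statistics import median_low


def cluster_breakpoints(calls, window=20, min_reads=2):
    """Group per-read breakpoint calls that likely describe one fusion.

    calls:     iterable of (gene_a, gene_b, pos_a, pos_b), one per read.
    window:    hypothetical tolerance in bases (an assumption, not a
               value from the paper).
    min_reads: clusters supported by fewer reads are dropped.
    Returns a list of (gene_a, gene_b, pos_a, pos_b, n_reads), where the
    positions are the low medians of the cluster members.
    """
    # Collect breakpoint positions separately for each gene pair.
    by_pair = defaultdict(list)
    for gene_a, gene_b, pos_a, pos_b in calls:
        by_pair[(gene_a, gene_b)].append((pos_a, pos_b))

    results = []
    for (gene_a, gene_b), points in sorted(by_pair.items()):
        points.sort()
        clusters = [[points[0]]]
        # Walk the sorted calls; a call joins the current cluster only if
        # both of its positions are within `window` of the previous call.
        for prev, cur in zip(points, points[1:]):
            close_a = abs(cur[0] - prev[0]) <= window
            close_b = abs(cur[1] - prev[1]) <= window
            if close_a and close_b:
                clusters[-1].append(cur)
            else:
                clusters.append([cur])
        # Report one representative breakpoint per well-supported cluster.
        for cluster in clusters:
            if len(cluster) >= min_reads:
                results.append((
                    gene_a,
                    gene_b,
                    median_low(p[0] for p in cluster),
                    median_low(p[1] for p in cluster),
                    len(cluster),
                ))
    return results
```

In this sketch, a call supported by a single read is discarded, which illustrates one way that clustering can suppress false positives caused by read errors near a breakpoint.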

© 2024 by the Information Processing Society of Japan