2025 Volume 18 Pages 39-50
Small open reading frames (smORFs), which encode peptides shorter than 100 amino acids, are increasingly recognized as important elements in bacterial genomes. However, their annotation remains limited owing to challenges such as short sequence length, low conservation, and frequent overlap with known coding regions. This narrative review outlines current knowledge on smORF classification, genomic context, and proposed biological functions. We discuss the main challenges in smORF identification and survey recent experimental and computational strategies, including ribosome profiling, transcriptomics, proteomics, and machine learning (ML). Explainable artificial intelligence (XAI) methods are introduced for their potential to enhance model interpretability and support biological validation. A comparative overview of recent tools is provided, summarizing input data types, algorithmic strategies, and validation approaches. Finally, we identify limitations of current methodologies and propose integrative, interpretable AI-driven frameworks and standardized benchmarking approaches to advance bacterial smORF research.