Abstract
The (l, d)-motif challenge problem, as introduced by Pevzner and Sze [12], is a mathematical abstraction of the DNA functional site discovery task. Here we expand the (l, d)-motif problem to more accurately model this task and present a novel algorithm to solve this extended problem. This algorithm is guaranteed to find all (l, d)-motifs in a set of input sequences with unbounded support and length. We demonstrate the performance of the algorithm on publicly available datasets and show that the algorithm deterministically enumerates the optimal motifs.