## Alban MANCHERON and Irena RUSU:
Pattern discovery allowing gaps, substitution matrices and multiple
score functions.
*In* Gary BENSON and Roderic
PAGE, editors: *Algorithms in Bioinformatics.
Proceedings of the 3rd International Workshop on Algorithms in
BioInformatics (WABI)*, volume 2812 of *Lecture Notes in
Bioinformatics (LNBI)*, pages 129-145. Springer-Verlag, 2003.

Pattern discovery has many applications in finding
functionally or structurally important regions in
biological sequences (binding sites, regulatory
sites, protein signatures, etc.). In this paper we
present a new pattern discovery algorithm with
the following features:

- it finds, in exactly the same manner and
without any prior specification, both patterns with
fixed-length gaps (i.e. runs of one or more
consecutive wild-cards) and contiguous patterns;

- it allows the use of any pairwise
score function, thus offering multiple ways to
define or constrain the type of patterns
sought; in particular, one can use substitution
matrices (PAM, BLOSUM) to compare amino acids,
exact matching to compare nucleotides, or
equivalency sets in both cases.
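
To make the two features above concrete, here is a minimal Python sketch of how a gapped pattern can be scored against a sequence with an interchangeable pairwise score function. It is not the algorithm of the paper: it only checks a pattern that is already known, by brute force, and every name in it (`WILDCARD`, `exact_score`, `equivalency_score`, `matrix_score`, `occurrences`) is hypothetical. The matrix values are toy numbers chosen for illustration, not PAM or BLOSUM entries.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# A pairwise score function maps two symbols to a number.
Score = Callable[[str, str], float]

WILDCARD = "."  # assumed notation for a gap position; matches any symbol


def exact_score(a: str, b: str) -> float:
    """Exact matching, e.g. for nucleotides."""
    return 1.0 if a == b else 0.0


def equivalency_score(classes: List[FrozenSet[str]]) -> Score:
    """Symbols score 1 when identical or members of a common class."""
    def score(a: str, b: str) -> float:
        if a == b:
            return 1.0
        return 1.0 if any(a in c and b in c for c in classes) else 0.0
    return score


def matrix_score(matrix: Dict[Tuple[str, str], float]) -> Score:
    """Look up a symmetric substitution matrix; unknown pairs score 0."""
    def score(a: str, b: str) -> float:
        if (a, b) in matrix:
            return matrix[(a, b)]
        return matrix.get((b, a), 0.0)
    return score


def occurrences(pattern: str, sequence: str,
                score: Score, threshold: float) -> List[int]:
    """Return start positions where the summed score reaches the threshold.

    Wildcard positions are skipped, so contiguous patterns and patterns
    with fixed-length gaps go through the same code path.
    """
    hits = []
    for start in range(len(sequence) - len(pattern) + 1):
        window = sequence[start:start + len(pattern)]
        total = sum(score(p, s) for p, s in zip(pattern, window)
                    if p != WILDCARD)
        if total >= threshold:
            hits.append(start)
    return hits


# Purines and pyrimidines as an example of equivalency sets.
purine_pyrimidine = equivalency_score(
    [frozenset("AG"), frozenset("CT")])

# Toy amino-acid matrix (illustrative values only).
toy_matrix = matrix_score(
    {("L", "L"): 1.0, ("I", "I"): 1.0, ("L", "I"): 0.5})
```

For instance, `occurrences("A..T", "ACGTAGGTAAAA", exact_score, 2.0)` scans for an `A` followed three positions later by a `T`, while passing `purine_pyrimidine` instead relaxes both positions to their equivalence classes. Only the `score` argument changes between the two calls.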

We describe the algorithm, compare it to other
algorithms, and report test results on
discovering binding sites for the DNA-binding proteins
ArgR, LexA, PurR and TyrR in
*E. coli*, and promoter sites in a set of
dicot plants.
