Detection of generic spaced motifs using submotif pattern mining.
MedLine Citation:
PMID:  17483509     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
MOTIVATION: Identifying motifs is a critical stage in studying the regulatory interactions of genes. Motifs can have complicated patterns. In particular, spaced motifs, an important class of motifs, consist of several short segments separated by spacers of different lengths. Locating spaced motifs is not trivial. Existing motif-finding algorithms are designed for monad motifs (short contiguous patterns with some mismatches), make assumptions about the spacer lengths, or handle at most two segments. An effective motif finder for generic spaced motifs is highly desirable.
RESULTS: This article proposes a novel approach for identifying spaced motifs with any number of spacers of different lengths. We introduce the notion of submotifs to capture the segments in a spaced motif and formulate the motif-finding problem as a frequent submotif mining problem. We provide an algorithm called SPACE to solve the problem. Based on experiments on real biological datasets, synthetic datasets and the motif assessment benchmarks of Tompa et al., we show that our algorithm outperforms existing tools on spaced motifs, improving both sensitivity and specificity; on monads, SPACE performs as well as other tools.
AVAILABILITY: The source code is available upon request from the authors.
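To make the definition of a spaced motif concrete, here is a minimal Python sketch that locates occurrences of a known spaced motif, i.e. several short segments separated by spacers whose lengths may vary within a range. It is not the SPACE algorithm and does no submotif mining; it assumes the segments are already known, matches them exactly (no mismatches), and uses a DNA alphabet. The function names `spaced_motif_regex` and `find_spaced_motif` and the example sequence are hypothetical, chosen only for illustration.

```python
import re


def spaced_motif_regex(segments, spacer_ranges):
    """Compile a regex for segments separated by variable-length spacers.

    segments      -- list of exact DNA segments, e.g. ["ACG", "GTC"]
    spacer_ranges -- list of (min_len, max_len) pairs, one per gap
    Assumption: exact segment matches only; no mismatches are allowed.
    """
    if len(spacer_ranges) != len(segments) - 1:
        raise ValueError("need exactly one spacer range between adjacent segments")
    parts = [re.escape(segments[0])]
    for (lo, hi), seg in zip(spacer_ranges, segments[1:]):
        # A spacer is any run of nucleotides with length in [lo, hi].
        parts.append("[ACGT]{%d,%d}" % (lo, hi))
        parts.append(re.escape(seg))
    return re.compile("".join(parts))


def find_spaced_motif(sequence, segments, spacer_ranges):
    """Return (start, matched_text) for each non-overlapping occurrence."""
    pattern = spaced_motif_regex(segments, spacer_ranges)
    return [(m.start(), m.group(0)) for m in pattern.finditer(sequence)]


# Hypothetical example: three segments, two spacers of different lengths.
hits = find_spaced_motif(
    "TTACGAAGTCCTTGA",
    ["ACG", "GTC", "TGA"],
    [(1, 3), (0, 4)],
)
print(hits)
```

A monad motif is the special case with a single segment and no spacers. Discovering unknown segments and spacer lengths across a set of sequences, which is the problem the abstract addresses, is considerably harder than this lookup.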
Authors:
Edward Wijaya; Kanagasabai Rajaraman; Siu-Ming Yiu; Wing-Kin Sung
Publication Detail:
Type:  Comparative Study; Journal Article; Research Support, Non-U.S. Gov't     Date:  2007-05-05
Journal Detail:
Title:  Bioinformatics (Oxford, England)     Volume:  23     ISSN:  1367-4811     ISO Abbreviation:  Bioinformatics     Publication Date:  2007 Jun 
Date Detail:
Created Date:  2007-07-09     Completed Date:  2007-08-23     Revised Date:  2009-11-04    
Medline Journal Info:
Nlm Unique ID:  9808944     Medline TA:  Bioinformatics     Country:  England    
Other Details:
Languages:  eng     Pagination:  1476-85     Citation Subset:  IM    
Affiliation:
Institute for Infocomm Research, Singapore.
MeSH Terms
Descriptor/Qualifier:
Algorithms
Amino Acid Motifs*
Base Sequence
Binding Sites
Computational Biology / methods*
Models, Statistical
Pattern Recognition, Automated*
Protein Binding
Protein Structure, Tertiary
Sequence Alignment
Sequence Analysis, DNA
Transcription Factors / genetics
Chemical
Reg. No./Substance:
0/Transcription Factors

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine