Document Detail


On the tractability of maximal strip recovery.
MedLine Citation:
PMID:  20632870     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
Given two genomic maps G and H represented by a sequence of n gene markers, a strip (syntenic block) is a sequence of distinct markers of length at least two which appear as subsequences in the input maps, either directly or in reversed and negated form. The problem Maximal Strip Recovery (MSR) is to find two subsequences G' and H' of G and H, respectively, such that the total length of disjoint strips in G' and H' is maximized (or, conversely, the number of markers hence deleted, is minimized). Previously, several heuristic algorithms which work well in practice, have been proposed. Theoretically, a factor-4 polynomial-time approximation is known for the MSR problem. Moreover, several close variants of MSR, MSR-d (with d > 2 input maps), MSR-DU (with marker duplications) and MSR-WT (with markers weighted) have been proved to be NP-complete. Before this work, the complexity of the original MSR problem was left open. In this article, we solve the open problem by showing that MSR is NP-complete, using a polynomial time reduction from One-in-Three 3SAT. We also present some fixed-parameter tractable algorithms for the (complement of) MSR problem and its variants. Let k be the minimum number of markers deleted in an optimal solution. The running time of our algorithms are O(2(3.61k)n + n(2)) for MSR, [Formula: see text] for MSR-d, and O(2(7.22k)n + n(2)) for MSR-DU, respectively.
Authors:
Lusheng Wang; Binhai Zhu
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.    
Journal Detail:
Title:  Journal of computational biology : a journal of computational molecular cell biology     Volume:  17     ISSN:  1557-8666     ISO Abbreviation:  J. Comput. Biol.     Publication Date:  2010 Jul 
Date Detail:
Created Date:  2010-07-16     Completed Date:  2010-10-18     Revised Date:  2011-01-24    
Medline Journal Info:
Nlm Unique ID:  9433358     Medline TA:  J Comput Biol     Country:  United States    
Other Details:
Languages:  eng     Pagination:  907-14     Citation Subset:  IM    
Affiliation:
Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong.
Export Citation:
APA/MLA Format     Download EndNote     Download BibTex
MeSH Terms
Descriptor/Qualifier:
Gene Rearrangement*
Genome
Genomics*
Haplotypes
Models, Genetic*
Comments/Corrections
Erratum In:
J Comput Biol. 2011 Jan;18(1):129

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine


Previous Document:  Statistical issues in subpopulation analysis of high content imaging data.
Next Document:  K-partite RNA secondary structures.