| Searching protein three-dimensional structures in faster than linear time. | |
| | |
MedLine Citation:
|
PMID: 20426692 Owner: NLM Status: MEDLINE |
Abstract/OtherAbstract:
|
Searching for similar structures from a three-dimensional (3-D) structure database of proteins is one of the most important problems in post-genomic computational biology. To compare two structures, we ordinarily use a measure called the root mean square deviation (RMSD) as the similarity measure. We consider a very fundamental problem of finding all the substructures whose RMSDs to the query are within some given threshold, from a 3-D structure database. The problem also appears in many other fields, such as computer vision and robotics. In this article, we propose the first algorithm that runs in faster than linear time on average. Our new algorithm runs in average-case O(m + N/m(1-epsilon)), where N is the database size, m is the query length, and epsilon is an arbitrary small constant such that 0 < epsilon < 1. It is a significant improvement over previous algorithms on the problem, considering that the best known worst-case time complexity of the problem is O(N log m), and the best known average-case (expected) time complexity of the problem was O(N). |
| | |
Authors:
|
Tetsuo Shibuya |
Publication Detail:
|
Type: Journal Article; Research Support, Non-U.S. Gov't |
Journal Detail:
|
Title: Journal of computational biology : a journal of computational molecular cell biology Volume: 17 ISSN: 1557-8666 ISO Abbreviation: J. Comput. Biol. Publication Date: 2010 Apr |
Date Detail:
|
Created Date: 2010-04-29 Completed Date: 2010-09-22 Revised Date: - |
Medline Journal Info:
|
Nlm Unique ID: 9433358 Medline TA: J Comput Biol Country: United States |
Other Details:
|
Languages: eng Pagination: 593-602 Citation Subset: IM |
Affiliation:
|
Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan. tshibuya@hgc.jp |
Export Citation:
|
APA/MLA Format Download EndNote Download BibTex |
| MeSH Terms | |
Descriptor/Qualifier:
|
Algorithms Amino Acid Sequence Computational Biology / methods* Models, Molecular* Proteins / chemistry* Time Factors |
| Chemical | |
Reg. No./Substance:
|
0/Proteins |
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
Previous Document: The power of detecting enriched patterns: an HMM approach.
Next Document: A parallel algorithm for error correction in high-throughput short-read data on CUDA-enabled graphic...