Document Detail

Heuristic Bayesian segmentation for discovery of coexpressed genes within genomic regions.
MedLine Citation:
PMID:  20150667     Owner:  NLM     Status:  MEDLINE    
Segmentation aims to separate homogeneous areas from sequential data and plays a central role in data mining. It has applications ranging from finance to molecular biology, where bioinformatics tasks such as genome data analysis are active application fields. In this paper, we present a novel application of segmentation: locating genomic regions with coexpressed genes. We aim at automated discovery of such regions without requiring user-given parameters. To perform the segmentation within a reasonable time, we use heuristics. Most heuristic segmentation algorithms require some decision on the number of segments. This is usually accomplished with asymptotic model selection methods such as the Bayesian information criterion. Such methods are based on simplifying assumptions, which can limit their usage. In this paper, we propose a Bayesian model selection criterion to choose the most appropriate result from heuristic segmentation. Our Bayesian model uses a simple prior for segmentation solutions with various segment numbers and a modified Dirichlet prior for modeling multinomial data. We show with various artificial data sets in our benchmark system that our model selection criterion has the best overall performance. Applying our method to yeast cell-cycle gene expression data reveals potential active and passive regions of the genome.
Petri Pehkonen; Garry Wong; Petri Törönen
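The abstract notes that heuristic segmentation usually settles the number of segments with an asymptotic criterion such as the Bayesian information criterion (BIC). The sketch below illustrates only that baseline, not the Bayesian criterion the authors propose. It assumes a greedy top-down splitting heuristic, a multinomial model per segment, and a parameter count of (symbols - 1) per segment plus one per internal boundary. All function names are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(segment):
    """Maximised multinomial log-likelihood of one non-empty segment."""
    n = len(segment)
    return sum(c * math.log(c / n) for c in Counter(segment).values())


def best_split(seq, start, end):
    """Best single cut inside seq[start:end] as (gain, cut), or None."""
    base = log_likelihood(seq[start:end])
    best = None
    for cut in range(start + 1, end):
        gain = (log_likelihood(seq[start:cut])
                + log_likelihood(seq[cut:end]) - base)
        if best is None or gain > best[0]:
            best = (gain, cut)
    return best


def bic(seq, bounds, n_symbols):
    """BIC of a segmentation; the parameter count is an assumed convention."""
    k = len(bounds) - 1
    ll = sum(log_likelihood(seq[a:b]) for a, b in zip(bounds, bounds[1:]))
    n_params = k * (n_symbols - 1) + (k - 1)
    return -2.0 * ll + n_params * math.log(len(seq))


def segment_with_bic(seq, n_symbols, max_segments=10):
    """Split greedily, then keep the segmentation with the lowest BIC."""
    bounds = [0, len(seq)]
    best_bounds, best_score = list(bounds), bic(seq, bounds, n_symbols)
    while len(bounds) - 1 < max_segments:
        candidates = [best_split(seq, a, b)
                      for a, b in zip(bounds, bounds[1:]) if b - a > 1]
        candidates = [c for c in candidates if c is not None]
        if not candidates:
            break
        _, cut = max(candidates)  # cut with the largest likelihood gain
        bounds = sorted(bounds + [cut])
        score = bic(seq, bounds, n_symbols)
        if score < best_score:
            best_bounds, best_score = list(bounds), score
    return best_bounds


# Toy sequence of discretised expression states along a chromosome
# (made-up data, for illustration only).
toy = "a" * 30 + "b" * 20 + "a" * 40
boundaries = segment_with_bic(toy, n_symbols=2)
```

Because the heuristic is greedy, the segmentation it returns is not guaranteed to be globally optimal. The BIC penalty is also an asymptotic approximation, which is the kind of simplification the abstract points to as a limitation.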
Related Documents :
20224127 - Texture synthesis with grouplets.
20426017 - Task versus subtask surgical skill evaluation of robotic minimally invasive surgery.
16685827 - A segmentation and reconstruction technique for 3d vascular structures.
19447707 - Hierarchical multiple markov chain model for unsupervised texture segmentation.
24265757 - Understanding long-term changes in species abundance using a niche-based approach.
10944357 - On the distribution of the bulk-solvent correction parameters.
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't    
Journal Detail:
Title:  IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM     Volume:  7     ISSN:  1557-9964     ISO Abbreviation:  IEEE/ACM Trans Comput Biol Bioinform     Publication Date:    2010 Jan-Mar
Date Detail:
Created Date:  2010-02-12     Completed Date:  2010-05-06     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  101196755     Medline TA:  IEEE/ACM Trans Comput Biol Bioinform     Country:  United States    
Other Details:
Languages:  eng     Pagination:  37-49     Citation Subset:  IM    
Department of Neurobiology, A.I. Virtanen Institute, University of Kuopio, Kuopio, Finland.
MeSH Terms
Base Sequence
Bayes Theorem
Chromosome Mapping / methods*
Genome / genetics*
Molecular Sequence Data
Multigene Family / genetics*
Pattern Recognition, Automated / methods*
Sequence Analysis, DNA / methods*

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
