| Comparison of protein coding gene content of yeast and other fungal genomes | |
Abstract/OtherAbstract
:
|
VTT Symposium 242. International Specialised Symposium on Yeasts ISSY25. Systems Biology of Yeasts - from Models to Applications, 57. Despite extensive research, the exact function of many yeast genes remains unknown. Comparisons to other fungal genomes can add power to genomic analysis by providing the evolutionary context of genes. Our goal is to compare the protein coding gene contents of fungal genomes and relate the differences to the physiological differences between species and taxonomic groups. We have produced consistent InterPro annotations and clustering of the protein coding sequences of 16 sequenced fungal genomes, of which 8 are yeasts. Our computational system is based on BioPerl scripts and the BioSQL schema for storing the sequences and annotations in a relational database. The protein sequences are clustered with the Tribe-MCL graph clustering software using distances based on BLAST E-values. We have discovered that the number of genes belonging to protein clusters having members from all fungal species studied is negatively correlated with genome size, i.e. larger fungal genomes are likely to have more specialized functions not present in species with smaller genomes. In contrast, based on protein clustering, Pezizomycotina and Saccharomycotina seem to differ in their level of paralogy, i.e. in the number of duplicated genes. In Saccharomycotina the average level of paralogy is positively correlated with the size of the genome. In Pezizomycotina, possibly due to Repeat Induced Point mutations (RIP), no clear correlation exists. Using Generic Genome Browser we have created a web-based system that allows the scientists at VTT to easily utilize comparative information in their work. In particular, we link the InterPro entries and clusters so that, for a particular protein family, a user can browse the neighborhood of the family members detected by InterPro to assess the true extent of the family.
In addition, a user can easily find the clusters and InterPro entries that have interesting species distributions. We also link the S. cerevisiae metabolic model iND750 to the comparative data. |
Authors
:
|
Arvas, Mikko, Kivioja, Teemu, Mitchell, A., Saloheimo, Markku, Oliver, S., Penttilä, Merja |
Related Documents
:
|
0034749453 - Monitoring of transcript regulation and protein production related stress responses in ...
0001190463 - Relationships between c-, l- and p-band sar data and forest damages |
Contributors
:
|
- |
Publication Detail
:
|
Publisher : VTT Type : text Format : application/pdf |
Date Detail
:
|
2006 |
Subject
:
|
- |
Coverage
:
|
- |
Relation
:
|
- |
Source
:
|
- |
Copyright Information
:
|
Copyright VTT Technical Research Centre of Finland. Full text may not be reproduced, republished, stored, distributed, transmitted, altered or resold except as follows: Full text may be downloaded, held and displayed for private study or research and single copies may be printed out for private study or research. In all citations the source and the copyright holder must be acknowledged. The documents are provided on an "as is" and "as available" basis. No warranty of any kind, either express or implied, including but not limited to warranties of title or non-infringement or implied warranties of merchantability or fitness for a particular purpose, is made in relation to the availability, accuracy, reliability or content of these pages. |
Other Details
:
|
Languages : eng |