Document Detail

Update of KEYnet: a gene and protein names database for biosequences functional organisation.
MedLine Citation:
PMID:  10592277     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
KEYnet is a database where gene and protein names are hierarchically structured. Particular care has been devoted to the search and organisation of synonyms. The structuring is based on biological criteria in order to assist the user in data search and to minimise the risk of information loss. Links to the EMBL data library by the entry name and the accession number are implemented. KEYnet is available through the WWW at the following site: http://www.ba.cnr.it/keynet.html
Authors:
D Catalano; F Licciulli; D D'Elia; M Attimonelli
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't    
Journal Detail:
Title:  Nucleic acids research     Volume:  28     ISSN:  0305-1048     ISO Abbreviation:  Nucleic Acids Res.     Publication Date:  2000 Jan 
Date Detail:
Created Date:  2000-02-25     Completed Date:  2000-02-25     Revised Date:  2009-11-18    
Medline Journal Info:
Nlm Unique ID:  0411011     Medline TA:  Nucleic Acids Res     Country:  ENGLAND    
Other Details:
Languages:  eng     Pagination:  372-3     Citation Subset:  IM    
Affiliation:
Area di Ricerca, CNR, 70126 Bari, Italy.
MeSH Terms
Descriptor/Qualifier:
Databases, Factual*
Information Storage and Retrieval
Internet
Proteins / chemistry,  genetics*
Chemical
Reg. No./Substance:
0/Proteins

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

Full Text
Journal Information
Journal ID (nlm-ta): Nucleic Acids Res
ISSN: 0305-1048
ISSN: 1362-4962
Publisher: Oxford University Press, Oxford, UK
Article Information
Copyright © 2000 Oxford University Press
Received: 4 October 1999
Revision received: 13 October 1999
Accepted: 13 October 1999
Print publication date: 1 January 2000
Volume: 28 Issue: 1
First Page: 372 Last Page: 373
ID: 102452
Publisher Id: gkd070
PubMed Id: 10592277

Update of KEYnet: a gene and protein names database for biosequences functional organisation
D. Catalano
F. Licciulli
D. D’Elia
M. Attimonelli (1,a)
Area di Ricerca, CNR, 70126 Bari, Italy and (1) Department of Biochemistry and Molecular Biology, Faculty of Sciences, University of Bari, 70126 Bari, Italy
(a) To whom correspondence should be addressed. Tel: +39 080 548 2130; Fax: +39 080 548 4467; Email: marcella@area.ba.cnr.it

INTRODUCTION

The most common query criteria for biological databases are gene and protein names but, so far, most of these names have been incorrectly annotated in the nucleic acid sequence databases, which causes inconsistencies in data retrieval. For retrieval by such criteria to be properly targeted, gene and protein names need to be correctly coded. Here we present the KEYnet database (1,2), in which gene and protein names are organised in a hierarchical structure according to the biological function of the associated sequence. Links among lexical or biological synonyms are implemented.


DATABASE DESCRIPTION

Each entry in the KEYnet database is related to a gene or protein name. The whole database is hierarchically structured according to the scheme previously reported (1,2) and visible at http://bio-www.ba.cnr.it:8000/Tutorials/KEYnet/network.html . In particular, the KEYnet structure is made up of a set of elements (nodes) linked in father–son relationships. At the highest level is the root, which links all the branches of the tree. The most important branches are the nodes Protein, DNA and RNA. Each leaf in the tree is composed of several elements linked by synonymy. Two side branches are implemented: the RAT Gene Names Tree and the Mitochondrial Genome Tree [the mitochondrial gene names classification has been structured as a contribution to the MitBASE project (3)]. Gene and protein names are extracted from the EMBL data library (4).
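The tree described above can be pictured with a small sketch. This is only an illustration of the idea (a root, father–son links, and leaves that group synonymous names), not the actual KEYnet implementation; the class `Node`, the intermediate node "Enzyme" and the synonym list are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the hierarchy; a leaf carries a group of synonyms."""
    name: str
    synonyms: set = field(default_factory=set)    # names linked by synonymy
    children: list = field(default_factory=list)  # "son" nodes

    def add(self, child):
        """Attach a son node and return it, so branches can be built step by step."""
        self.children.append(child)
        return child

    def find_path(self, query, trail=()):
        """Return the father-to-son path to a name or one of its synonyms, else None."""
        trail = trail + (self.name,)
        q = query.lower()
        if q == self.name.lower() or q in {s.lower() for s in self.synonyms}:
            return trail
        for child in self.children:
            found = child.find_path(query, trail)
            if found is not None:
                return found
        return None


# The three main branches named in the text hang from the root.
root = Node("root")
protein = root.add(Node("Protein"))
root.add(Node("DNA"))
root.add(Node("RNA"))

# Hypothetical intermediate node and leaf, for illustration only.
enzyme = protein.add(Node("Enzyme"))
enzyme.add(Node("glutamine synthetase",
                synonyms={"glutamate-ammonia ligase", "glnA"}))

print(root.find_path("glnA"))
```

Looking up any synonym returns the same position in the tree, which is the property that lets a hierarchical organisation group entries annotated under different names.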

Biological information about the associated sequences is extracted from the same primary databases [EMBL data library (4) and GenBank (5)] and from specialised databases such as SWISS-PROT (6) and ENZYME (7), or any other suitable database. MEDLINE is also consulted whenever the above-mentioned databases do not contain the information needed to classify a gene or protein name. The KEYnet database is updated at each EMBL data library release, at which time the link between KEYnet and the EMBL data library is established.

One of the major problems encountered during data classification concerns the gene names branch. Gene naming is recognised worldwide as a difficult problem, because of the freedom with which researchers assign a name to a newly discovered gene. Several attempts to address this problem are in progress (8,9; see http://www.ebi.ac.uk:7081/docs/nomenclature and http://www.gene.ucl.ac.uk/nomenclature ).

We have organised gene names by establishing a starting set of main ancestor keywords relevant to their primary biological functions. At present KEYnet contains 66 219 gene and protein names, as reported in detail in the table at http://bio-www.ba.cnr.it:8000/Tutorials/KEYnet/Table1.html


KEYnet QUERY SYSTEMS

The KEYnet database can be queried through the RETKEY program, written in FORTRAN and C, available on the server of the CNR Research Area of Bari. A slightly different version is KEYnetWWW (http://www.ba.cnr.it/keynet.html ), which is more powerful because it can be accessed worldwide and the retrievable information is more complete.

The usage of KEYnetWWW is described in the following examples. Searching for glutamine synthetase nucleotide sequences in the KEYnet database (http://bio-www.ba.cnr.it:8000/Tutorials/KEYnet/example1 ), we obtain 257 entries from release 58 of the EMBL data library. Searching for the same protein starting from the ENZYME database through the SRS (10) retrieval system (http://bio-www.area.ba.cnr.it:8000/Tutorials/KEYnet/example2 ) gives 148 entries from the same EMBL data library release. The retrieved data have been carefully checked, and both numbers refer only to entries for nucleotide sequences that code for glutamine synthetase.
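One plausible reading of the difference between the two counts is that a search expanded over linked synonyms reaches entries annotated under variant names. The following minimal sketch illustrates that effect on made-up data; the records, accession labels and synonym set are hypothetical and do not reproduce the KEYnet or SRS query logic.

```python
# Toy records: (hypothetical accession label, name as annotated in the entry).
RECORDS = [
    ("ACC0001", "glutamine synthetase"),
    ("ACC0002", "glutamate-ammonia ligase"),
    ("ACC0003", "glnA"),
    ("ACC0004", "GLUTAMINE SYNTHETASE"),
    ("ACC0005", "glutamate synthase"),  # a different enzyme; must not be retrieved
]

# Assumed synonym group for the queried protein (illustrative only).
SYNONYMS = {"glutamine synthetase", "glutamate-ammonia ligase", "glnA"}


def search_exact(records, name):
    """Retrieve entries whose annotation matches one name exactly."""
    return [acc for acc, label in records if label == name]


def search_with_synonyms(records, synonyms):
    """Retrieve entries matching any synonym, ignoring letter case."""
    wanted = {s.lower() for s in synonyms}
    return [acc for acc, label in records if label.lower() in wanted]


exact = search_exact(RECORDS, "glutamine synthetase")
expanded = search_with_synonyms(RECORDS, SYNONYMS)
print(len(exact), len(expanded))  # prints: 1 4
```

In this toy set the synonym-aware search recovers four entries where the single-name search finds one, while the unrelated enzyme is still excluded.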

Users of KEYnet are kindly invited to cite the present article.


ACKNOWLEDGEMENTS

This work has been partially supported by the EU-Biotechnology Programme (Contracts n. BIO4-CT95-0037 and BIO4-CT97-0), by ‘Programma Biotecnologie legge 95/95 (MURST 5%)’, by MPI (Italy) and by CNR Research Area of Bari (IT).


REFERENCES
1. Tullo,A., Liuni,S. and Attimonelli,M. (1990) Protein Seq. Data Anal., 3, 327–334. [pmid: 2235975]
2. Licciulli,F., Catalano,D., D’Elia,D., Lorusso,V. and Attimonelli,M. (1999) Nucleic Acids Res., 27, 365–367. [pmid: 9847230]
3. Attimonelli,M., Altamura,N., Benne,R., Boyen,C., Brennicke,A., Carone,A., Cooper,J.M., D’Elia,D., de Montalvo,A., de Pinto,B., De Robertis,M., Golik,P., Grienenberger,J.M., Knoop,V., Lanave,C., Lazowska,J., Lemagnen,A., Malladi,B.S., Memeo,F., Monnerot,M., Pilbout,S., Schapira,A.H.V., Sloof,P., Slonimski,P., Stevens,K. and Saccone,C. (1999) Nucleic Acids Res., 27, 128–133. Updated article in this issue: Nucleic Acids Res. (2000), 28, 148–152. [pmid: 9847157]
4. Stoesser,G., Tuli,M.A., Lopez,R. and Sterk,P. (1999) Nucleic Acids Res., 27, 18–24. Updated article in this issue: Nucleic Acids Res. (2000), 28, 19–23. [pmid: 9847133]
5. Benson,D.A., Boguski,M.S., Lipman,D.J., Ostell,J., Ouellette,B.F.F., Rapp,B.A. and Wheeler,D.L. (1999) Nucleic Acids Res., 27, 12–17. Updated article in this issue: Nucleic Acids Res. (2000), 28, 15–18. [pmid: 9847132]
6. Bairoch,A. and Apweiler,R. (1999) Nucleic Acids Res., 27, 49–54. Updated article in this issue: Nucleic Acids Res. (2000), 28, 45–48. [pmid: 9847139]
7. Bairoch,A. (1999) Nucleic Acids Res., 27, 310–311. [pmid: 9847212]
8. Lonsdale,D.M. and Leaver,C.J. (1988) Plant Mol. Biol., 6, 14–21.
9. Hallick,R.B. (1989) Plant Mol. Biol., 7, 266–275.
10. Etzold,T., Ulyanov,A. and Argos,P. (1996) Methods Enzymol., 266, 114–128. [pmid: 8743681]

Article Categories:
  • Article

