Update of KEYnet: a gene and protein names database for biosequences functional organisation.
PMID: 10592277 Owner: NLM Status: MEDLINE
KEYnet is a database where gene and protein names are hierarchically structured. Particular care has been devoted to the search and organisation of synonyms. The structuring is based on biological criteria in order to assist the user in data search and to minimise the risk of information loss. Links to the EMBL data library by the entry name and the accession number are implemented. KEYnet is available through the WWW at the following site: http://www.ba.cnr.it/keynet.html
D Catalano; F Licciulli; D D'Elia; M Attimonelli
Type: Journal Article; Research Support, Non-U.S. Gov't
Title: Nucleic acids research Volume: 28 ISSN: 0305-1048 ISO Abbreviation: Nucleic Acids Res. Publication Date: 2000 Jan
Created Date: 2000-02-25 Completed Date: 2000-02-25 Revised Date: 2009-11-18
Medline Journal Info:
Nlm Unique ID: 0411011 Medline TA: Nucleic Acids Res Country: ENGLAND
Languages: eng Pagination: 372-3 Citation Subset: IM
Area di Ricerca, CNR, 70126 Bari, Italy.
Information Storage and Retrieval
Proteins / chemistry, genetics*
Journal ID (nlm-ta): Nucleic Acids Res
Publisher: Oxford University Press, Oxford, UK
Copyright © 2000 Oxford University Press
Received Day: 4 Month: 10 Year: 1999
Revision Received Day: 13 Month: 10 Year: 1999
Accepted Day: 13 Month: 10 Year: 1999
Print publication date: Day: 1 Month: 1 Year: 2000
Volume: 28 Issue: 1
First Page: 372 Last Page: 373
Publisher Id: gkd070
PubMed Id: 10592277
Update of KEYnet: a gene and protein names database for biosequences functional organisation
Area di Ricerca, CNR, 70126 Bari, Italy and
1Department of Biochemistry and Molecular Biology, Faculty of Sciences, University of Bari, 70126 Bari, Italy
aTo whom correspondence should be addressed. Tel: +39 080 548 2130; Fax: +39 080 548 4467; Email: firstname.lastname@example.org
Gene and protein names are the most common query criteria for biological databases but, so far, the majority of them have been incorrectly annotated in the nucleic acid sequence databases, which causes inconsistencies in data retrieval. For retrieval by these criteria to be properly targeted, gene and protein names need to be correctly coded. Here we present the database KEYnet (1,2), in which gene and protein names are organised in a hierarchical structure according to the biological function of the associated sequence. Links among lexical or biological synonyms are implemented.
Each entry in the KEYnet database is related to a gene or protein name. The whole database is hierarchically structured according to the scheme previously reported (1,2) and visible at http://bio-www.ba.cnr.it:8000/Tutorials/KEYnet/network.html . In particular, the KEYnet structure is made up of a set of elements (nodes) linked by father–son relationships. At the highest level is the root, which links all the branches of the tree. The most important branches are the nodes Protein, DNA and RNA. Each leaf in the tree is composed of several elements linked by synonymy. Two side branches are also implemented: the RAT Gene Names Tree and the Mitochondrial Genome Tree [the Mitochondrion Gene names classification has been structured as a contribution to the MitBASE project (3)]. Gene and protein names are extracted from the EMBL data library (4).
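The hierarchy described above can be illustrated with a minimal sketch. It is not the KEYnet implementation: the class `Node`, the helper `build_index` and the intermediate node `Enzyme` are hypothetical names introduced only for illustration, and the synonym list for glutamine synthetase is a small hand-picked example. Only the root, the Protein, DNA and RNA branches, and the idea of leaves grouping synonymous names come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Node:
    """One element of the tree; a leaf groups names linked by synonymy."""
    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)
    synonyms: List[str] = field(default_factory=list)

    def add_child(self, name: str, synonyms=()) -> "Node":
        # Create a son node and record the father-son link in both directions.
        child = Node(name, parent=self, synonyms=list(synonyms))
        self.children.append(child)
        return child

    def path(self) -> List[str]:
        # Walk up the father links to the root, then reverse.
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


def build_index(root: Node) -> Dict[str, Node]:
    """Map every name and synonym (lower-cased) to its node."""
    index: Dict[str, Node] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        for label in [node.name, *node.synonyms]:
            index[label.lower()] = node
        stack.extend(node.children)
    return index


root = Node("KEYnet")                      # root linking all branches
protein = root.add_child("Protein")
root.add_child("DNA")
root.add_child("RNA")
enzyme = protein.add_child("Enzyme")       # hypothetical intermediate node
gs = enzyme.add_child(
    "glutamine synthetase",
    synonyms=["glutamate-ammonia ligase", "GS"],  # illustrative synonyms
)

index = build_index(root)
print(index["gs"].path())
# ['KEYnet', 'Protein', 'Enzyme', 'glutamine synthetase']
```

Because every synonym resolves to the same node, a query on any of the names reaches the same position in the tree, which is the property the synonym links are meant to provide.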
Biological information about the associated sequences is extracted from the same primary databases [EMBL data library (4) and GenBank (5)] and from specialised databases such as SWISS-PROT (6), ENZYME (7) or any other suitable database. MEDLINE is also consulted whenever the above-mentioned databases do not contain the information needed to classify a gene or protein name. The KEYnet database is updated at each EMBL data library release, at which time the links between KEYnet and the EMBL data library are established.
One of the major problems encountered during data classification concerns the gene names branch. Gene naming is recognised worldwide as a difficult problem, because of the freedom with which a name is assigned to a gene when it is discovered. Several attempts to address this problem are in progress (8,9; see http://www.ebi.ac.uk:7081/docs/nomenclature and http://www.gene.ucl.ac.uk/nomenclature ).
We have organised gene names by establishing a starting set of main ancestor keywords relevant to their primary biological functions. At present KEYnet contains 66 219 gene and protein names, as reported in detail in the table at http://bio-www.ba.cnr.it:8000/Tutorials/KEYnet/Table1.html
The KEYnet database can be queried through the RETKEY program, written in FORTRAN and C and available on the server of the CNR Research Area of Bari. A slightly different version is KEYnetWWW (http://www.ba.cnr.it/keynet.html ), which is more powerful because it can be accessed worldwide and the retrievable information is more complete.
The usage of KEYnetWWW is illustrated by the following examples. A search for glutamine synthetase nucleotide sequences in the KEYnet database (http://bio-www.ba.cnr.it:8000/Tutorials/KEYnet/example1 ) returns 257 entries from release 58 of the EMBL data library. A search for the same protein starting from the ENZYME database through the SRS (10) retrieval system (http://bio-www.area.ba.cnr.it:8000/Tutorials/KEYnet/example2 ) returns 148 entries from the same EMBL data library release. The retrieved data have been carefully checked, and both numbers refer only to entries containing nucleotide sequences that code for glutamine synthetase.
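One plausible reason for such a difference, assumed here rather than stated in the text, is that a name search expanded with synonyms matches entries that a single-name search misses. The toy sketch below illustrates only that mechanism. The entry identifiers, descriptions and the `search` helper are hypothetical, not real EMBL records or KEYnet code, and the example does not reproduce the 257 and 148 figures.

```python
from typing import Iterable, List

# Hypothetical records: (entry id, free-text description). Not real EMBL entries.
ENTRIES = [
    ("X0001", "glutamine synthetase gene, complete cds"),
    ("X0002", "glnA gene for glutamate-ammonia ligase"),
    ("X0003", "glnA mRNA, partial"),
    ("X0004", "glutamate synthase small subunit"),
]

# Illustrative synonym group for one protein name.
SYNONYMS = {
    "glutamine synthetase": [
        "glutamine synthetase",
        "glutamate-ammonia ligase",
        "glnA",
    ],
}


def search(names: Iterable[str]) -> List[str]:
    """Return ids of entries whose description contains any of the names."""
    wanted = [n.lower() for n in names]
    return [
        entry_id
        for entry_id, description in ENTRIES
        if any(n in description.lower() for n in wanted)
    ]


literal = search(["glutamine synthetase"])
expanded = search(SYNONYMS["glutamine synthetase"])

print(literal)   # ['X0001']
print(expanded)  # ['X0001', 'X0002', 'X0003']
```

The entry describing glutamate synthase, a different enzyme, is excluded in both cases, since plain substring matching on the listed names does not reach it.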
Users of KEYnet are kindly invited to cite the present article.
This work has been partially supported by the EU-Biotechnology Programme (Contracts n. BIO4-CT95-0037 and BIO4-CT97-0), by ‘Programma Biotecnologie legge 95/95 (MURST 5%)’, by MPI (Italy) and by CNR Research Area of Bari (IT).
1. Tullo,A., Liuni,S. and Attimonelli,M. (1990) Protein Seq. Data Anal., 3, 327–334. [pmid: 2235975]
2. Licciulli,F., Catalano,D., D’Elia,D., Lorusso,V. and Attimonelli,M. (1999) Nucleic Acids Res., 27, 365–367. [pmid: 9847230]
3. Attimonelli,M., Altamura,N., Benne,R., Boyen,C., Brennicke,A., Carone,A., Cooper,J.M., D’Elia,D., de Montalvo,A., de Pinto,B., De Robertis,M., Golik,P., Grienenberger,J.M., Knoop,V., Lanave,C., Lazowska,J., Lemagnen,A., Malladi,B.S., Memeo,F., Monnerot,M., Pilbout,S., Schapira,A.H.V., Sloof,P., Slonimski,P., Stevens,K. and Saccone,C. (1999) Nucleic Acids Res., 27, 128–133. Updated article in this issue: Nucleic Acids Res. (2000), 28, 148–152. [pmid: 9847157]
4. Stoesser,G., Tuli,M.A., Lopez,R. and Sterk,P. (1999) Nucleic Acids Res., 27, 18–24. Updated article in this issue: Nucleic Acids Res. (2000), 28, 19–23. [pmid: 9847133]
5. Benson,D.A., Boguski,M.S., Lipman,D.J., Ostell,J., Ouellette,B.F.F., Rapp,B.A. and Wheeler,D.L. (1999) Nucleic Acids Res., 27, 12–17. Updated article in this issue: Nucleic Acids Res. (2000), 28, 15–18. [pmid: 9847132]
6. Bairoch,A. and Apweiler,R. (1999) Nucleic Acids Res., 27, 49–54. Updated article in this issue: Nucleic Acids Res. (2000), 28, 45–48. [pmid: 9847139]
7. Bairoch,A. (1999) Nucleic Acids Res., 27, 310–311. [pmid: 9847212]
8. Lonsdale,D.M. and Leaver,C.J. (1988) Plant Mol. Biol., 6, 14–21.
9. Hallick,R.B. (1989) Plant Mol. Biol., 7, 266–275.
10. Etzold,T., Ulyanov,A. and Argos,P. (1996) Methods Enzymol., 266, 114–128. [pmid: 8743681]