Classification techniques with minimal labelling effort and application to medical reports.
MedLine Citation:
PMID:  19024498     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
There are a number of approaches to classifying text documents. Here, we use Partially Supervised Classification (PSC) and argue that it is an effective and efficient approach for real-world problems. PSC uses a two-step strategy to reduce the labelling effort, and a number of methods have been proposed for each step. We evaluate various methods using real-world medical documents. The results show that using EM to build the classifier yields better results than using SVM. We also show experimentally that careful selection of a subset of features to represent the documents can improve performance.
Authors:
Fathi H Saad; G Duncan Bell; Beatriz de la Iglesia
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't    
Journal Detail:
Title:  International journal of data mining and bioinformatics     Volume:  2     ISSN:  1748-5673     ISO Abbreviation:  -     Publication Date:  2008  
Date Detail:
Created Date:  2008-11-21     Completed Date:  2008-12-30     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  101279469     Medline TA:  Int J Data Min Bioinform     Country:  Switzerland    
Other Details:
Languages:  eng     Pagination:  268-87     Citation Subset:  IM    
Affiliation:
School of Computing Sciences, University of East Anglia, Norwich NR4 7TJ, UK. fathi.saad@uea.ac.uk
MeSH Terms
Descriptor/Qualifier:
Artificial Intelligence*
Database Management Systems*
Documentation / methods*
Great Britain
Information Storage and Retrieval / methods*
Medical Records Systems, Computerized*
Natural Language Processing*
Pattern Recognition, Automated / methods*

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

