Validation of psoriatic arthritis diagnoses in electronic medical records using natural language processing.
MedLine Citation:
PMID:  20701955     Owner:  NLM     Status:  MEDLINE    
OBJECTIVES: To test whether data extracted from full text patient visit notes from an electronic medical record would improve the classification of psoriatic arthritis (PsA) compared with an algorithm based on codified data.
METHODS: From the >1,350,000 adults in a large academic electronic medical record, all 2318 patients with a billing code for PsA were extracted, and 550 were randomly selected for chart review and algorithm training. From codified data and from phrases extracted from narrative data by natural language processing, 31 predictors were derived, and 3 random forest algorithms were trained using coded, narrative, and combined predictors. The receiver operating characteristic (ROC) curve was used to identify the optimal algorithm, and a cut-point was chosen to achieve the maximum sensitivity possible at a 90% positive predictive value (PPV). The algorithm was then used to classify the remaining 1768 charts and was finally validated in a random sample of 300 cases predicted to have PsA.
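The cut-point rule described in the methods (maximum sensitivity subject to a PPV of at least 90%) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `choose_cutpoint`, the use of plain lists of predicted scores and chart-review labels, and the convention that a score at or above the threshold counts as a predicted case are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def choose_cutpoint(
    scores: Sequence[float],
    labels: Sequence[int],
    target_ppv: float = 0.90,
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, ppv, sensitivity) with the highest sensitivity
    among thresholds whose PPV is at least target_ppv, or None if no
    threshold qualifies. Hypothetical helper, not from the paper."""
    total_pos = sum(labels)  # number of true cases according to chart review
    best = None
    for t in sorted(set(scores)):  # every observed score is a candidate cut-point
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        if tp + fp == 0 or total_pos == 0:
            continue  # PPV or sensitivity is undefined here
        ppv = tp / (tp + fp)
        sens = tp / total_pos
        # Keep the qualifying threshold with the greatest sensitivity.
        if ppv >= target_ppv and (best is None or sens > best[2]):
            best = (t, ppv, sens)
    return best
```

Raising the threshold generally trades sensitivity for PPV, so the sketch scans every observed score rather than assuming PPV increases monotonically with the threshold.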
RESULTS: The PPV of a single PsA code was 57% (95% CI 55%-58%). Using a combination of coded data and natural language processing (NLP), the random forest algorithm reached a PPV of 90% (95% CI 86%-93%) at a sensitivity of 87% (95% CI 83%-91%) in the training data. The PPV was 93% (95% CI 89%-96%) in the validation set. Adding NLP predictors to codified data increased the area under the receiver operating characteristic curve (P < 0.001).
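The abstract does not state how its confidence intervals were computed. As a generic illustration only, a 95% interval for a proportion such as a PPV can be sketched with the Wilson score method. The function name `wilson_interval` and the counts in the usage note are hypothetical and are not taken from the study.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval.
    Illustrative helper; the study's own interval method is not stated."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)
```

With hypothetical counts, `wilson_interval(90, 100)` gives an interval of roughly 0.83 to 0.94 around an observed proportion of 0.90.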
CONCLUSIONS: Using NLP with text notes from electronic medical records improved the performance of the prediction algorithm significantly. Random forests were a useful tool to accurately classify psoriatic arthritis cases to enable epidemiological research.
Thorvardur Jon Love; Tianxi Cai; Elizabeth W Karlson
Publication Detail:
Type:  Journal Article; Validation Studies     Date:  2010-08-10
Journal Detail:
Title:  Seminars in arthritis and rheumatism     Volume:  40     ISSN:  1532-866X     ISO Abbreviation:  Semin. Arthritis Rheum.     Publication Date:  2011 Apr 
Date Detail:
Created Date:  2011-03-29     Completed Date:  2011-07-19     Revised Date:  2014-09-08    
Medline Journal Info:
Nlm Unique ID:  1306053     Medline TA:  Semin Arthritis Rheum     Country:  United States    
Other Details:
Languages:  eng     Pagination:  413-20     Citation Subset:  IM    
Copyright Information:
Copyright © 2011 Elsevier Inc. All rights reserved.
MeSH Terms
Arthritis, Psoriatic / classification,  diagnosis*
Electronic Health Records*
Middle Aged
Natural Language Processing*
ROC Curve
Sensitivity and Specificity

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
