Finding falls in ambulatory care clinical documents using statistical text mining.
MedLine Citation:
PMID:  23242765     Owner:  NLM     Status:  MEDLINE    
OBJECTIVE: To determine how well statistical text mining (STM) models can identify falls within clinical text associated with an ambulatory encounter.
MATERIALS AND METHODS: 2241 patients were selected with a fall-related ICD-9-CM E-code or matched injury diagnosis code while being treated as an outpatient at one of four sites within the Veterans Health Administration. All clinical documents within a 48-h window of the recorded E-code or injury diagnosis code for each patient were obtained (n=26 010; 611 distinct document titles) and annotated for falls. Logistic regression, support vector machine, and cost-sensitive support vector machine (SVM-cost) models were trained on a stratified sample of 70% of documents from one location (dataset Atrain) and then applied to the remaining unseen documents (datasets Atest-D).
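The stratified 70% training sample described above can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' code: the function name `stratified_split`, the seed, and the toy labels are all hypothetical, and the abstract does not say how the authors drew their sample.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), keeping each class's share roughly equal in both parts."""
    rng = random.Random(seed)

    # Group document indices by their annotation (e.g. fall vs. no fall).
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random draw within each class
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy annotations: 1 = document describes a fall, 0 = it does not.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
```

Splitting within each class separately keeps the proportion of fall documents the same in the training and held-out parts, which matters when the positive class is a minority.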
RESULTS: All three STM models obtained area under the receiver operating characteristic curve (AUC) scores above 0.950 on the four test datasets (Atest-D). The SVM-cost model obtained the highest AUC scores, ranging from 0.953 to 0.978. The SVM-cost model also achieved F-measure values ranging from 0.745 to 0.853, sensitivity from 0.890 to 0.931, and specificity from 0.877 to 0.944.
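For readers unfamiliar with the reported measures, the sketch below shows how sensitivity, specificity, F-measure, and AUC are commonly computed. It is an illustration only: the labels, scores, and 0.5 threshold are made-up toy values, the function names are hypothetical, and the F-measure is assumed to be the balanced F1 form, which the abstract does not state.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and F1 from binary labels (1 = fall, 0 = no fall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    sensitivity = tp / (tp + fn)  # share of true falls that were found
    specificity = tn / (tn + fp)  # share of non-falls correctly rejected
    precision = tp / (tp + fp)
    # Assumption: balanced F-measure (F1), the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f_measure


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sensitivity, specificity, f_measure = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

AUC depends only on how the scores rank the documents, while sensitivity, specificity, and F-measure depend on the chosen threshold, which is why a model can post a high AUC alongside a more modest F-measure.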
DISCUSSION: The STM models performed well across a large, heterogeneous collection of document titles. The models also generalized to the other sites, including a traditionally bilingual site whose documents had distinctly different grammatical patterns.
CONCLUSIONS: The results of this study suggest STM-based models have the potential to improve surveillance of falls. Furthermore, the encouraging evidence shown here that STM is a robust technique for mining clinical documents bodes well for other surveillance-related topics.
James A McCart; Donald J Berndt; Jay Jarman; Dezon K Finch; Stephen L Luther
Publication Detail:
Type:  Journal Article; Research Support, U.S. Gov't, Non-P.H.S.     Date:  2012-12-15
Journal Detail:
Title:  Journal of the American Medical Informatics Association : JAMIA     Volume:  20     ISSN:  1527-974X     ISO Abbreviation:  J Am Med Inform Assoc     Publication Date:    2013 Sep-Oct
Date Detail:
Created Date:  2013-08-12     Completed Date:  2013-12-17     Revised Date:  2014-09-02    
Medline Journal Info:
Nlm Unique ID:  9430800     Medline TA:  J Am Med Inform Assoc     Country:  United States    
Other Details:
Languages:  eng     Pagination:  906-14     Citation Subset:  IM    
MeSH Terms
Accidental Falls / statistics & numerical data*
Ambulatory Care
Ambulatory Care Information Systems*
Area Under Curve
Data Mining*
Electronic Health Records*
Logistic Models
Models, Statistical*
Support Vector Machines

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
