Identifying Catheter-Related Events Through Sentence Classification
Peer reviewed, Journal article
Original version: International Journal of Data Mining and Bioinformatics. 2020, 23 (3), 213-233. 10.1504/IJDMB.2020.107877
Infections caused by central venous catheter (CVC) use are a serious and under-reported problem in healthcare. The CVC is almost ubiquitous in critical care because it enables fast circulatory monitoring and central administration of medication and nutrition. However, the CVC exposes the patient to a risk of blood-stream infections (BSI). Explicit documentation of normal CVC usage and exposure is sparse and indirect in the health record. For a clinician, CVC presence is simple to infer from record statements about procedures, plans and results related to the CVC. To capture evidence about CVC-related risk of infections and complications, it is important to develop computerized tools that can estimate individual patient days of CVC exposure retrospectively for large cohorts of patients. Towards that objective, we have developed methods for learning classifiers for statements about CVC-related events occurring in the textual health record. This includes developing and testing an annotation ontology of events and indicators, annotation guidelines, and a gold standard of annotated clinical records selected from a corpus of complete health records for more than 800 episodes of care, as well as collecting alternate health register evidence for validation purposes. This paper describes the available data and gold standard, feature selection approaches and our experiments with different classification algorithms. We find that even with limited data it is possible to build reasonably accurate sentence classifiers for the most important events. We also find that making use of document meta information helps improve classification quality by providing additional context to a sentence. Finally, we outline some strategies for using our results in future analysis and reasoning about CVC usage intervals and CVC exposure over individual patient trajectories.