Natural language processing best for EMR data

A new study examined ways to interpret data in electronic medical records (EMRs) to best address patient safety concerns.

The federal Agency for Healthcare Research and Quality previously developed a set of 20 measures, known as patient safety indicators, that use administrative data to screen for potentially adverse events that occur during hospitalization, according to background information in the article.

“Currently most automated methods to identify patient safety occurrences rely on administrative data codes,” the authors wrote in the Aug. 24/31 issue of JAMA. “However, free-text searches of electronic medical records could represent an additional surveillance approach.

“The development of automated approaches, such as natural language processing, that extract specific medical concepts from textual medical documents that do not rely on discharge codes offers a powerful alternative to either unreliable administrative data or labor-intensive, expensive manual chart reviews.”

Harvey J. Murff, MD, MPH, of the Veterans Affairs Medical Center and Vanderbilt University in Nashville, Tenn., and colleagues conducted a study to evaluate a natural language processing-based approach to identifying postoperative complications within a multi-hospital healthcare network in which all hospitals used the same EMR. The study included 2,974 patients undergoing inpatient surgical procedures at six Veterans Health Administration medical centers from 1999 to 2006.

Among the outcomes measured were postoperative occurrences of acute renal failure requiring dialysis, deep vein thrombosis, pulmonary embolism, sepsis, pneumonia or myocardial infarction identified through medical record review as part of the VA Surgical Quality Improvement Program. The researchers determined the sensitivity and specificity of the natural language processing approach to identify these complications and compared its performance with patient safety indicators that use discharge coding information.
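For readers unfamiliar with the two metrics, the following is a minimal sketch of how sensitivity and specificity can be computed when an automated method's flags are compared against a reference standard such as manual record review. The function name and the ten-patient toy data are illustrative assumptions, not the study's code or data.

```python
def sensitivity_specificity(flagged, reference):
    """Compare per-patient flags from a screening method against a reference standard.

    flagged, reference: equal-length sequences of booleans, one entry per patient.
    Returns (sensitivity, specificity).
    """
    if len(flagged) != len(reference):
        raise ValueError("flagged and reference must have the same length")

    pairs = list(zip(flagged, reference))
    tp = sum(1 for f, r in pairs if f and r)          # complication present and flagged
    fn = sum(1 for f, r in pairs if not f and r)      # complication present but missed
    tn = sum(1 for f, r in pairs if not f and not r)  # no complication, not flagged
    fp = sum(1 for f, r in pairs if f and not r)      # no complication, but flagged

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: 4 patients with the complication, 6 without.
reference = [True, True, True, True, False, False, False, False, False, False]
flagged = [True, True, True, False, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(flagged, reference)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example the method catches three of the four true complications and wrongly flags one of the six patients without a complication.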

The researchers found that, in general, the natural language processing-based approach had higher sensitivities and lower specificities than the patient safety indicators.

“The increase in sensitivity of the natural language processing-based approach compared with the patient safety indicator was more than 2-fold for acute renal failure and sepsis and over 12-fold for pneumonia,” the authors wrote. “Specificities were 4% to 7% higher with the patient safety indicator method than the natural language processing approach.

“Natural language processing correctly identified 82% of acute renal failure cases compared with 38% for patient safety indicators. Similar results were obtained for venous thromboembolism (59% vs. 46%), pneumonia (64% vs. 5%), sepsis (89% vs. 34%) and postoperative myocardial infarction (91% vs. 89%). Both natural language processing and patient safety indicators were highly specific for these diagnoses.”
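The fold increases the authors cite can be checked against the percentages they report. The short sketch below divides each natural language processing sensitivity by the corresponding patient safety indicator sensitivity; because the published percentages are rounded, the ratios are only approximate. The variable names are illustrative.

```python
# Reported sensitivities (percent): (natural language processing, patient safety indicator)
reported = {
    "acute renal failure": (82, 38),
    "venous thromboembolism": (59, 46),
    "pneumonia": (64, 5),
    "sepsis": (89, 34),
    "myocardial infarction": (91, 89),
}

# Fold increase = NLP sensitivity divided by patient safety indicator sensitivity.
fold = {name: nlp / psi for name, (nlp, psi) in reported.items()}

for name, ratio in fold.items():
    print(f"{name}: {ratio:.1f}-fold")
```

The ratios come out at roughly 2.2 for acute renal failure, 2.6 for sepsis and 12.8 for pneumonia, in line with the "more than 2-fold" and "over 12-fold" figures quoted above.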

The authors suggested that a natural language processing-based approach offers several potential advantages over administrative code-based strategies to identify healthcare quality concerns:

“First is the flexibility of the approach to meet the individual institutional needs. Once documents have been processed, different approaches and query strategies to identify a specific outcome can be implemented at a relatively low programming effort using standard database query applications.

“Second, as opposed to administrative codes, search strategies using daily progress notes, microbiology reports or imaging reports could be monitored on a prospective basis. Thus, this approach could potentially identify complications while a patient is still in the hospital, which could greatly facilitate real-time quality assurance processes.

“Finally, in systems with highly integrated EMRs, prospective surveillance could be extended to the outpatient setting for individuals remaining with the healthcare system.”
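As a rough illustration of the first point, the sketch below shows what a query over already-processed documents might look like in a standard database. Everything here is hypothetical: the `concepts` table, its columns, the sample rows and the `patients_with` helper are assumptions made for illustration and do not reflect the schema or software used in the study.

```python
import sqlite3

# Hypothetical store of medical concepts already extracted from clinical text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concepts ("
    "patient_id INTEGER, note_type TEXT, concept TEXT, negated INTEGER)"
)
rows = [
    (1, "progress note", "pneumonia", 0),
    (1, "imaging report", "infiltrate", 0),
    (2, "progress note", "pneumonia", 1),  # e.g. "no evidence of pneumonia"
    (3, "imaging report", "pulmonary embolism", 0),
]
conn.executemany("INSERT INTO concepts VALUES (?, ?, ?, ?)", rows)


def patients_with(concept):
    """Return IDs of patients with at least one non-negated mention of a concept."""
    cur = conn.execute(
        "SELECT DISTINCT patient_id FROM concepts "
        "WHERE concept = ? AND negated = 0 ORDER BY patient_id",
        (concept,),
    )
    return [row[0] for row in cur]


pneumonia_patients = patients_with("pneumonia")
print(pneumonia_patients)
```

Changing the query strategy, for example by restricting it to imaging reports or combining several concepts, would only require editing the SQL rather than reprocessing the documents.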
