Focus On: Physician Burnout
August 10, 2018

Natural language processing and the practice of hematology/oncology

I will be in clinic the morning after I write this.

Like most Tuesday mornings, I have a full schedule and need to stay close to being on time, given that several of my patients will go directly from the clinic to the infusion center for treatment.

We have invested significant resources into infusion center scheduling software, which has markedly improved clinic efficiency but relies up to a point on my ability to see patients in a timely fashion and get them to infusion on schedule. The software ultimately relies on the human element to be useful.

John Sweetenham, MD, FRCP, FACP

I have a couple of additional minor challenges for tomorrow’s clinic. Our pathology IT system is being upgraded, so results will be entered manually. This adds about 15 minutes of lag time to entry of data into the electronic health record. Also, my clinic is larger than usual because of some staff absences.

My chances of staying on top of documentation during clinic are small so, like many of you, I anticipate spending a large part of my afternoon and possibly longer catching up on dictation. Yes, I know — it’s a generational thing, but I still dictate my notes.

A narrative component

I’m not complaining, nor am I looking for sympathy. I know most of you have a much higher burden of clinic work and documentation than I do.

I also realize I may be adding to that burden by choosing to dictate my notes rather than type them. When I confess — note the choice of word — to dictating notes in 2018, I do so with a mixture of embarrassment and guilt.

I have two main reasons for staying with dictation over typing my notes directly into the EHR.

First, after using a keyboard for at least 40 years, I am still essentially a “two-finger” typist. I’m simply too slow and spend endless time correcting my own errors.

Second, I still believe that my clinic notes should have a narrative component. My note is there to tell the patient’s story and to provide more context for me, for the colleagues I consult with and for the patient. Dictation doesn’t constrain me to template language, dot phrases and other electronic shortcuts in the EHR. Many musicians believe that music recorded in a digital format loses a certain indefinable quality compared with live sound or music recorded on vinyl; I feel the same way about my clinic note.


My sense of guilt comes from the fact that I know my dictated notes add to the growing amount of unstructured data being stored in our electronic systems. As someone who believes wholeheartedly in the power of big data, analytics and artificial intelligence in oncology — as well as the increasing importance of real-world data for decision support, demonstrating value, tracking outcomes and multiple other uses — I realize I am contributing to the huge amount of data trapped in an unstructured format.

If extracting structured data from narrative text is like looking for a needle in a haystack, I am helping to build the haystack. I should seize every opportunity to enter patient-related data in an accessible, structured way but, at the moment, it just doesn’t work well for me in the clinic.

NLP software

Like many others, I have high hopes for new software platforms that can extract structured data elements from unstructured text.

Many institutions and vendors are investing major resources into natural language processing (NLP) platforms, as well as other techniques, to recover these data. These efforts seem to be paying off, because data from many sources — such as pathology and radiology reports — as well as elements of clinic notes are increasingly accessible using these methods. I can feel slightly less guilty about my resistance to the constraints of the EHR.

That said, NLP software is imperfect and anything that makes entry of structured data easier for us would be a major step forward.

In the meantime, knowing that clinic documentation remains a major dissatisfier for oncologists and that the burden of documentation is a possible contributor to oncologist burnout, anything that can help make this easier deserves attention.

In this regard, voice recognition systems have been around for many years and have been in use in many health care systems. I have used one of these systems at a former institution and found it to be a major advantage in terms of instantaneous readout of my dictation, integration into the EHR and quick turnaround, without the need to go back to a transcribed note hours or days later to review and correct.

This software — which can only have gotten better in recent years — may not overcome the problem of trapped data, but it has the potential to reduce the burden of documentation and, maybe at some future point, interact with NLP software to facilitate access to unstructured data elements.

‘Only as good as the input’


With that in mind, I was intrigued by a paper in JAMA Network Open that investigated the accuracy of speech recognition software.

I would characterize the conclusions of the study as showing that, without appropriate quality assurance, notes generated by speech recognition lack precision.

Researchers looked at 217 notes dictated by 144 unique physicians from different specialties. Medical transcriptionists initially reviewed and corrected dictated notes, which were then returned to the physicians to review and sign.

When the initial speech recognition note was compared with the signed note, about 7% of the notes contained errors that were considered clinically important. This number declined to only 0.4% after review by medical transcriptionists. Common errors were deletions and insertions of words.

It’s no surprise that speech recognition software isn’t perfect. The concern is, of course, that most of us who use or who have used speech recognition do not have the benefit of transcriptionist review. For us, the advantage is the instant gratification of getting this done and, apparently, at least in some contexts, we are not as careful as we should be in our proofreading. Interestingly, the study showed differences according to specialty in this regard.

Given the importance of analytics in current hematology/oncology practice, we all need to recognize the importance of supporting efforts toward maximizing structured data.

At the same time, we need to recognize the tension between the constraints imposed by templated notes and the value of narrative in clinical care. I am confident this will be overcome by advances in technology.

This study demonstrates a vulnerability in this process — the accuracy of unstructured or templated data will only ever be as good as the input — and it’s essential that we maintain that perspective.


Zhou L, et al. JAMA Network Open. 2018;doi:10.1001/jamanetworkopen.2018.0530.

For more information:

John Sweetenham, MD, FRCP, FACP, is HemOnc Today’s Chief Medical Editor for Hematology. He also is senior director of clinical affairs and executive medical director of Huntsman Cancer Institute at The University of Utah. He can be reached at

Disclosure: Sweetenham reports no relevant financial disclosures.