Electronic Health Records Failed to Make Clinicians’ Lives Easier—Will AI Technology Succeed?

This conversation is part of a series of interviews in which JAMA Editor in Chief Kirsten Bibbins-Domingo, PhD, MD, MAS, and expert guests explore issues surrounding the rapidly evolving intersection of artificial intelligence (AI) and medicine.
Clinicians are cautiously optimistic when AI experts say their biggest hope is that the technology will help to reduce workloads and ease burnout. For now, however, the experts are grappling with how to turn those aspirations into reality. Among them is Kevin B. Johnson, MD, MS, the David L. Cohen and Penn Integrates Knowledge University Professor of Pediatrics, Informatics, Engineering, and Communication at the University of Pennsylvania.
In a recent interview with JAMA Editor in Chief Kirsten Bibbins-Domingo, PhD, MD, MAS, Johnson talked about the issues facing scientists, physicians, and others working to ensure that AI doesn’t impose added burdens on medical professionals (Video). The following is an edited version of the interview.
Dr Bibbins-Domingo: You’re a board-certified pediatrician and you’re also recognized for your work in informatics and clinical information technology. As somebody who’s spent a long time thinking about medicine and informatics, tell us why this time that we’re living in is so different.
Dr Johnson: Throughout my life, there have been 1 or 2 previous moments where it was clear something had changed that was going to impact health care for the better or for the worse. Clearly, the first of those as I was getting into the field was the internet. It was really a sea change. Mobile technology was another one. I was very excited about what was happening—even before the iPhone—with the PalmPilot and did some early research looking at ways we could use PalmPilots. I’ve always been focused on getting clinicians to do direct data capture so that we could use computational means to improve guideline-based care and clinical decision support.

AI’s been around a really long time—since the ’60s, possibly longer, depending on how you define it. But we’ve had a number of what we call AI winters. Most of us have said AI’s kind of smoldering along in the image analysis space. But this year with the unveiling of what’s now being called generative AI, large language models, ChatGPT, everyone is seeing that the genie’s lamp has been opened and we’re having our wishes. No question, we are now in a place where AI is going to be a part of health care in a very visible, meaningful, and pervasive way. We’re all very excited.
Dr Bibbins-Domingo: I’ve heard you use this term “AI and the back office.” What do you mean by that term?
Dr Johnson: What I think people have failed to think about is the entire ecosystem of health care. Looking at the literature now, we’re just starting to see people getting beyond some of the more commonly talked-about areas to questions like “How can we take clinical decision support and critique it before we actually release it?” So, making sure that the messages are clear, making sure that they’re succinct, making sure that the references we’ve chosen match the concern that we might want to have in terms of education is now something that we can do.

Many people have been thinking about the rest of the back office from scheduling to prior authorization. One of my favorite topics, which you could say is sort of front office but maybe not, is helping patients understand the content of messages, especially if those patients have limited English proficiency or might otherwise have challenges with the way the message has been written. Perhaps it was written by a doctor who loves acronyms, for example. I think that as the literature in this area evolves over the next few years, we’re going to find that many of the back-office functions get tested before the front-office functions because there’s less at stake. There’s a lot more that we can learn and critique without it impacting a very stressed workforce.
Dr Bibbins-Domingo: You’re describing a set of activities that clinicians handle in the background, activities that have the potential to eat away at their time and can be overwhelmingly burdensome for the clinician workforce.
Dr Johnson: That’s right. This is one of the areas that’s been a real concern of mine. I published a paper in JAMA with Bill Stead about going from safer to smarter approaches in thinking about electronic health records. What we meant was that there are already guides in place to help us understand cognitive burden. But right now, there’s really no metric for cognitive support. There’s no way to say, “This particular tool provides more benefit to a particular community than another tool.” We can say, as I’ve said before, and forgive my vernacular here, that this tool sucks less than the other tools. But what we really want to say is that this tool actually does what patients and providers need it to do without fatiguing them or causing staffing issues; in other words, without burdening the health care system.

The way in which JAMA is going to evolve its AI in medicine portfolio speaks to the evidence base that we need. That evidence base is likely to focus on some interesting ways in which we could reduce cognitive burden as well as improve cognitive support. So, I do have some excitement about that, but I also have some skepticism.
Dr Bibbins-Domingo: Certainly the electronic health record was an example of technology that was supposed to make our lives better. You write eloquently with Bill Stead that it has failed at that and you call for new guides to help make electronic health records safer and smarter. But convince me that AI is not just another new technology integrated into the health records that makes my life worse instead of delivering on the promise to smooth things out.
Dr Johnson: That’s hard to do, but I’ll give it a shot. I’ve always been an evangelist for understanding the ways in which we can do direct data capture. So arguably, as a part of the work that I’ve done with the National Academy of Medicine [formerly the Institute of Medicine] as well as with the American Board of Pediatrics and other groups, I’ve been one of the people who helped construct the electronic health record that’s used today. So for me to say honestly that I can predict nothing but good would of course be false. We know that there’s a series of 2-tailed hypotheses here. We know, for example, that clinicians replying to patient portal messages is a double-edged sword.

So I can’t convince you, but what I can do is to say that properly done studies will help us, I hope, to create some guardrails. And of course, remembering our history with electronic health records should hopefully help us to get out of some of the potential areas of concern.
Dr Bibbins-Domingo: You said properly designed studies. What kinds of studies, either in real-world settings or in the policy and regulatory domain, would you like to see or what types of work would you like to see published since we put out this call for papers?
Dr Johnson: Clinical trials are still the most effective way to demonstrate both intended and unintended side effects, especially if we make sure that we’re focusing on clinical effectiveness. So I would love to see us think about clinical trials for as many of these topics as possible. Some of the data that we can get from secondary use will give us a good sense of what might be possible. In other words, it’s great for efficacy evaluation, but I think that at the end of the day we need to do clinical effectiveness studies and we need to focus on every aspect of the system. And that includes early usability studies, formative assessments, qualitative assessments, using mixed methods, and inviting many stakeholders (patients, nursing staff, providers) into the study design so that we get a holistic perspective. I know you and I think a lot about equity and making sure that we are equity-first in this conversation.

For example, many people now know about ambient scribing—the idea that there may be technologies coming down the pike that can take a conversation that I’m having with a patient and summarize it in an arbitrarily long or short set of messages or into a document—but those are going to be very expensive. So an efficacy study will likely demonstrate 1 set of outcomes. To do an effectiveness study means we need to get into other communities and really think about some of the barriers that might be related to cost, education, time, staff turnover, patient understanding of this technology, and trustworthiness of health care systems and health care employees. And those are the kinds of studies I’d like to see done. I’d like to see studies that are fairly holistic. I’m very ambitious about where I’d like to see this field go, but as you said, if we’re skeptics, and we should be, we need to understand some of those things going in.
Dr Bibbins-Domingo: It’s been really exciting to talk to people as a part of this series about the potential for advancing our goals of achieving health equity because of the ability to enhance access and to potentially scale to other populations. But you’re reminding us of this important fact: unless we test new technologies in the settings where we actually hope to apply them, we hardly ever achieve our goals in all of those settings.
Dr Johnson: Absolutely. Early in my career, I was one of the people looking at the use of text messages for behavior change, and I can still remember, after a very successful project that was funded by the Robert Wood Johnson Foundation and the Agency for Healthcare Research and Quality, presenting the work at Meharry Medical College. I was very excited. I brought the technology to the room, and then at the end, after we discussed some of the next steps that needed to be done, one student raised his hand and said, “Why did you build this on the iOS platform when most of our patients use Androids?” And that was one of those wake-up calls where I had no answer. I’d really not thought about it. I gave the typical sort of academic response, which is, “Well, I was only funded this amount of money to do this particular project, but you’re absolutely right.”

The reality was, it was a wake-up call that these types of initiatives need to be considered through an equity lens from the very beginning, especially while we have the funding to study them. Because typically, you get 1 bite at that apple, and from that point on, it’s going to be technology transfer or other methods to study it.
Dr Bibbins-Domingo: You’re clearly enthusiastic about the possibility for AI to transform even these terrible electronic health records and a lot of what we’re doing in the practice of medicine. But I also sense that there are things we should be cautious about. I know that you have played an important role in the National Academy of Medicine convening a multidisciplinary group of stakeholders to think about a new code of conduct for AI. Tell me a little bit about what you’re worried about.
Dr Johnson: Leading up to this code of conduct work, a number of us who are leaders in informatics have had a weekly Friday session—an unmeeting, if you will—where we think about various topics. When the news broke about ChatGPT, we all went through the Gartner Hype Cycle [the 5 stages of a technology’s life cycle]. We all had amazing thoughts about what could happen, as we’ve discussed right now, but it took very little time for us to get to the trough of disillusionment. Probably the first example we talked about was Ziad Obermeyer’s work from the University of California, Berkeley. He talked about the issue of patients who are Black being given lower scores by an AI algorithm, and therefore being less eligible for care coordination, because the data that were used to generate those scores were biased. Had these patients been manually reviewed, which is what he did, they would have received more care coordination than they otherwise did.

So the issue of algorithmic fairness became very important to us very early. We recognize that there are biases in us, and that’s been shown many times. There are biases in the data that reflect the society in which those data are generated. There are also going to be biases in access to this information, and there are going to be disparities in the kinds of questions that might be addressed, again, because of funding. As you and I know, one of the challenges is that topics most relevant to Brown and Black people are much more difficult to get funded. A great project that was orchestrated by the National Institutes of Health reviewed that. So we can expect that some of the topics that should be most relevant to researchers will be left behind. Therefore, one of the things that we all thought about from the very beginning was how do we continue to move this forward without leaving groups behind, without introducing systemic biases into an entire global setting.

The other thing we worried about is health policy, because whenever we talk about things that scare people, the next question is, “How should we regulate this?” And of course, I’m of 2 minds about that. If we were to regulate these industries—the FDA [US Food and Drug Administration] has come out with reasonable rules about clinical decision support, for example, and ONC [Office of the National Coordinator for Health Information Technology] is working on new proposed rulemaking—we risk ossifying innovation. There’s a point at which it makes a lot of sense to watch and be careful about how we set up those policy guardrails. We don’t know that we’re there yet because there’s so much to learn. The technology itself has so much potential, and importantly, there are a lot of bad actors in the world, which is something that Peter Lee from Microsoft and Sam Altman [chief executive officer of OpenAI] and others have said isn’t going to stop.

So it’s almost more important at this point that we understand what’s possible. From the conversation that we had about our concerns in medicine, as well as the opportunities, was born this idea of “How should we create these guardrails?” Although we called them a code of conduct, they really are all about how we align and learn iteratively about what should be allowed and what the technology is capable of. So what the National Academy of Medicine has done is assemble a group of talented people from Google, Microsoft, various companies, and other groups to think critically about this. Michael McGinnis from the National Academy of Medicine has been committed to this being a learning environment where we don’t write a report and put it on the shelf. The technology’s moving fast enough that we generate material, we distribute it, but then we also respond to how things are changing over time.
Dr Bibbins-Domingo: I’ve heard a number of people say that it really will challenge the regulatory systems. The very nature of generative AI is that it’s not necessarily producing the same product each time, and we don’t quite have the structures set up to know how to regulate that. It’s interesting to think also of the speed with which these technologies are developing over time and what regulation means in that regard. When the National Academies put together a code of conduct, who should that code of conduct apply to—are you speaking to the computer scientist? Are you speaking to the health systems adopting a new technology?
Dr Johnson: We’re still in the formative stages, so anything I say about this is going to be premature. But what we know for sure is that we are going to leave no one out of the health care ecosystem. We have representation from patients, innovators, physicians. I suspect that we’re going to want to make sure it applies very well to the development process because of the work that Obermeyer and others have done to show these biases. So I think we’re going to start from the very beginning of the pipeline and look at what problems are being chosen and go through the process of what data are valid to help solve that problem. One of the topics that many people here might not have heard about is this idea of what’s called XAI, or explainable AI.

So the question of what models we choose is directly related to how much we believe that this decision should be explainable. I’m certain we’re going to spend some time talking about the role of explainable AI—when is it necessary, when is it not—all the way to how comfortable we should feel putting this kind of technology in the hands of patients without providers. And of course, how do we integrate it into clinical decision support? How do we potentially begin to think about augmenting a lot of our care with AI? Something challenging that I’m thinking about quite a bit has to do with the work we’ve done not in generative AI, but in predictive AI. We will soon be able to rereview a chest CT [computed tomographic scan] and possibly take a CT that was reported as normal and generate new findings.

It’s very similar to what’s been happening in genomic medicine. So now we have another ethical question, and this will certainly be something we bring up in the code of conduct: What should we recommend? If every CT gets reevaluated and we identify, say, a nodule that is more concerning to the AI than it was to the original reader, should we recontact the patient and the original provider? It’s a completely unanswered question that sort of gave me hives, because this topic of recontact when there’s a variant of unknown significance is still largely unappreciated.

We know that we should, but we don’t know what it means. Patients’ phone numbers change, patients’ addresses change, data change. So we might recontact the patient today and find out that new data now suggest that it’s not significant. In this rapidly moving, technologically advancing society, I think the code of conduct team is going to have an opportunity to think about a lot of quite vexing problems.
Published Online: October 4, 2023. doi:10.1001/jama.2023.19138
Conflict of Interest Disclosures: None reported.
Hswen Y, Voelker R. Electronic Health Records Failed to Make Clinicians’ Lives Easier—Will AI Technology Succeed? JAMA. Published online October 04, 2023. doi:10.1001/jama.2023.19138
© 2023 American Medical Association. All Rights Reserved.