Opportunities and Pitfalls of ChatGPT in Health Care
How will chatbots like ChatGPT or Bing alter the health care space, and what are the potential benefits and concerns associated with clinicians and patients using artificial intelligence language models?
Penn’s Center for Health Incentives and Behavioral Economics (CHIBE) spoke with several of its affiliates and partners about the opportunities and pitfalls of ChatGPT, the chatbot from OpenAI that is capable of everything from writing poetry and news articles, to translating, to analyzing language patterns for researchers. Specific to health care, ChatGPT said it could be used in the following ways:
- Patient education and support, by providing patients with information on health topics, such as symptoms, treatments, and medications.
- Telemedicine, by helping collect patients’ medical histories and current symptoms and triaging cases.
- Clinical decision support, by integrating with electronic health record systems and providing real-time recommendations on treatment options, medication dosages, and other clinical decisions.
- Research, by analyzing and synthesizing large volumes of clinical data, identifying patterns and trends, and generating hypotheses.
- Quality improvement, by analyzing patterns of care and patient outcomes, identifying areas where improvement is needed, and providing feedback and support to providers.
CHIBE spoke with affiliates Ravi Parikh, MD, MPP, FACP; Kit Delgado, MD, MS; Justin Bekelman, MD; and Anna Morgan, MD, about their thoughts on ChatGPT in health care.
On the potential benefits of ChatGPT in health care:
Dr. Parikh said ChatGPT could make the writing process more efficient by automating certain aspects of preparing scientific materials (grants, abstracts, manuscripts, etc.) so that researchers can focus more on the science and less on how the information is framed or presented. He also sees ChatGPT as an exciting tool for cutting through some bureaucracy. One issue he encounters in his practice as an oncologist is spending a lot of time on the phone or drafting letters to insurance companies to facilitate prior authorization. ChatGPT could be used to draft an appeal of an insurer’s decision not to cover a treatment, for example.
On the patient side, Dr. Parikh said he is excited about the possibility of ChatGPT being used to answer some patient questions.
“I’m excited about the possibility of ChatGPT being used to create more aggregated, hopefully accurate information about disease, public health trends, the COVID pandemic, etc, in a way that cuts through the fluff where people normally get their information – from social media, or patient-facing websites, or whatever it is. So, ChatGPT can serve as an information aggregator and present results in an unbiased fashion,” Dr. Parikh said.
It could also be used to hone discussions or suggest questions patients might ask their doctor. For example, a patient with subtle shortness of breath could ask ChatGPT about the most common causes of shortness of breath in the general population; the answers may be mostly benign, but they may also prompt the patient to ask their physician about things like cancer screening.
“Those kinds of things could be really helpful because we’re not perfect in the doctor’s office,” Dr. Parikh said. “A lot of times, we only have the time to address a couple of things, unless a patient brings something up.”
Dr. Delgado, who is director of the Penn Medicine Nudge Unit, similarly noted how ChatGPT and similar models can help people wade through the large amount of information online.
“There is so much knowledge available to guide both clinicians and patients,” Dr. Delgado said. “ChatGPT and other large language models go beyond search engines to synthesize available knowledge quickly into summaries and formats appropriate for various types of audiences. There are also enormous challenges with clinician burnout, workload, and burdens that could be eased by AI tools like ChatGPT.”
Dr. Delgado noted several potential applications that have been reported on already:
- Provide clinical guidance, triage, and care navigation for patients
- Reduce burdens on clinicians by summarizing clinical notes and data for discharge summaries, prior authorization requests, or patient outcome monitoring (a brief illustrative sketch follows this list)
- Summarize evidence from medical literature to guide clinical practice
- Serve as a research assistant for academicians, for example by pulling together key references and summarizing text
- Generate clinical schedules
- Find and correct bugs in code used to extract or process data or to perform statistical analyses
- Automate rules-based processes that involve unstructured data and natural language
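One of the applications above, summarizing a clinical note into a draft discharge summary, can be sketched in a few lines of code. The snippet below is a minimal, hypothetical illustration assuming the openai Python client; the model name and the note are placeholders, and any real deployment would require clinician review, privacy safeguards, and integration into clinical workflows.

```python
# Hypothetical sketch: drafting a discharge summary from a clinical note
# with a large language model. Assumes the `openai` package is installed
# and OPENAI_API_KEY is set; the model name and note text are placeholders.
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

clinical_note = """
72-year-old admitted with community-acquired pneumonia. Treated with IV
ceftriaxone and azithromycin, transitioned to oral antibiotics on day 3.
Afebrile > 48 hours, oxygen saturation 96% on room air at discharge.
Follow up with primary care in 1 week.
"""

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[
        {
            "role": "system",
            "content": (
                "You are a clinical documentation assistant. Summarize the "
                "note into a brief draft discharge summary for clinician review."
            ),
        },
        {"role": "user", "content": clinical_note},
    ],
)

# The output is only a draft for a clinician to review and edit.
print(response.choices[0].message.content)
```

Even in this toy form, the output is only a draft; as Dr. Delgado notes below, such tools need continuous oversight and careful implementation within health care workflows.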
On the pitfalls of ChatGPT in health care:
Dr. Parikh, who runs the Human Algorithm Collaboration Laboratory (HACLab), warned against patients using ChatGPT as a substitute for medical care.
“As a patient information tool, I don’t think it’s there yet,” Dr. Parikh said. “And I know because I’ve seen articles that demonstrate what it says for patients with concerns, and it oftentimes is inaccurate or it’s not giving the right type of information. So, I am just not confident in that. I would hope that the decision to see a doctor or not see a doctor is made independently of what ChatGPT says.”
Dr. Parikh also highlighted an issue he found when doing some preliminary searches using both the paid and free versions of ChatGPT.
“There was a striking difference in the quality of patient-facing information, depending on whether you use the free version or the paid version,” he said. “I think that’s somewhat of an ethical concern, just given that, ideally, you’d want some sort of large language model to be getting the same amount of information to everybody.”
“Tools like this are best for complementing or aiding human effort as they need continuous oversight,” Dr. Delgado added. “There will also be enormous regulatory and governance challenges to ensure safe application and avoid known biases in AI that can be perpetuated. Also, these tools can’t live alone in isolation. They need to be carefully implemented within health care workflows.”
On changes to patient and clinician behavior:
Dr. Delgado sees these language models as tools that clinicians and researchers will use to reduce their work burdens and to serve as assistants.
“Despite cautions, patients will naturally use these types of tools to seek medical advice because current avenues to seek timely medical advice are costly, difficult to access, and wait times are far too long. These tools have the potential to both under- and over-triage patients to the emergency department,” Dr. Delgado said.
Dr. Morgan, who has led several text-based programs at Penn (e.g., COVID Watch, a hypertension management chatbot, and care transitions text messaging programs) thinks ChatGPT has the potential to create a more natural interaction with the patient, though she expressed concern that “if we don’t use the technology with care, we may end up missing important signs of clinical need or deterioration that a human may be better able to identify.”
Final thoughts
“The public release of ChatGPT and the Bing version is a watershed moment in making AI instantly accessible to the masses and will pave the way for a new era of health care innovation,” Dr. Delgado said. “Every business sector is already thinking about how these tools can drive innovation and health care is no different.”
Microsoft recently released its GPT-4-powered Bing chatbot, which differs from ChatGPT in that Bing’s AI chat is connected to the Internet and can therefore access more information. According to OpenAI, ChatGPT’s knowledge is limited to information from 2021 and earlier, so it cannot answer questions about current news and trends.
Read more about Dr. Delgado’s thoughts on ChatGPT in this Twitter thread.
Dr. Bekelman, who is director of the Penn Center for Cancer Care Innovation at the Abramson Cancer Center, shared his vision of how we should approach language models and maintained that our focus and investments should be more on people than the models.
“It’s been wild to see the excitement about large language models and generative AI in health care. We’ve seen the potential for success (surprisingly good insights) and for failure (unexpectedly and sometimes disturbingly wrong responses),” Dr. Bekelman said. “Our focus now should turn to optimizing how these models enhance and amplify our work as clinicians and health care leaders. The quadruple aim remains. How can generative AI help us achieve it? I think it starts with our people, not with the technology. We need to invest in our people to create the vision of how generative AI can augment our efforts, to design the models that balance algorithmic automation with human control, to optimize the models to minimize bias and engender trust, to test the models under ethical governance and stakeholder input frameworks, and to drive adoption not only by data science teams but also by clinicians and patients themselves. It’s our people – not the models – that will help us achieve the potential impact that generative AI could have in health care.”