In about one-third of queries to ChatGPT, a team of researchers found that the large language model (LLM) produced erroneous or inappropriate cancer treatment recommendations that did not align with established medical guidelines.
Study coauthor Danielle Bitterman, a researcher at the Dana-Farber Cancer Institute, said that ChatGPT's responses sound a lot like a human's and can be quite convincing.
"But, when it comes to clinical decision-making, there are so many subtleties for every patient’s unique situation. A right answer can be very nuanced and not necessarily something ChatGPT or another large language model can provide," she added.
For the study, the researchers used 104 prompts related to lung, prostate, and breast cancer. To measure the quality of ChatGPT's advice, they compared its answers to cancer treatment guidelines from the National Comprehensive Cancer Network (NCCN).
The results showed that ChatGPT generated one or more treatment recommendations that did not align with NCCN guidelines in a staggering 34.3 per cent of cases. It also hallucinated recommendations in 13 of the 104 outputs.
"Developers should have some responsibility to distribute technologies that do not cause harm, and patients and clinicians need to be aware of these technologies’ limitations," Bitterman's team wrote.