In one study, ChatGPT was able to outperform human doctors in diagnosing diseases and medical conditions. The study's findings, published last month, highlight how artificial intelligence (AI) chatbots could be more efficient at analyzing patient histories and conditions and could provide more accurate diagnoses. While the aim of the study was to understand whether AI chatbots could help doctors provide better diagnoses, the results unexpectedly revealed that OpenAI's GPT-4-powered chatbot performed much better on its own than when paired with a doctor.
ChatGPT outperforms doctors in diagnosing diseases
The study, published in the journal JAMA Network Open, was conducted by a group of researchers at Beth Israel Deaconess Medical Center in Boston. The purpose of the experiment was to find out whether AI could help doctors diagnose diseases more accurately than traditional methods allow.
According to a report in The New York Times, the experiment involved 50 doctors, a mix of resident and attending physicians. They were recruited through several large hospital systems in the US and were given six patient case histories. The participants were reportedly asked to suggest a diagnosis for each case and to explain why they supported or rejected certain diagnoses. They are also said to have been graded on whether their final diagnosis was correct.
To evaluate each participant's performance, medical experts were reportedly selected as graders. While they were shown the responses, they were not told whether each response came from a doctor with access to ChatGPT, a doctor working alone, or ChatGPT alone.
Furthermore, to rule out unrealistic case histories, the researchers reportedly chose case histories of real patients that had been used by researchers for decades but had never been published. This point is important because ChatGPT could not have been trained on data that was never published, eliminating the risk of contamination.
The study's findings were surprising. Doctors who did not use any AI tools to diagnose the case histories had an average score of 74 percent, while physicians who used the chatbot had an average score of 76 percent. However, when ChatGPT analyzed the case histories on its own and provided a diagnosis, it scored an average of 90 percent.
While various factors could influence the outcome of the study, from doctors' experience levels to personal biases towards certain diagnoses, the researchers believe the study highlights that the potential of AI systems in medical institutions cannot be ignored.