Microsoft AI diagnoses complex medical cases with 85% accuracy, research finds

Microsoft has developed an AI-powered diagnostic system, the Microsoft AI Diagnostic Orchestrator (MAI-DxO), which can accurately diagnose complex medical cases, according to a recent experiment.

“When paired with OpenAI's o3 model, MAI-DxO achieves 80% diagnostic accuracy, four times higher than the 20% average of generalist physicians. MAI-DxO also reduces diagnostic costs by 20% compared to physicians, and 70% compared to off-the-shelf o3,” the study authors wrote.

“When configured for maximum accuracy, MAI-DxO achieves 85.5% accuracy. These performance gains with MAI-DxO generalize across models from the OpenAI, Gemini, Claude, Grok, DeepSeek and Llama families.”

The Microsoft team tested MAI-DxO against 304 real-world case studies from the New England Journal of Medicine. The AI system not only diagnosed 85.5% of the cases correctly but also used fewer resources than the group of experienced doctors.

Researchers evaluated 21 practicing doctors in the UK and the US, each with five to 20 years of clinical experience. The doctors all received the same tasks and achieved an average accuracy of 20% on the cases they completed.

Researchers also stated that although medical specialists are experts in a specific area of the body or a certain type of disease, no doctor can be an expert in every complex medical case.

The Microsoft team stated that AI does not have that limitation and can draw on knowledge across multiple medical fields at once, beyond what any single doctor can do.

“The MAI-DxO orchestrator turns any language model into a virtual panel of clinicians: it can ask follow-up questions, order tests or deliver a diagnosis, then run a cost check and verify its own reasoning before deciding whether to proceed,” the authors wrote. “This kind of advanced reasoning can change the way health care works.”
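The loop the authors describe (ask, test or diagnose, then check cost and verify before continuing) can be illustrated with a toy sketch. This is not Microsoft's implementation: every name below (`run_orchestrator`, `propose`, `verify`, `answer`, the budget and the sample case) is a hypothetical stand-in, and the language model is replaced by plain Python functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical action kinds; the names are illustrative, not from the study.
ASK, TEST, DIAGNOSE = "ask", "test", "diagnose"


@dataclass
class CaseState:
    findings: List[str] = field(default_factory=list)  # what has been learned so far
    spent: float = 0.0                                  # running cost of ordered tests


def run_orchestrator(
    propose: Callable[[CaseState], Tuple[str, str]],
    verify: Callable[[CaseState, str], bool],
    answer: Callable[[str, str], str],
    costs: Dict[str, float],
    budget: float,
    max_steps: int = 10,
) -> Tuple[Optional[str], CaseState]:
    """Toy loop: propose an action, check its cost, verify before committing."""
    state = CaseState()
    for _ in range(max_steps):
        kind, detail = propose(state)
        if kind == DIAGNOSE:
            # Self-check: only commit to a diagnosis that passes verification.
            if verify(state, detail):
                return detail, state
            state.findings.append(f"rejected: {detail}")
            continue
        # Cost check: questions are treated as free, tests are priced.
        cost = costs.get(detail, 0.0) if kind == TEST else 0.0
        if state.spent + cost > budget:
            state.findings.append(f"skipped (over budget): {detail}")
            continue
        state.spent += cost
        state.findings.append(f"{detail}: {answer(kind, detail)}")
    return None, state  # no verified diagnosis within max_steps


# Toy stand-ins for the model and the case file (made-up data).
def toy_propose(state: CaseState) -> Tuple[str, str]:
    if not state.findings:
        return ASK, "duration of fever"
    if len(state.findings) == 1:
        return TEST, "chest x-ray"
    return DIAGNOSE, "pneumonia"


def toy_verify(state: CaseState, diagnosis: str) -> bool:
    return len(state.findings) >= 2


def toy_answer(kind: str, detail: str) -> str:
    return {"duration of fever": "5 days", "chest x-ray": "consolidation"}[detail]


diagnosis, final_state = run_orchestrator(
    toy_propose, toy_verify, toy_answer, {"chest x-ray": 50.0}, budget=100.0
)
print(diagnosis, final_state.spent)
```

In this sketch the cost check and the self-verification are separate gates, so a cheaper configuration only needs a lower `budget`, while a stricter one needs a stricter `verify`. How the real system balances the two is not described in the article.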

The larger trend

The Microsoft researchers noted limitations in their experiment, including an unrealistic case mix: the benchmark cases were drawn from complex, educational cases in the NEJM and did not include healthy people or patients with mild conditions.

Researchers said it was unclear whether the AI would perform just as well on everyday, routine cases or how often it would give false positives.

The test was also limited because it did not reflect real-world constraints, including patient-related factors, waiting times, insurance restrictions, test availability and delays in receiving results.

The evaluation of test costs was based on simplified U.S. averages and did not take into account differences in costs between payers, providers, health systems or geography.

Finally, the study compared Microsoft's AI with internal medicine and primary care physicians, but not specialists. Moreover, the participating doctors were restricted in their use of internet resources, while in reality doctors often consult guidelines, colleagues and numerous other tools during diagnosis.

“While we acknowledge these limitations, our results point to possible accuracy gains, especially when considering clinicians in remote and under-resourced settings, and also give us a picture of how LLMs could augment medical expertise to improve health outcomes, even in well-resourced settings,” the Microsoft team wrote.
