A Harvard Medical School study found that OpenAI's o1 model achieved higher accuracy than human doctors in interpreting written patient records for emergency diagnoses. The researchers compared the AI system's diagnostic accuracy against that of physicians using standardized case presentations.
The study was limited to interpreting written cases; it did not cover real-time patient interaction, where doctors gather additional context through examination and questioning. Clinical deployment would still require regulatory approval and integration with existing diagnostic workflows.
Healthcare IT leaders should evaluate diagnostic AI pilots within existing EHR systems. Medical directors should assess liability and workflow integration requirements before considering emergency department AI tools.