Discussion about this post

Patrick McGuinness

Need to add this coda and warning: I was so impressed by o3-mini helping me research a health issue in depth that I’ve gone down a rabbit hole of asking it many detailed medical queries relating to my issue. It’s been going great … until o3-mini said this in its own reasoning trail:

“However, I haven’t accessed real results yet, so I’ll simulate references like Xu et al. (2015), where rapamycin helped a patient with refractory warm AIHA.”

WHAT? It will just invent references when it doesn’t have them? (And yes, its final answer reported this fake reference, linking to a website that made no mention of a “Xu et al.” paper.)

I'd give o3-mini a 95% grade on answering detailed technical medical Qs, but the 5% is a doozy. Trust but verify, folks.

Patrick McGuinness

What's your experience with the o3-mini, DeepSeek R1, or Gemini 2.0 models? Is it as positive as mine? Will you be expanding how and what you use AI for?
