4 Comments

Patrick McGuinness

Need to add this coda and warning: I was so impressed by o3-mini helping me deeply research a health issue that I've gone down a rabbit hole of asking it many detailed medical queries about it. It's been going great … until o3-mini said this in its own reasoning trail:

“However, I haven’t accessed real results yet, so I’ll simulate references like Xu et al. (2015), where rapamycin helped a patient with refractory warm AIHA.”

WHAT? It will just invent references when it doesn’t have them? (And yes, its final answer reported this fake reference using a link to a website that had no reference to a “Xu et al” paper.)

I'd give o3-mini a 95% grade on answering detailed technical medical Qs, but the 5% is a doozy. Trust but verify, folks.
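
The "verify" step can be partly automated. Below is a minimal sketch that checks a suspect citation against the Crossref API (a real, public endpoint). The loose title-matching heuristic and the query string are just illustrations built from this thread's example, not a vetted pipeline; a miss means "keep digging", not "definitely fabricated".

```python
import requests

def citation_exists(query: str, year: int | None = None) -> bool:
    """Check whether a bibliographic query matches anything in Crossref.

    A hit does not prove the citation is right, and a miss does not
    prove it is fabricated -- but a miss is a strong signal to verify
    by hand before trusting the reference.
    """
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": query, "rows": 5},
        timeout=10,
    )
    resp.raise_for_status()
    for item in resp.json()["message"]["items"]:
        title = " ".join(item.get("title", [])).lower()
        published = item.get("issued", {}).get("date-parts", [[None]])[0][0]
        # Loose heuristic: one of the first few query terms appears in the
        # title, and the publication year matches if one was supplied.
        if any(word in title for word in query.lower().split()[:3]):
            if year is None or published == year:
                return True
    return False

# The suspect reference from the comment above:
print(citation_exists("rapamycin refractory warm autoimmune hemolytic anemia", 2015))
```

When the model supplies a DOI, checking it directly (Crossref's `/works/{doi}` route) is even more decisive than a bibliographic search.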

Patrick McGuinness

What's your experience with o3-mini, DeepSeek R1, or the Gemini 2.0 models? Is it as positive as mine? Will you be expanding how and what you use AI for?

Tedd Hadley

> I’m beyond impressed. This is really useful. It has clarified and explained a situation I have been facing for a year now.

Great to hear this! I had a medical diagnosis of my own pan out (with lowly o1 at that point), later confirmed by a doctor. Medical diagnosis seems to play to LLMs' strengths without requiring much planning or long-term analysis (where LLMs are weakest so far).

Patrick McGuinness

Good point that medical diagnosis requires a certain level of analysis but not *too much*. It's mostly about finding and using relevant knowledge, so what matters is the model's ability to retrieve relevant information and apply it in its reasoning. It feels like o3-mini has reached a level where some skills are "cracked", and this medical use is one of them. Caveat: YMMV.
