AlphaEvolve, Codex AI coding agent, AM-Thinking-v1, Falcon-Edge, Windsurf SWE-1, GPT-4.1 in ChatGPT, Nous Research Psyche, INTELLECT-2 32B, ARI Enterprise, HealthBench. Grok went off the rails.
AI Week in Review 25.05.17