AI Week in Review 25.10.11

Grok Imagine, Gemini 2.5 Computer Use, Gemini CLI extensions, OpenAI’s Apps for ChatGPT, Apps SDK, AgentKit, Agent Builder, ChatKit, Codex SDK, GPT-5 Pro in API, Ling-1T, Bagel Paris, Ovi, TRM.

Oct 12, 2025

Figure 1. Still from a video made with the recently updated Grok Imagine, which now supports audio-video generation. **Luis Catacora** animated artwork in D.C. with Grok Imagine.

Top Tools

With extensions, you can connect the power of Gemini to your everyday workflows and the tools you use most, making Gemini CLI uniquely yours. - Google

Google released two agent-focused updates this week: the Gemini 2.5 Computer Use model and extensions for Gemini CLI, which together push Gemini toward being a more general AI agent.

Google released Gemini 2.5 Computer Use, a specialized Gemini 2.5 model that can operate UIs to complete tasks. Through a computer_use tool, developers can build AI agents that interact with user interfaces in browsers and mobile apps by clicking, scrolling, entering text, and filling in forms. Google reports that the model outperforms competitors on web and mobile control benchmarks, scoring 79.9% on WebVoyager, while keeping latency down. The Gemini Computer Use model is available via the Gemini API.

Complementing this, Google’s open-source AI coding agent Gemini CLI now supports extensions. Extensions make Gemini CLI a more powerful and more general AI agent, letting it connect with various tools and personalize developer workflows using “playbooks” and user-defined extensions. Google shared a list of launch partners with Gemini CLI extensions, including Figma, Shopify, Stripe, and more.

AI Tech and Product Releases

Figure AI unveiled its third-generation humanoid robot, Figure 03. It features a 5-hour battery life, wireless charging, and enhanced sensors, and is powered by the Helix AI system for vision, language, and action. Figure AI aims to mass-produce these robots, with a factory already capable of producing 12,000 units annually.

Figure 2. Figure AI’s Figure 03 robot doing deliveries.

OpenAI made several DevDay release announcements, with updates to AI models, Codex, and ChatGPT:

OpenAI introduced “apps in ChatGPT,” a way to run third-party apps directly inside ChatGPT, which makes ChatGPT more of a platform for interactive, personalized, chat-first software. OpenAI announced launch partners such as Spotify and Canva and released an Apps SDK so other third-party app providers can join the ecosystem. OpenAI’s app platform could make ChatGPT feel like an OS.
OpenAI launched AgentKit, a toolkit for building, deploying, and evaluating agentic workflows. It includes Agent Builder, a low-code visual web app for making and deploying AI agents; ChatKit, an embeddable chat UI SDK; and expanded Evals tools for evaluating agents.
OpenAI made Codex generally available and added Slack integration, an SDK (TypeScript first), and admin features for enterprise use.
OpenAI added the powerful new models GPT-5 Pro and Sora 2 to its API.
OpenAI introduced GPT-realtime-mini, a fast, low-latency speech-to-speech model in the API that maintains high quality for voice agents and conversational applications.

Further details were shared in our OpenAI DevDay article.

Grok Imagine gets video generation with synchronized audio. xAI’s Grok Imagine feature now generates short videos with sound and includes a controversial “spicy” mode. Grok Imagine can turn an image into a video without requiring a prompt, and it is less censored than Veo 3 and Sora 2, allowing some NSFW content. It is fast at image and video generation and seems stronger at creative expression, including anime and futuristic styles, than at high realism, where Sora 2 and Veo 3 lead. Grok Imagine is free to try but requires a paid tier for premium features.

Ant Group’s AI lab inclusionAI unveiled Ling-1T, an open-weight one-trillion-parameter Mixture-of-Experts model with 37B active parameters. Ling-1T is a non-thinking model, yet it pushes the limits on complex reasoning tasks, posting strong math, reasoning, and coding benchmark scores for a non-thinking model. Model weights are available via Hugging Face. The team also published a paper, “Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models.”
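
For readers wondering how a model can have far fewer active parameters than total parameters: in a Mixture-of-Experts layer, a router scores the experts for each token and only the top few actually run. The sketch below is a generic top-k routing illustration in plain Python, not Ling-1T’s actual router. The function names, the toy experts, and the choice of k = 2 are all assumptions made for illustration.

```python
import math

def top_k_route(scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores.

    Hypothetical helper: returns {expert_index: weight}, weights summing to 1.
    """
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(scores[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x, experts, scores, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    weights = top_k_route(scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Four toy "experts"; real experts are feed-forward networks, not lambdas.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
router_scores = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

# Only experts 1 and 3 run for this token; experts 0 and 2 stay idle.
output = moe_forward(1.0, experts, router_scores, k=2)
```

Because the unselected experts are never evaluated, compute per token tracks the active parameter count rather than the total.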

AI21 Labs launched Jamba Reasoning 3B, a compact 3B hybrid SSM-Transformer reasoning model emphasizing speed and efficiency. The model is suitable for local and edge usage, with weights available on Hugging Face.

Bagel.com announced Paris, an open-weight diffusion model that it describes as the world’s first diffusion model trained in a decentralized way. The developers shared a Technical Report that explains the decentralized diffusion approach: 8 smaller expert diffusion models were pre-trained in isolation, then merged. Model weights for Paris were shared on Hugging Face.

Google is expanding access to several AI tools:

Google’s try-on image tool, which uses AI to show what clothing items might look like on you, is expanding to Australia, Canada, and Japan, and now includes shoes.
Google is expanding access to Opal, which lets users create AI mini-apps using natural language, to 15 countries. New features include advanced debugging and a faster, more responsive foundation with parallel runs.
Google has expanded AI Mode in Search to 35 new languages and more regions. Google says its Gemini-powered AI Mode now covers over 200 countries/territories.

AI Research News

The open-source project Ovi provides video generation with synchronized audio. The research paper “Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation” describes the model and training. According to the project’s GitHub page, Ovi generates video with audio from text or image input, offers accessible tooling and local execution paths, and comes close to parity with leading proprietary systems in this space.

Samsung AI’s Tiny Recursive Model (TRM), a 7M-parameter reasoning model presented in “Less is More: Recursive Reasoning with Tiny Networks,” is a notable result in reasoning research. TRM is a tiny network that tackles reasoning tasks through recursive self-critique, repeatedly revising its own answer. The paper reports strong results on benchmarks such as ARC-AGI relative to much larger models.
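
To give a feel for what a recursive critique-and-revise loop looks like, here is a hand-written toy in Python. It is not TRM’s architecture or training procedure: the `critique` and `revise` functions are hypothetical stand-ins for what a learned network would do, and the task (finding a square root) is chosen only because the loop is easy to follow.

```python
def critique(x, y):
    """Toy 'critique': how far the current answer y is from solving y * y = x."""
    return x - y * y

def revise(y, z):
    """Toy 'revise': nudge the answer using the critique signal z (a Newton step)."""
    return y + z / (2 * y)

def recursive_refine(x, y0=1.0, steps=8):
    """Keep a current answer, critique it, revise it, and repeat."""
    y = y0
    for _ in range(steps):
        z = critique(x, y)  # scratch state derived from the question and answer
        y = revise(y, z)    # improved answer fed into the next round
    return y

answer = recursive_refine(2.0)  # converges toward the square root of 2
```

The point of the sketch is the loop shape: the same small pair of functions is applied over and over to its own output, so improvement comes from repetition rather than from size.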

OpenAI proposed a framework for defining and measuring political bias in LLMs. A new research post outlines metrics and evaluation setups aimed at separating viewpoint expression from task performance, part of OpenAI’s broader work on model spec and deliberative safety.

AI Business and Policy

OpenAI and AMD announced a large deal under which OpenAI will deploy 6 gigawatts of AMD Instinct GPUs over several years, starting with an initial one gigawatt slated for the second half of 2026. The pact includes share purchase rights that let OpenAI buy up to 10% of AMD.

Elon Musk’s xAI is reportedly raising $20 billion, more than initially planned, including an equity component from Nvidia tied to GPU supply. This deal is similar to other Nvidia deals, and underscores Nvidia’s investments in the AI space and the intensifying capital needs for compute for frontier AI.

OpenAI has banned several ChatGPT accounts that it believes are linked to Chinese government entities, after the accounts requested plans for social media monitoring.

IBM and Anthropic are partnering to integrate Anthropic’s Claude LLM into IBM’s software, aiming to provide secure and governed AI solutions for enterprise development. Early adopters have reportedly seen a 45% average productivity increase. Separately, Anthropic appointed Rahul Patil, formerly of Stripe, as its new CTO. He brings 20 years of infrastructure experience, and both moves position Anthropic to better serve enterprise AI needs.

The EU rolled out its “Apply AI” strategy to strengthen homegrown AI and curb reliance on the US and China. A draft strategy calls for €1B in redirected funding, open-source adoption, and sovereign capabilities, with a notable focus on defense C2 deployments, signaling Europe’s industrial-policy turn in AI. Source: Financial Times, https://www.ft.com/content/ea3d20ed-5b42-45ce-8155-67ef472ae9df

AI bubble talk returns. Major institutions are flagging the risks of AI exuberance as tech valuations rise. A Bloomberg survey finds many investors see AI spending as excessive relative to returns, even as most expect AI-driven outperformance to persist. The IMF and Bank of England have also warned about a potential “abrupt” correction in AI-linked assets due to high valuations. I hold that it’s an AI boom, not an AI bubble, and that the AI infrastructure buildout is not (yet) excessive.

Minneapolis Fed President Neel Kashkari says AI isn’t (yet) replacing workers, but could pressure rates. He is skeptical AI is reducing labor demand now, but productivity effects and AI capex spending could send interest rates higher ahead.

An internal note from Meta’s metaverse chief has urged employees to use AI to “go 5X faster” by putting AI into codebases and workflows beyond engineering. Meta has joined many tech companies in accelerating AI adoption internally.

AI Opinions and Articles

When Taylor Swift launched an online scavenger hunt with Google to promote her new album, “The Life of a Showgirl,” fans discovered promotional videos that appeared to be AI-generated, sparking controversy among Swifties. This comes despite Swift’s past concerns about AI-generated misinformation, though Google has not confirmed if its AI technology was used.

Top YouTube creator MrBeast is worried about AI’s impact on creators’ livelihoods, despite having dabbled with using the technology himself. He posted concerns about how AI-generated videos could affect millions of creators, expressing fear about current trends. This comes as OpenAI’s Sora 2 and YouTube’s AI tools gain traction, prompting debate on AI’s role and human creativity.

AI is increasingly being used to create movies, songs, advertising, and other content, whether critics and worriers like it or not. AI changes everything, entertainment included.
