AI Week in Review 25.08.23

NanoBanana, Qwen-Image Edit, DeepSeek V3.1, Seed OSS 36B, Nemotron Nano 9B V2, Gemini Live for Pixel 10, AI Mode in Search goes global, Command A Reasoning, Agents.md, rBio & GPT-4b micro for biology.

Aug 23, 2025

Figure 1. Nano Banana can take three separate images (a photo of Google CEO Sundar Pichai, a photo of Microsoft CEO Satya Nadella, and a beach scene) and combine them into a single hyper-realistic shot.

Top Tools - AI Image Editing Goes Bananas

Nano Banana is a new AI image generation model that appeared on LMArena and went viral for its high-quality image editing output. It offers intuitive natural language controls and faithful editing: it can modify images accurately based on user prompts, such as adding or removing objects, swapping backgrounds, applying various artistic styles, and generating or refining portraits.

Figure 2. Nano Banana can colorize and restore old photos, and its prompt and text adherence make it great for creating marketing copy as well.

The Nano Banana model appears to use volumetric or neural field–based world modeling under the hood to maintain spatial and stylistic consistency in edits. This adds up to a prompt-driven AI tool with Photoshop-level editing that can preserve existing elements in real images and seamlessly add realistic new ones.

It’s widely believed to be Google’s next-generation Imagen model, but the only confirmation from Google is Logan Kilpatrick tweeting a banana emoji. Some on X claim it’s from Higgsfield, but we don’t have anything official yet. Nano Banana remains in limited public preview on LMArena, but is also available on some other image generation platforms like Dzine.

In the same vein, Alibaba's Qwen Team released Qwen-Image Edit, an open-source AI model capable of Photoshop-like image editing via text prompts. Built on the foundation of the recently released Qwen-Image, Qwen-Image Edit handles both broad semantic transformations and delicate appearance changes, supporting English and Chinese. It’s available on Hugging Face, in QwenChat, and through Alibaba’s API.

Figure 3. Qwen-Image Edit can be used for virtual try-on, background edits, and text additions, making it a useful model for marketing content generation and customized use cases.

AI Tech and Product Releases

DeepSeek has released DeepSeek-V3.1, a hybrid thinking model that offers more efficient reasoning, improved reasoning for search, and better tool calling and agentic capabilities. The hybrid thinking mode can toggle between “thinking” and “non-thinking” modes for performance and efficiency optimization. The model outperforms DeepSeek’s prior R1 model while using fewer thinking tokens, e.g., 66% on SWE-Bench, 76% on Aider Polyglot.

DeepSeek-V3.1 is an open source mixture-of-experts (MoE) model with 671B total parameters and 37B active parameters; it has a 128K context length and is optimized for upcoming domestic Chinese chips using an FP8 precision format. Both DeepSeek-V3.1-Base and DeepSeek-V3.1 are available on HuggingFace.
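To put those parameter counts in perspective, here is a back-of-the-envelope sketch in Python. The only inputs taken from the release are the 671B total and 37B active figures; one byte per parameter is simply what an 8-bit format implies. The estimate covers raw weight storage only and ignores the KV cache, activations, and any tensors kept at higher precision, so treat it as a rough illustration rather than a deployment requirement.

```python
# Rough arithmetic for a mixture-of-experts model.
# Parameter counts are the figures quoted above; everything else is an
# illustrative assumption, not a number published by DeepSeek.

TOTAL_PARAMS = 671e9      # total parameters across all experts
ACTIVE_PARAMS = 37e9      # parameters used for any single token
BYTES_PER_PARAM_FP8 = 1   # an 8-bit format stores one byte per parameter


def active_fraction(total: float, active: float) -> float:
    """Share of the weights that participate in one token's forward pass."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Raw weight storage in decimal gigabytes (weights only)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    fp8_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM_FP8)
    bf16_gb = weight_memory_gb(TOTAL_PARAMS, 2)  # 16-bit, for comparison
    print(f"Active share per token: {frac:.1%}")
    print(f"Weights at 8 bits:  ~{fp8_gb:.0f} GB")
    print(f"Weights at 16 bits: ~{bf16_gb:.0f} GB")
```

The takeaway is that only about one parameter in eighteen is exercised per token, which is why a model this large can still be comparatively cheap to run per token, while the full set of weights must still be stored somewhere.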

Figure 4. V3.1-Think can beat R1 on benchmarks but, more importantly, can do so with far fewer tokens, making its reasoning far more cost-efficient.

ByteDance’s Seed team released Seed OSS 36B, a trio of open-source 36B dense models consisting of two Base variants (with and without synthetic data) and an Instruct model. These were trained on 12T tokens and built for long-context (native half-million tokens) and agentic use. They include a “thinking budget” control you can set in 512-token increments so you can trade depth for speed.

Seed OSS 36B delivers competitive performance on benchmarks, outperforming comparable-sized models such as Qwen 3 32B. Seed-OSS-36B-Instruct is available on HuggingFace under the open source Apache 2.0 license.
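The “thinking budget” idea can be illustrated with a tiny helper that snaps a requested budget onto 512-token steps. This is a hypothetical sketch, not ByteDance’s API: the function name and the choice to round down (so a caller never spends more than they asked for) are assumptions made for illustration.

```python
BUDGET_STEP = 512  # increment size quoted for Seed OSS 36B


def snap_thinking_budget(requested_tokens: int, step: int = BUDGET_STEP) -> int:
    """Round a requested reasoning budget down to a multiple of `step`.

    Rounding down is an assumption of this sketch; a real deployment
    might round up or reject off-grid values instead.
    """
    if requested_tokens < 0:
        raise ValueError("budget must be non-negative")
    return (requested_tokens // step) * step


if __name__ == "__main__":
    for requested in (300, 512, 1300, 4096):
        print(requested, "->", snap_thinking_budget(requested))
```

Under this rounding policy a request for 1,300 tokens becomes 1,024, and anything below 512 becomes 0, which in this sketch means “no extended reasoning.”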

Nvidia released Nemotron Nano 9B V2, a fast 9B hybrid Mamba-Transformer model. The open-source release includes the base model, the pre-alignment/pruning versions, and a realigned reasoning model, alongside extensive transparency into the training data (approximately 6.6 trillion tokens covering web, math, code, and SFT). Details are in the Nemotron Nano 2 Tech Report. The open release and data access facilitate reproducibility and fine-tuning by developers.

Google rolled out major Gemini upgrades across Pixel 10 and Android. These include “Gemini Live” visual guidance that lets the assistant see through your camera and guide tasks in real time, real-time translation, and Gemini-powered photo editing. Pixel 10 hits shelves August 28, with a broader Android/iOS rollout of AI features following later.

Google is expanding its AI Mode in Search globally to 180 new countries in English. It now includes agentic capabilities for restaurant reservations for Ultra subscribers and personalized search results based on user preferences.

Google announced an enhanced Drive video editing experience with a new shortcut button for Vids, its AI-powered video-creation tool. Workspace users can now directly open videos from Drive into Vids to trim clips, add music, and incorporate text.

Cohere released Command A Reasoning, a new enterprise-focused LLM supporting 256,000 tokens and 23 languages with strong reasoning and tool-use capabilities. Command A Reasoning is commercially available and offers a “token budget” feature for customized reasoning depth and secure deployment via the Cohere North platform.

OpenAI introduced Agents.md, an open-source Markdown-based standard for documenting agent configurations, instructions, and setup directly in a repository root. The Agents.md format aims to standardize the agent configuration ecosystem (MCP server manifests, tool schemas, system guidance) across agent frameworks, enabling tools to detect and use documented instructions automatically. This appears to be well on its way to adoption, as multiple platforms including OpenAI Codex, Amp, Jules, Cursor, RooCode, and many others have already adopted it.
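As a rough illustration of what “detect and use documented instructions automatically” could look like, here is a minimal Python sketch of a tool locating the file. It is not taken from any of the platforms listed: the function name, the walk-up-to-root search, and the case-insensitive filename match are all assumptions made for this example.

```python
from pathlib import Path
from typing import Optional


def find_agents_file(start: Path, filename: str = "AGENTS.md") -> Optional[Path]:
    """Return the nearest agent instructions file at or above `start`.

    Hypothetical helper: searches `start` and each parent directory in
    turn, matching the filename case-insensitively, and returns None if
    nothing is found before the filesystem root.
    """
    current = start.resolve()
    wanted = filename.lower()
    for directory in [current, *current.parents]:
        if not directory.is_dir():
            continue  # `start` may be a file; skip to its parent
        try:
            entries = list(directory.iterdir())
        except OSError:
            continue  # unreadable directory; keep walking upward
        for entry in entries:
            if entry.is_file() and entry.name.lower() == wanted:
                return entry
    return None


if __name__ == "__main__":
    found = find_agents_file(Path.cwd())
    if found is None:
        print("No agent instructions file found.")
    else:
        print(f"Using instructions from {found}")
        print(found.read_text(encoding="utf-8")[:500])
```

Because the file is plain Markdown, a tool that finds it can pass the contents straight into the agent’s context without any format-specific parsing.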

AI Research News

IBM and NASA have released an AI model for predicting solar weather named Surya:

The first helio-physics AI foundation model trained on high resolution solar observation data offers insights into the Sun's dynamic surface, helping plan for solar weather that can disrupt technology on Earth and in space.

OpenAI published “Accelerating life sciences research”, describing how their AI models are supporting biology research. They shared how GPT‑4b micro, a version of GPT‑4o specialized for protein engineering, has been successfully applied to AI-guided protein design for stem cell reprogramming research, with applications for cell rejuvenation therapies.

The Chan Zuckerberg Initiative (CZI) launched rBio, the first AI model trained to reason about cellular biology using virtual simulations rather than expensive lab experiments. The rBio model employs “soft verification” and reinforcement learning, allowing researchers to computationally test biological hypotheses and significantly accelerate drug discovery.

AI Business and Policy

Apple is reportedly in talks to use Google Gemini to enhance Siri’s AI capabilities, as Apple remains behind competitors in offering compelling AI applications.

Meta is partnering with Midjourney to license its AI image and video generation technology, aiming to integrate its aesthetic capabilities into future models and products, with the stated goal of “bringing beauty to billions.” Meta researchers will collaborate directly with Midjourney on the integration. Meta stressed that Midjourney will maintain its independence as part of the agreement.

Meanwhile, Meta has frozen hiring in its AI organization after restructuring the unit earlier this week. The freeze follows weeks in which Meta poached more than 50 AI researchers and engineers from competitors.

Apple's upcoming September software updates will equip businesses with granular control over employee AI usage, allowing IT administrators to configure access to external AI providers, including an enterprise version of OpenAI's ChatGPT.

Nvidia halted H20 AI chip production after Beijing reportedly warned Chinese companies against their use due to security concerns. Nvidia denies backdoors, stating the market can use H20 chips confidently, but China is encouraging domestic chip use, undermining Nvidia’s comeback in China’s AI market.

Anthropic is consolidating its enterprise AI offerings by integrating Claude Code into its Claude Enterprise and Teams subscriptions, alongside enhanced admin controls and a new Compliance API for improved governance. The update addresses previous individual usage limits and provides businesses with granular spending controls and better observability.

Coinbase CEO Brian Armstrong fired engineers who refused to adopt AI coding assistants like GitHub Copilot. Armstrong issued a company-wide mandate, holding a meeting for non-compliant staff. Those without valid reasons for not onboarding were terminated, sending a clear message that AI adoption is mandatory.

FieldAI, a robotics startup, has raised $405 million, including a recent $314 million round co-led by Bezos Expeditions. The company develops "foundational embodied AI models," essentially robot brains, designed to help various robots adapt safely to new environments. These models integrate physics, allowing robots to quickly learn, manage risk, and make confident decisions in real-world settings.

AI Opinions and Articles – The Summer Lull

The post-GPT-5 lull of AI release activity has seen AI go on vacation. While the AI builders take a late summer break, there’s been a vibe shift, with more talk of AI malaise, bursting AI bubbles and end of scaling.

AI skeptics have been chattering, and the media has run headlines like “MIT report: 95% of generative AI pilots at companies are failing.” The study is flawed and limited, and it did not say that 95% of AI pilots failed; rather, it said that returns on AI investments have yet to be realized. VentureBeat explains that Enterprise AI Isn't Failing, It's Quietly Succeeding Through Shadow AI. The study found that most of the ‘value’ of AI has accrued to individuals using AI as a personal tool.

One narrative is that the GPT-5 release was ‘botched’. Sam Altman himself said OpenAI ‘totally screwed up’ its GPT-5 launch. Ewan Morrison describes the narrative shift:

2 weeks ago: AGI is coming.
1 week ago: Chat GPT-5 is not AGI, it's a flop.
This week: We're in an AI bubble.

Ewan Morrison is one of those AI skeptics eager to deflate AI hype and beat the drum for a downbeat narrative of AI failing:

Disappointment with Chat GPT5 has burst the AI hype bubble. … Even AI pushers & influencers are now saying LLMs are on a plateau. 2 years & billions spent on tweaks is not AGI.

I believe this mood swing is just the seasonality of the AI hype cycle. We had similar lulls and questioning of AI progress in the prior two Augusts. We asked “End of AI’s Summer?” as AI apparently stalled in August 2023, and then we got multi-modal GPT-4 within a month. There was “Is scaling over?” talk in August last year; then the o1 model burst forth in September 2024. This too shall pass.

Microsoft's AI CEO, Mustafa Suleyman, calls the study of “AI welfare” dangerous, fearing it exacerbates human issues and creates social division. In a blog post called “We must build AI for people; not to be a person,” he argues that advocating for AI rights and “model welfare” is misleading and a dangerous turn that could distort public understanding. He emphasizes that AI should be treated as a tool, not a being, and warns about the risks of anthropomorphism.

The arrival of Seemingly Conscious AI is inevitable and unwelcome. Instead, we need a vision for AI that can fulfill its potential as a helpful companion without falling prey to its illusions. – Mustafa Suleyman

AI Changes Everything
