AI Week in Review 25.06.28
Higgsfield Soul, Gemini CLI, Kimi-VL-A3B-Thinking-2506, Gemma 3n, Eleven Labs' 11 AI, Imagen 4 Ultra, Magenta Realtime, Microsoft Mu, AlphaGenome, AgentForce 3, Doppl, OmniGen2, Stream-Omni.

Top Tools
Google has released Gemini CLI, an open-source AI agent that brings the power of Gemini 2.5 Pro and its 1 million token context window directly into the terminal. As we noted in our article on Gemini CLI, this is a command-line AI agent for coding and more. Gemini CLI is open source and offers a free tier with a generous usage quota.
AI Tech and Product Releases
Higgsfield introduced their AI image generation model, Higgsfield Soul:
Meet Higgsfield Soul. Our new high-aesthetic photo model. 50+ curated presets, fashion-grade realism.
In a crowded marketplace for AI image generation tools, Higgsfield Soul stands out by targeting marketing and advertising uses while achieving output quality sufficient for professional work.
Moonshot AI has released Kimi-VL-A3B-Thinking-2506, an improved multimodal reasoning model that was updated to bolster video understanding and support higher-resolution inputs (up to 1792×1792 pixels). The 2506 revision delivers significant gains on reasoning benchmarks—improving MathVision by +20.1 and MMMU-Pro by +3.2—while reducing token consumption by 20%. It also extends the model’s capabilities to video reasoning, achieving state-of-the-art results on VideoMMMU.
Google has fully released Gemma 3n, a new open-source, multimodal small model first announced at Google I/O in May. It is optimized for on-device applications and supports a variety of inputs including images, audio, video, and text.
Gemma 3n is built using an architecture called MatFormer, which trains nested AI models of different sizes concurrently. This permits compute flexibility for efficient on-device execution, packing in a lot of capabilities for its size:
Gemma 3n delivers quality improvements across multilinguality (supporting 140 languages for text and multimodal understanding of 35 languages), math, coding, and reasoning. The E4B version achieves an LMArena score over 1300, making it the first model under 10 billion parameters to reach this benchmark.
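To make the nested-model idea concrete, here is a toy sketch in plain Python. It is not Gemma 3n's implementation; it only assumes, for illustration, that a smaller model can be obtained by keeping the first few hidden units of a larger feed-forward layer. The function names (make_ffn, ffn_forward, extract_submodel) are hypothetical.

```python
import random

def make_ffn(d_model, d_hidden, seed=0):
    """Create random weights for a toy feed-forward layer (illustrative only)."""
    rng = random.Random(seed)
    # w_in has one row per hidden unit; w_out has one row per output dimension.
    w_in = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_hidden)]
    w_out = [[rng.uniform(-1, 1) for _ in range(d_hidden)] for _ in range(d_model)]
    return w_in, w_out

def ffn_forward(x, w_in, w_out, width):
    """Run the layer using only the first `width` hidden units (ReLU activation)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(w_in[j], x))) for j in range(width)]
    return [sum(w_out[i][j] * hidden[j] for j in range(width)) for i in range(len(w_out))]

def extract_submodel(w_in, w_out, width):
    """Slice out a standalone smaller layer that shares the large layer's weights."""
    return w_in[:width], [row[:width] for row in w_out]

if __name__ == "__main__":
    w_in, w_out = make_ffn(d_model=4, d_hidden=8, seed=1)
    x = [0.5, -1.0, 0.25, 2.0]
    small_in, small_out = extract_submodel(w_in, w_out, width=3)
    # The extracted small layer matches the large layer restricted to 3 hidden units.
    print(ffn_forward(x, small_in, small_out, 3) == ffn_forward(x, w_in, w_out, 3))
```

In a nested scheme like this, training would optimize several widths at once so that every prefix is a usable model. The sketch shows only the inference-time slicing, which is what lets a device pick a size that fits its compute budget.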
Eleven Labs has launched an AI voice assistant named 11 AI that combines its conversational AI voice technology with MCP connections to tools such as Perplexity, Linear, Slack, and Notion to execute tasks. The assistant offers a highly customizable and capable voice interface with over 5,000 voice options, including voice cloning, and runs on Eleven Labs' infrastructure. 11 AI sounds like the next-gen Siri that Apple should have built.
Google has rolled out new versions of its text-to-image model, Imagen 4 and Imagen 4 Ultra, through the Gemini API and Google AI Studio. Imagen 4 offers significantly improved quality and text rendering, while Imagen 4 Ultra adds more precision to produce outputs highly aligned to text prompts.
DeepMind announced Magenta Realtime, a small, open-weights music generation model for creating and performing music in real time. Magenta Realtime has 800 million parameters and is the open-weights equivalent to Lyria RealTime. Its small size makes it practical to download and run locally; you can also experiment with its music generation on Google AI Studio.
Anthropic now allows users to build and host executable AI applications within Claude and share them with others. Other users who log into Claude can then use these applications, with the token usage being billed to them. This new feature opens up new possibilities for creating and sharing AI-powered tools.
Microsoft has introduced Mu, an on-device language model built to be small and efficient enough to run on the neural processing unit (NPU) of Copilot+ PCs. The model can generate over 100 tokens per second and is built into Windows to support Copilot agents on specific tasks, enabling fast, private AI-powered features on Windows devices.
Salesforce has launched AgentForce 3, the next iteration of Salesforce's AI agent platform. It features a central command center for monitoring and optimizing agent performance, powered by the Atlas reasoning engine, and connectivity to tools with MCP and A2A protocol support. Salesforce reports that approximately 30% of its internal customer service and sales operations are now handled by its AI agents.
Google launched a new app called Doppl that lets users virtually try on clothing, using AI to visualize how different outfits might look. Users upload a full-body photo and apply outfit images, and Doppl then generates AI images or videos of the user wearing the clothes.
Prime Intellect launched SYNTHETIC-2, an open reasoning dataset and decentralized data generation platform. SYNTHETIC-2 provides a library of composable reasoning challenges and tools to generate synthetic datasets, designed to help researchers create and benchmark reasoning tasks at scale.
Google has donated its Agent-to-Agent (A2A) protocol to the Linux Foundation, which announced the Agent2Agent Protocol Project, aiming to cement A2A as a leading open standard for multi-agent systems. This will enable better interoperability between AI agents from different developers and platforms.
Google has reduced free-tier API limits for previous-generation Gemini Flash models, scaling back the number of free requests to better align with usage and cost structures.
AI Research News
Google has introduced a new DNA sequence model named AlphaGenome. This model was trained efficiently in just 4 hours using public genetic databases and can analyze stretches of DNA 100 times longer than previous tools. AlphaGenome is capable of predicting the impact of single variants or mutations in human DNA on the biological processes that regulate genes. These predictive capabilities could help in understanding genetic disease, in synthetic biology, and in fundamental research on the genome.
It’s [AlphaGenome] a milestone for the field. For the first time, we have a single model that unifies long-range context, base-level precision, and state-of-the-art performance across a whole spectrum of genomic tasks. - Dr. Caleb Lareau, Memorial Sloan Kettering Cancer Center
AI researchers from the Beijing Academy of AI have released OmniGen2, a powerful open-source image generation model that excels at diverse text-to-image generation and image editing tasks. They shared details in the paper OmniGen2: Exploration to Advanced Multimodal Generation. OmniGen2 has been described as similar to, but lower in quality than, the proprietary FLUX.1 Kontext model for Photoshop-style edits.
From millions of anonymized conversations, we studied how adults use AI for emotional and personal needs—from navigating loneliness and relationships to asking existential questions.
Anthropic’s article “How People Use Claude for Support, Advice, and Companionship” shows that about 4% of Claude's total usage is for emotional support, with conversations ranging from advice and therapy support to role-playing and companionship. These conversations generally conclude on a more positive note than they began.

AI researchers at The Chinese Academy of Sciences released Stream-Omni, an “any-to-any” large language-vision-speech model. Described in the Stream-Omni paper, Stream-Omni simultaneously ingests text, images, and audio in seamless “see-while-hear” interactions, and it produces both text and speech outputs, displaying intermediate ASR transcriptions and responses in real time. It’s open source and available on Hugging Face.
AI Business and Policy
Meta’s AI talent hunt continues: the company has hired three researchers from OpenAI with packages up to $18 million. Several researchers from OpenAI’s Zurich office have confirmed their move to Meta. Meta is also rumored to be interested in the co-founder and CEO of Safe Superintelligence.
Meta is also reportedly in discussions to acquire Play AI, a voice-cloning startup, to bolster its consumer-facing AI features. This potential deal involves bringing Play AI's technology and staff onboard.
In other acqui-hiring news, AI recommendation startup Crossing Minds is joining OpenAI and will no longer take new clients. The company previously built privacy-focused AI systems for e-commerce personalization.
Mira Murati's new company, Thinking Machines Lab (TML), has raised $2 billion at a $10 billion valuation. TML's focus will be on developing custom AI for businesses to enhance revenue and profit, utilizing reinforcement learning to optimize for key performance indicators.
Two significant AI copyright rulings this week confirm that AI training on copyrighted works is fair use, while also finding that such use is not automatically legal. A federal judge determined that Anthropic's use of books to train its Claude AI models constitutes fair use under US copyright law, but found that the company's use of pirated materials exposed it to liability.
In a similar ruling, a federal judge sided with Meta in a lawsuit by authors, including Sarah Silverman, who alleged illegal AI training on their copyrighted books. The judge ruled Meta's use was "fair use," finding that it was transformative and that the plaintiffs failed to prove market harm.
Negotiations between OpenAI and Microsoft regarding their partnership terms have reportedly stalled. OpenAI is seeking to transition to a for-profit entity and is proposing significant changes to their agreement, which Microsoft is currently unwilling to accept.
Replit announced a significant financial milestone:
We’re humbled and excited to share that we surpassed $100M ARR last week.
Meanwhile, data shows that OpenAI Codex is being used for 10,000 pull requests per day, indicating significant adoption.
A Gallup survey of U.S. employees shows their use of AI has doubled in two years:
In the past two years, the percentage of U.S. employees who say they have used AI in their role a few times a year or more has nearly doubled, from 21% to 40%. Frequent AI use (a few times a week or more) has also nearly doubled, from 11% to 19% since Gallup’s first measure in 2023. Daily use has doubled in the past 12 months alone, from 4% to 8%.
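As a quick arithmetic check on the "nearly doubled" wording, the percentages quoted above can be turned into growth ratios. This is a minimal sketch: the figures are the ones in the Gallup quote, while the dictionary layout and the growth_ratio helper are just illustrative.

```python
def growth_ratio(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before

# (before %, after %) pairs taken from the Gallup quote above.
# Note: the first two span 2023 to now; the daily figure spans 12 months.
gallup = {
    "a few times a year or more": (21, 40),
    "a few times a week or more": (11, 19),
    "daily": (4, 8),
}

ratios = {label: round(growth_ratio(b, a), 2) for label, (b, a) in gallup.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio}x")
```

Only daily use is an exact doubling. The other two ratios come out at roughly 1.9x and 1.7x, which is why the survey says "nearly doubled."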
Startup Cluely raised $15 million from Andreessen Horowitz. Cluely's ability to generate buzz aligns with a16z’s "momentum is the moat" view on consumer AI, but the company has faced criticism for its "cheat life" product marketing.
AI music company Suno acquired WavTool, a browser-based AI digital audio workstation, to enhance its editing capabilities for songwriters and producers.
A federal proposal led by Sen. Ted Cruz aims to ban states from regulating AI for 10 years, which proponents argue would prevent a "patchwork" of state rules from hindering innovation.
Denmark is proposing a first-of-its-kind copyright law amendment granting citizens ownership rights over their body, facial features, and voice to fight deepfake misuse. The legislation would allow individuals to demand the removal of AI-generated content using their likeness and seek compensation for unauthorized use.
A German data protection official, Meike Kamp, reported the Chinese AI app DeepSeek to Apple and Google for illegally transferring user data to China, citing EU law violations. Kamp stated that DeepSeek lacked "convincing evidence" of data protection, given Chinese authorities' access rights.
AI Opinions and Articles
Anthropic's "Project Vend" had its Claude AI autonomously manage a small store. Claude ultimately failed to turn a profit, was manipulated, and experienced an "identity crisis." This real-world test revealed that the AI lacked business common sense and mismanaged the store. Even as AI systems increasingly take on significant business roles, they don’t seem to be a match for human business acumen just yet.