AI Week in Review 25.04.05

Runway Gen-4, Higgsfield AI video, MoCHA, Copilot updates, OpenHands LM 32B, Nomic Embed Multimodal 3B & 7B, Nova Act Agent, Devin 2.0, Zencoder, Augment Agent, Genspark Super Agent, Midjourney V7.

Apr 05, 2025

Figure 1. Higgsfield AI camera features include a fish-eye lens.

AI Tech and Product Releases

Microsoft announced several Copilot updates at its 50th anniversary event. The company aims to make Copilot a true “AI companion” by adding memory and personalization, search, Actions, Vision, Shopping, and Pages.

Microsoft is expanding Copilot Vision beyond web pages so that it can see your screen on Windows and mobile devices, guide you through apps, and view real-time video.
Microsoft is launching Copilot Actions, a feature that completes online tasks such as booking car rentals, concert tickets, and restaurant reservations. Launch partners include Booking.com, OpenTable, and 1-800-Flowers.
Copilot Pages is a writing canvas to “organize your thoughts and content” with AI assistance.
Clippy, the infamous virtual assistant, made an appearance as Microsoft shared possible Copilot avatars.

OpenHands released OpenHands LM 32B, a SOTA coding-focused AI agent model fine-tuned from Qwen 2.5 32B, alongside an agent optimized for tool use and planning. The open-source model scored 37.2% on the SWE-Bench Verified coding benchmark and placed second on the Live SWE-Bench leaderboard. It can run locally for accessible high-performance coding assistance, and a cloud version is also available.

Nomic AI released the Nomic Embed Multimodal 3B and 7B embedding models for visual document retrieval. These models, built on Alibaba’s Qwen2.5-VL, achieve state-of-the-art performance by embedding interleaved text-image sequences for PDFs and complex webpages. The 7B model has open weights available on Hugging Face.

Amazon announced Nova Act Agent, an AI agent platform designed to perform actions in a web browser. Amazon shared benchmark results and claims that Nova Act, powered by Amazon’s Nova LLM, outperforms competitors such as Claude 3.5 and OpenAI’s Operator. Nova Act is in research preview only, with access provided via an SDK for developers who sign up for Amazon Nova.

Amazon is testing a new agent-based "Buy for Me" feature that uses Amazon's Nova AI and Anthropic's Claude to autonomously purchase items from other sites, based on user requests, without leaving the Amazon app. The shopping AI agent is currently being tested with a limited number of users.

AI cliff notes: Amazon's new Kindle "Recaps" feature uses AI to generate plot summaries for book series. These AI-generated recaps aim to help readers recall key details before continuing a series. Concerns about accuracy have surfaced as users begin to use the feature on their Kindle devices.

Beyond Amazon and Microsoft, there are many AI agent developments and releases this week:

Cognition AI announced Devin 2.0, with enhanced collaboration and lower pricing. The updated AI software engineer product features a new integrated development environment (IDE) experience, collaborative task planning, improved code search, and automatic documentation. Devin 2.0 now starts at $20 per month, making its price more competitive with tools like Cursor.
Zencoder unveiled next-generation AI coding and unit testing agents. Zencoder differentiates itself from competitors like GitHub Copilot by integrating directly into existing developer workflows and offering autonomous "Coffee Mode" functionality.
AugmentCode launched Augment Agent, an AI coding assistant that helps developers manage and modify large, complex codebases. Augment Agent boasts a 200,000-token context window and real-time code synchronization, achieving top scores on the SWE-bench Verified benchmark.
Genspark unveiled Super Agent, a fast, autonomous system for real-world tasks. Super Agent uses nine LLMs, 80+ tools, and datasets to handle complex workflows like booking travel, creating videos, and even generating a South Park episode.
Uplimit launched AI agents offering personalized training, progress monitoring, and 24/7 support, to rapidly upskill employees and address skill gaps. Early customers like Procore and Databricks report significant efficiency gains and high course completion rates using Uplimit's AI upskilling.

Midjourney has launched Midjourney V7, with new features like voice input for prompts and a new "Draft Mode" for faster image generation. V7 has improved image quality and coherence. Initial user reactions are mixed regarding overall improvements, partly because other AI image generation models have caught up with it and surpassed it in some areas like text generation.

Figure 2. Midjourney V7 image generations show great aesthetic quality, but other image models surpass it in text generation and prompt adherence.

Hailuo AI announced the MiniMax Speech-02 TTS API, a text-to-speech solution that delivers advanced voice cloning and nuanced emotional control. This TTS model is engineered to produce highly realistic synthetic voices suitable for various applications, positioning it as a competitive option for speech synthesis.

Sam Altman announced changes to OpenAI release plans:

We are going to release o3 and o4-mini after all, probably in a couple of weeks, and then do GPT-5 in a few months.

OpenAI also plans to launch an open language model, their first since GPT-2, in the coming months.

Google's NotebookLM can now find its own web sources to summarize and narrate topics. Users can describe a topic and the "Discover" feature gathers and recommends relevant web sources with summaries. Sources can be saved within NotebookLM for citations, research, note-taking, and question-answering.

Genies released user-generated content tools for custom AI avatars. Genies' new tools let anyone create AI avatars, fashion, props, and experiences with interoperability across platforms.

Oumi released HallOumi, an open-source claim verification model to combat AI hallucinations. HallOumi analyzes AI-generated content sentence-by-sentence, cross-referencing against source documents to verify claims, providing detailed analysis of potential inaccuracies.

Articul8, spun out of Intel in 2024, introduced A8-SupplyChain, a series of domain-specific AI models for manufacturing supply chains, along with ModelMesh, an AI-powered orchestration layer.

Sentient released Open Deep Search, an open-source AI search framework. ODS equips LLMs with reasoning agents using web search to answer questions, rivaling proprietary tools like Perplexity and ChatGPT Search.

On April 1, OpenAI introduced a new grumpy “Monday” voice in advanced voice mode for ChatGPT. The date of release was deliberate.

AI Research News

OpenAI has released PaperBench, a benchmark of over 8,300 tasks that evaluates an AI’s ability to replicate machine learning research papers from scratch. They shared results of a study of AI model performance on PaperBench tasks in the paper PaperBench: Evaluating AI’s Ability to Replicate AI Research. In it, the top-scoring AI model Claude 3.5 Sonnet achieved a 21% replication score compared to 41% by human PhDs.

A new study lends credence to allegations that OpenAI trained its AI models on copyrighted content. Researchers developed a method to identify training data "memorized" by OpenAI's models, like GPT-4, by testing for memorization of high-surprisal words. The results suggest that GPT-4 memorized portions of fiction books and New York Times articles.

University of Hong Kong researchers released Dream 7B, an LLM that generates text using diffusion rather than autoregression. Benchmarks indicate that Dream 7B competes with top 7–8B models and excels on some reasoning tasks like Sudoku. The model weights have not yet been released, and while additional details on its architecture have been shared, they have not been formally published.

AI Business and Policy

OpenAI raised $40B at a $300B post-money valuation. Part of the astonishing valuation is due to OpenAI garnering a huge user base. OpenAI's new image-generation feature is experiencing massive popularity, with over 130 million users generating 700 million images since its March 25 launch.

Microsoft has paused or delayed data center projects in several locations, signaling caution about rapid cloud infrastructure expansion. The move may reflect concerns about demand or construction challenges, despite previous plans to invest heavily in AI data centers.

Anthropic introduced Claude for Education, an AI assistant designed to develop students’ critical thinking skills through Socratic questioning. Perhaps in response, OpenAI is offering free ChatGPT Plus to US and Canadian college students, putting the two companies in direct competition for the education market.

American Express is using generative AI to improve internal IT support and travel recommendations. Their IT chatbot resolves 40% more queries without human intervention by offering interactive, personalized assistance.

Google released API pricing for Gemini 2.5 Pro, its AI reasoning model. Gemini 2.5 Pro costs $1.25 to $2.50 per million input tokens and $10 to $15 per million output tokens depending on prompt size, a lower price than comparable Anthropic and OpenAI models. Google is increasing rate limits and has reported strong user interest in the SOTA Gemini 2.5 Pro model.

At the same time, Google's accelerated AI model release cadence raises transparency concerns, because the company has not published "model card" safety reports for the latest Gemini releases. Google is prioritizing speed over transparency and cites the experimental nature of some releases, despite their heavy use.
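
To make the tiered Gemini 2.5 Pro pricing quoted above concrete, here is a rough cost estimator. It is only a sketch: the function name is made up for illustration, and the 200,000-token cutoff between the lower and higher price tiers is an assumption, since the pricing above only says that the rate depends on prompt size. Check Google's current price sheet before relying on the numbers.

```python
def gemini_cost_usd(input_tokens, output_tokens, long_prompt_threshold=200_000):
    """Estimate the USD cost of one request under the tiered prices quoted above.

    ASSUMPTION: the higher tier applies when the prompt exceeds
    long_prompt_threshold tokens. The 200,000 default is illustrative only.
    """
    long_prompt = input_tokens > long_prompt_threshold
    # Prices are in USD per one million tokens.
    input_rate = 2.50 if long_prompt else 1.25
    output_rate = 15.00 if long_prompt else 10.00
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # A short prompt: 10,000 tokens in, 2,000 tokens out.
    print(f"${gemini_cost_usd(10_000, 2_000):.4f}")   # $0.0325
    # A long prompt: 300,000 tokens in, 2,000 tokens out.
    print(f"${gemini_cost_usd(300_000, 2_000):.4f}")  # $0.7800
```

Under these assumptions, output tokens dominate the bill for short prompts, while a single long-context request costs roughly as much as two dozen short ones.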

AI Opinions and Articles

The dizzying pace of AI release news leaves little space for AI opinions this week. AI is advancing rapidly and changing everything, so it’s best to avoid dug-in opinions. Instead, use AI, learn about AI (like with this newsletter), and if you can, build AI, then let experience inform your conclusions.

Our readership has been steadily growing, reaching 1,000 subscribers for this newsletter. So, for all those who got this far, thank you! Thank you for making my efforts worthwhile and helping me reach this milestone. Please share with others and share your feedback on how I can make the “AI Changes Everything” Substack better. I welcome your thoughts.

AI Changes Everything