AI Week in Review 25.10.18
Claude Skills, Claude Haiku 4.5, Veo 3.1, Sora 2 update, DGX Spark, World Labs RTFM, Microsoft Windows 11 Copilot, Grounding with Google Maps, Pinterest fights AI slop, ChatGPT loosens up with erotica.

Top Tools
Anthropic introduced Agent Skills, aka Claude Skills, a way to package custom instructions and workflows that Claude can load on demand when relevant, enabling specialized workflow steps without bloating context. Skills encode team-specific work processes (for coding, report formats, or analysis) and are manageable via settings and an SDK. These skills are composable and portable, so they can be stacked into custom workflows and used across Claude apps, Claude Code and their API.
Anthropic published implementation guidance alongside the announcement, including instructions for creating custom Skills, and provides pre-built Skills, such as ones for handling Excel and PowerPoint files, in a Skills repository. Skills extend agentic capabilities for AI models in a way complementary to MCP, and since the Skills repo is open-source, it could gain traction the way MCP servers have.
Skills are available as a feature preview for users on Pro, Max, Team, and Enterprise plans.
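To make the packaging concrete, here is a hypothetical sketch in Python that scaffolds a minimal Skill folder. Per Anthropic's announcement, a Skill is a directory containing a SKILL.md file whose YAML frontmatter (name, description) tells Claude when to load it; the skill name and instructions below are invented for illustration.

```python
from pathlib import Path

# Illustrative SKILL.md content. The frontmatter fields follow Anthropic's
# announced format; the "weekly-report" skill itself is a made-up example.
SKILL_MD = """\
---
name: weekly-report
description: Formats engineering status updates into our weekly report template.
---

# Weekly Report Skill

When the user asks for a weekly report:
1. Group updates by project.
2. Lead each section with a one-line summary.
3. End with a "Risks" list.
"""

def scaffold_skill(root: str, name: str) -> Path:
    """Create <root>/<name>/SKILL.md and return its path."""
    skill_dir = Path(root) / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(SKILL_MD)
    return path
```

Because a Skill is just files on disk, it is easy to version-control and share, which is what makes the open Skills repository model plausible.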
AI Tech and Product Releases
Anthropic launched Claude Haiku 4.5, emphasizing its use as a fast, cost-efficient AI model for scaled sub-agent deployments and extended thinking. Haiku 4.5 excels on agentic and coding benchmarks, scoring 73.3% on SWE-bench Verified and 41% on Terminal-Bench, performing close to Claude Sonnet 4 overall. Its performance, speed, and reasonable cost position Haiku 4.5 as a useful model for agentic AI. Anthropic stated:
Sonnet 4.5 can break down a complex problem into multi-step plans, then orchestrate a team of multiple Haiku 4.5s to complete subtasks in parallel.
For developers building AI agents on Claude models through an API, AWS added the latest Claude models - Opus 4.1, Sonnet 4.5, Haiku 4.5 - to Amazon Bedrock. API pricing for Haiku 4.5 is $1/$5 per million input and output tokens.
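The economics of that orchestration pattern are easy to sanity-check. A quick back-of-envelope sketch, using the $1/$5 per million token pricing quoted above; the token counts per sub-agent are hypothetical.

```python
# Back-of-envelope cost check using Haiku 4.5's quoted API pricing:
# $1 per million input tokens, $5 per million output tokens.
HAIKU_INPUT_PER_MTOK = 1.00   # USD
HAIKU_OUTPUT_PER_MTOK = 5.00  # USD

def haiku_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one Haiku 4.5 call."""
    return (input_tokens * HAIKU_INPUT_PER_MTOK
            + output_tokens * HAIKU_OUTPUT_PER_MTOK) / 1_000_000

# Hypothetical fleet: 50 parallel sub-agents, each reading 8k tokens
# of context and writing a 1k-token result.
fleet_cost = 50 * haiku_cost(8_000, 1_000)  # about $0.65 for the whole fan-out
```

At roughly a cent per sub-agent call in this scenario, wide parallel fan-outs stay cheap, which is the point of pairing a planner like Sonnet 4.5 with many Haiku workers.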
Google released Veo 3.1 and Veo 3.1 Fast, adding richer native audio, improved narrative control, reference-image guidance, and scene extension for longer videos. Veo 3.1 introduces first-to-last-frame transitions and an ‘ingredients’ feature that can blend multiple reference images into a video. Reviews of Veo 3.1 credit it with enhanced audio quality, better image-to-video adherence, and better consistency and control in video generation. Veo 3.1 is in paid preview via the Gemini API and available through AI Studio, Vertex AI, the Gemini app, and Flow.
OpenAI shipped a Sora 2 update, increasing the default Sora 2 video length to 15 seconds, enabling 25-second generations, and launching storyboards for Pro users. Storyboards let you sketch out your video second by second, enabling closer control over video generations.
Sora 2 AI video generations have also created a backlash around copyright issues and depictions of public figures. OpenAI paused the generation of Sora 2 videos resembling Martin Luther King Jr., following a request from his estate due to “disrespectful depictions.” The company stated public figures and their families should control their likeness in AI-generated content.
Baidu’s MuseStreamer now generates videos longer than 20 seconds. MuseStreamer supports real-time interactive long-form video generation, giving users greater control to rewrite storylines or extend transitions in video generation.
World Labs released RTFM, Real-Time Frame Model, a real-time generative world model that renders 3D-consistent scenes interactively. The RTFM blog post lays out the model’s aims – efficiency, scalability, and persistence (consistency over time) – and shows how RTFM accomplishes the feat of putting an intensive world model on a single H100. Their public demo lets users explore generated worlds in real time. RTFM is still a toy, not a tool, but it’s getting closer to real-world utility.
OpenAI updated memory management in ChatGPT. OpenAI is automatically managing memory in ChatGPT to address “memory full” issues, with the aim to improve reliability and reduce user-facing memory limits.
Microsoft is “making every Windows 11 PC an AI PC” with OS-level Copilot voice input. Microsoft announced a new Copilot agent integrated at the OS level in Windows 11, enabling background, voice-driven task execution in a secure sandbox. The feature targets agentic workflows that operate while users continue other work.
The magic unlock with Copilot Voice and Copilot Vision is the ease of interaction. Using the new wake word, “Hey Copilot,” getting something done is as easy as just asking for it. And with your permission, Copilot Vision can analyze what’s on your screen helping you learn new apps, get recommendations for a project or show you how to do it.
Facebook is rolling out a new feature to help users select the best pictures and videos from their camera roll and share them. This AI-powered automation saves time and effort in curating posts or Stories by offering a solid starting point.
Following backlash over an increase in AI slop, Pinterest added new tools allowing users to limit how much AI-generated content they see. Users can now personalize feeds to restrict GenAI imagery in select categories like beauty and art, with more prominent content labels. Controls that empower users to tailor their AI exposure are a great idea that should be picked up by other social media platforms.
Google added “Grounding with Google Maps” to the Gemini API, enabling developers to ground AI model responses in up-to-date Google Maps geospatial data. This can help with detailed localized questions, and Maps grounding can be combined with Search grounding to further improve factuality. A Google AI Studio demo is available for remixing.
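As a rough sketch of how this might look to a developer, the snippet below builds the kind of request body that combines Maps grounding with Search grounding. The tool field names are assumptions based on the announcement, not verified against the API; consult the Gemini API reference for the authoritative schema.

```python
# Hypothetical sketch of a Gemini API request body that combines Maps
# grounding with Search grounding. Field names for the tools are assumed
# from the announcement; check the official Gemini API docs before use.
def grounded_request(prompt: str) -> dict:
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "tools": [
            {"google_maps": {}},    # ground answers in Google Maps data (assumed name)
            {"google_search": {}},  # optionally combine with Search grounding
        ],
    }

req = grounded_request("Which cafes near Union Square are open after 9pm?")
```

Combining both tools in one request is what lets the model answer a localized question with current place data while still checking broader facts against Search.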
Google has updated their AI Studio for an improved developer and user experience. They introduced a single, unified Playground workspace for Gemini, generative media models, text-to-speech and Live models. They also made AI model switching easier and refined the Chat UI for better consistency.
NVIDIA has started shipping the DGX Spark, a compact personal AI supercomputer for local inference and prototyping. The DGX Spark desktop workstation quickly sold out. An independent deep-dive from LMSYS reviewed its specs and performance:
On the GPU side, the GB10 delivers up to 1 PFLOP of sparse FP4 tensor performance, placing its AI capability roughly between the RTX 5070 and 5070 Ti. The standout feature is its 128 GB of coherent unified system memory, shared seamlessly between the CPU and GPU. This unified architecture allows the DGX Spark to load and run large models directly without the overhead of system-to-VRAM data transfers.
With that unified memory, DGX Spark can run far larger AI models than any other desktop PC.
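That claim is easy to sanity-check with arithmetic: at FP4, weights take roughly half a byte per parameter, so the 128 GB of unified memory bounds the model size. The overhead fraction below is an assumed allowance for activations, KV cache, and the OS, not a measured figure.

```python
# Rough estimate of the largest model that fits in DGX Spark's 128 GB of
# unified memory with 4-bit (FP4) weights, i.e., 0.5 bytes per parameter.
UNIFIED_MEMORY_GB = 128
BYTES_PER_PARAM_FP4 = 0.5

def max_params_billions(memory_gb: float = UNIFIED_MEMORY_GB,
                        overhead_fraction: float = 0.25) -> float:
    """Billions of FP4 parameters that fit after reserving overhead_fraction
    of memory for activations, KV cache, and the OS (assumed, not measured)."""
    usable_bytes = memory_gb * 1e9 * (1 - overhead_fraction)
    return usable_bytes / BYTES_PER_PARAM_FP4 / 1e9

# With 25% reserved, roughly a 192B-parameter model fits in FP4 weights.
```

Even with a generous overhead reserve, that is far beyond what a consumer GPU's 16–32 GB of VRAM can hold, which is the unified-memory argument in a nutshell.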
OpenAI CEO Sam Altman said they will loosen ChatGPT restrictions (in an age-gated way) and ChatGPT will support mature/erotica content for verified adults starting in December. The intent is to have more customizable personalities reminiscent of GPT-4o’s style, and to treat adult users differently from minors while maintaining safeguards.
Cognition launched SWE-grep and SWE-grep-mini, two RL-trained, multi-turn context-retrieval models designed for agentic code search. The models surface relevant code rapidly for agents, powering a new Windsurf “Fast Context” subagent, and they use high parallelism to reduce search latency dramatically.
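The parallelism idea is easy to illustrate. The toy sketch below fans a substring search out across files concurrently instead of scanning them one by one; SWE-grep itself is an RL-trained retrieval model, not a literal grep, so this is only a loose analogy for why parallel fan-out cuts latency.

```python
# Toy illustration of parallel code search: many narrow searches run
# concurrently rather than one slow sequential scan. Not SWE-grep's
# actual mechanism, just the latency-hiding idea behind it.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def search_file(path: Path, needle: str) -> list[tuple[Path, int, str]]:
    """Return (path, line_number, line) for every line containing needle."""
    hits = []
    try:
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if needle in line:
                hits.append((path, lineno, line.strip()))
    except OSError:
        pass  # skip unreadable files
    return hits

def parallel_search(paths: list[Path], needle: str, workers: int = 16):
    """Search all paths concurrently and flatten the per-file hit lists."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: search_file(p, needle), paths)
    return [hit for hits in results for hit in hits]
```

For I/O-bound file reads, threads like these overlap the waiting; the per-file results then come back in a single flattened list an agent can rank.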
AI Research News
Google introduced DeepSomatic, an open-source AI model that classifies cancer variants. DeepSomatic differentiates inherited from somatic variants (notably indels) and is reported to outperform existing methods on complex samples.
Google released Cell2Sentence-Scale (C2S-Scale) 27B, a Gemma-based AI model for cancer research, developed in a collaboration with Yale. The Cell2Sentence (C2S) framework represents RNA sequence information as textual “cell sentences,” and the C2S-Scale model was trained on over one billion tokens of transcriptomic and biological data. Researchers used C2S-Scale for single-cell analysis to identify a potential cancer-therapy pathway that was validated in wet-lab experiments. Research paper pre-print and model have been released.
Google DeepMind has partnered with Commonwealth Fusion Systems (CFS) to accelerate fusion energy development through learned plasma control. Their AI, utilizing deep reinforcement learning and the TORAX simulator, optimizes plasma control and tokamak performance to bring clean, limitless fusion power closer to reality.
AI Business and Policy
OpenAI announced a partnership with Broadcom to co-develop custom AI chips. OpenAI will design accelerators and systems, with Broadcom developing and deploying them starting in H2 2026; the program targets 10 GW of capacity to support future frontier models and inference at scale. The OpenAI–Broadcom collaboration diversifies OpenAI’s compute stack and adds capacity, signals a shift toward open infrastructure choices via Broadcom’s Ethernet-based networking, and highlights the rise of specialized chips for AI inference.
Nvidia said Meta and Oracle will standardize on using Spectrum-X Ethernet for AI data center networks, citing training efficiency and scale as drivers for choosing the open, accelerated Ethernet architecture.
Spotify announced deals with major record labels to develop new “responsible” AI products that aim to ensure fair artist compensation, respect copyrights, and allow artists to choose if their music is used with AI tools. The company is investing in an AI research lab to build technologies that center on artist choice and compensation.
Apple unveiled the M5 chip, with claims of 4x peak AI performance over M4, targeting next-gen on-device AI workloads. The M5 will be integrated into new MacBook Pro, iPad Pro and Vision Pro releases, with availability in the coming weeks.
California has enacted AI online-safety laws, including an AI chatbot disclosure requirement for minors. Gov. Newsom signed SB 243, requiring AI chatbots to clearly disclose that they are AI in contexts where users (especially minors) might mistake them for humans. California’s recently passed AI transparency law will also come into effect soon; a note from law firm Mayer Brown explains the AI Transparency Act and its compliance obligations.
AI Opinions and Articles
The essay “AI is reshaping business” by Jared Spataro, Microsoft CMO of AI at Work, outlines how organizations are reorganizing around AI as leading “frontier” AI-era firms become human-led and AI-operated. He makes key points on what the impact of AI on business organization looks like:
The cost of specialization collapses: “Agents grounded in a firm’s specialized knowledge of a product, market, or function can be spun up quickly and plugged directly into the organization’s data, systems, and guardrails.”
Work is redesigned for human–agent collaboration: “As digital workers are woven into every function, the context shifts—from a world designed for humans to process information, to one optimized for agents.”
Knowledge compounds like interest: “Agents create a new kind of knowledge loop at a speed no human system can match.”
An earlier essay, “The CEO’s guide to building a Frontier Firm,” explains further what it means to be AI-first, including what human-led and AI-operated means. Taken together, the two essays present a vision of radical organizational change as AI is adopted in the enterprise, and it is a convincing one.