Daily Brief

Generated each morning. Top AI stories of the day, categorised.

DeepSignal — 2026-05-13

Today's 20 highest-signal stories across 3 verticals, curated by DeepSignal.

Finalised. Subscribers will receive this shortly.

20 stories · 3 verticals

Today's Highlights

01 OpenAI launches Codex Cloud Agent for autonomous engineering tasks
OpenAI released Codex Cloud Agent, a sandboxed coding agent that autonomously runs multi-step engineering tasks such as refactors, tests, and PRs.
02 Claude Sonnet 4.5 leads SWE-Bench Verified at 64.2%
Claude Sonnet 4.5 reaches 64.2% on SWE-Bench Verified and adds a 200K-token context option.
03 Meta open-sources Llama 4 Vision, outperforms GPT-4o on chart QA

Today by Vertical

Robotics

Today's robotics stories span new AI capabilities and new investment. DeepMind's Gemini-Robotics performed kitchen tasks such as pouring, plating, and unloading dishwashers zero-shot on two previously unseen robot bodies, which points to generalisation in robotic manipulation (story: "DeepMind shows Gemini-Robotics generalises to unseen kitchen tasks zero-shot"). Separately, OpenAI invested $50 million in ten early-stage robotics startups in areas such as humanoids, manipulation, and tactile sensing (story: "OpenAI invests $50M in 10 robotics startups via new fund"). Together, the capability gains and the funding are relevant to both builders and investors in robotics.

Papers

Two items today describe new methods for improving model performance. A paper on Self-Rewarding Reasoning shows that a single LLM can generate, evaluate, and refine its own reasoning chains, improving its MATH score by 6.4 points after three iterations. An article on Stable-Video-3D describes generating 8-second 1080p videos from text prompts, with motion that adheres to realistic physical dynamics. Together, they point to progress in both reasoning and video generation, with opportunities for builders and investors.

Today's Observations

OpenAI's Codex Cloud Agent automates coding tasks, increasing productivity for developers and reducing time-to-market.
Claude Sonnet 4.5's 64.2% SWE-Bench Verified score indicates rising competition in AI coding tools, impacting developer choices.
Meta's Llama 4 Vision outperforms GPT-4o on chart QA, suggesting a shift in preference towards open-source AI models for image tasks.
DeepMind's Gemini-Robotics shows zero-shot generalisation, enhancing robotics capabilities and reducing training costs for operators.
Cursor's $500M ARR highlights strong enterprise demand for AI coding tools, signaling a lucrative investment opportunity.
Anthropic's Constitutional AI v3 reduces refusal rates by 41%, improving task efficiency for enterprise AI applications.
OpenAI's $50M investment in robotics startups indicates a growing focus on innovative robotics solutions, essential for future tech landscapes.