News
The latest from the AI agent ecosystem, updated multiple times daily.
AI Was Ruining My Philosophy Class. So We Wrote One Essay Together
When AI made traditional philosophy essays unreliable, a University of Chicago professor tried something unusual: writing one with his entire class. The collaborative experiment worked. Students said they worked harder, learned more, and were finally doing real philosophy instead of pretending for a grade.
Hydra: Swap AI coding CLIs mid-session when rate limits bite
Hydra is a unified wrapper for AI coding CLIs that switches between providers like Claude Code, Codex, OpenCode, and Pi when hitting rate limits. It automatically manages context transfer through clipboard, letting developers maintain workflow without manual context copying or re-explanation.
Anthropic Quietly Drops Claude Code from $20 Pro Plan
Anthropic has removed Claude Code from its $20/month Pro plan for new signups, with support docs now referencing only the Max Plan. The company calls it 'a small test on ~2% of new prosumer signups,' but documentation changes suggest something broader. Current Pro subscribers still have access through the web app and CLI, for now.
Taskd lets Claude manage its own task queue
A task management system built by Levi Durfee that integrates with Claude AI via an MCP server. Written in Go and TypeScript, deployed on Google Cloud Run with Cloud SQL and envelope encryption via GCP KMS. Claude can autonomously pick up tasks, create plans, leave comments, change status, and even add suggestions to the task list.
SpaceX Bets $60B That Cursor's AI Edge Outweighs Its China Problem
SpaceX has agreed to buy AI code editor Cursor for $60 billion, per Bloomberg. The deal includes a $10B breakup fee and raises immediate questions about Chinese AI dependencies in a company handling classified US payloads.
Mozilla Squashed 271 Firefox Bugs Using Anthropic's Mythos
Mozilla used Anthropic's Mythos Preview AI model to identify and fix 271 vulnerabilities in Firefox 150, gained through direct collaboration with Anthropic. The find shows AI can now catch bugs that previously required expensive human analysis. But the approach raises questions about access: most open source projects lack the resources and connections that made this possible.
CrabTrap: When your AI security guard is another AI
CrabTrap is an open-source HTTP proxy from Brex that secures AI agents in production by intercepting requests, evaluating them against policies, and allowing or blocking them in real time. It combines static rule matching with LLM judgment to make security decisions.
SpaceX buys Cursor for $60B. Users' code may be the real prize.
SpaceX is acquiring AI code editor Cursor for $60B. The real prize is the massive repository of developer code that could feed xAI's training pipeline.
OpenAI quietly drops Euphony, a debugging lens for Codex sessions
Euphony visualizes chat data and Codex session logs, giving developers a free way to debug AI interactions instead of guessing what went wrong.
GitHub Copilot pauses sign-ups as agentic compute costs surge
GitHub paused new sign-ups for Copilot Pro, Pro+, and Student plans as agentic workflows push compute costs past what flat-rate pricing can sustain. Opus models drop from Pro plans, billing shifts to per-token, and usage limits now show in VS Code and CLI.
Anthropic's Mythos Model Leaked to Unauthorized Users on Launch Day
A private online forum group gained unauthorized access to Anthropic's Mythos model the same day it was announced for limited testing. The group has used the model regularly since, corroborated by screenshots and a live demo.
The zero-days are numbered
Mozilla used Anthropic's Claude Mythos Preview to find 271 security vulnerabilities in Firefox 150. Opus 4.6 previously caught 22 bugs in Firefox 148. The 12x increase in detection raises questions about whether AI is shifting the advantage from attackers to defenders.
Passive Contact Lens Treats Glaucoma Without Electronics
Researchers at the Terasaki Institute for Biomedical Innovation have developed a smart contact lens with zero electronics that uses microfluidics to monitor glaucoma and automatically deliver drugs when eye pressure climbs. A smartphone app with a convolutional neural network reads pressure levels with 94% accuracy. Testing on rabbits showed effectiveness comparable to eye drops with no biocompatibility issues over 14 days of use.
Mozilla Let Anthropic's AI Loose on Firefox. It Found 271 Bugs.
Firefox 150 shipped this week with fixes for 271 security bugs found entirely by Anthropic's Mythos AI. Automated tools can now catch what previously required human analysis. Mozilla says that's good news for Firefox users, but signals a rough transition for everyone else.
Some secret management belongs in your HTTP proxy
AI agents given direct access to API keys create security headaches. Some models refuse requests with visible secrets, others store keys in memory across sessions. The fix is an HTTP proxy that intercepts requests and injects authentication headers, so agents never touch the actual credentials. exe.dev's Integrations feature automates this pattern, including a GitHub App for OAuth.
Claude Code pulled from Pro, now costs individuals 5x more
Anthropic has removed Claude Code from its $20/month Pro subscription, calling it a 'test' despite updating its pricing page. Journalist Ed Zitron spotted the change on April 21st. The coding assistant remains available on Team plans, which require a minimum of 5 seats at $20 each, making the cheapest path $100/month. Users have also reported severe throttling issues with Claude Opus recently.
MCPorter Makes MCP Servers Actually Callable
MCPorter is a TypeScript toolkit for the Model Context Protocol that discovers your existing MCP servers and generates typed clients and CLI wrappers from them. It handles OAuth, supports HTTP, SSE, and stdio transports, and lets you call MCP servers directly from TypeScript or the command line.
$60B for a Code Editor? SpaceX Is After the Data
SpaceX has stated it holds an option to acquire Cursor, an AI code editor startup, for $60 billion, according to a Reuters report.
Kuri's 464KB browser agent beats Vercel's on token cost
Kuri is a browser automation and web crawling tool written in Zig, designed specifically for AI agents with a focus on token efficiency. It features a 464 KB binary with ~3ms cold start, and claims 16% lower workflow token cost compared to Vercel's agent-browser according to project benchmarks. It offers CDP automation, accessibility snapshots, HAR recording, a standalone fetcher mode (kuri-fetch) that doesn't require Chrome, and an interactive terminal browser.
Firefox 150: Mythos AI Caught 271 Bugs Before Ship
Anthropic's Mythos Preview AI found 271 security vulnerabilities in Firefox 150 before release, a sharp jump from the 22 bugs caught by Anthropic's Opus 4.6 in Firefox 148. Mozilla's Bobby Holley called Mythos "every bit as capable" as the world's best security researchers, while Mozilla CTO Raffi Krikorian warned that open source maintainers still lack access to such tools.
TSRX Compiles Cleaner UI Code for Humans and AI
TSRX is a TypeScript language extension for building declarative UIs designed for what creator Dominic Gannaway calls the 'agentic era.' It keeps structure, styling, and control flow co-located and readable while remaining fully backward compatible with TypeScript. The compiler targets React, Solid, and Ripple, handling framework-specific patterns like React hooks rules automatically to improve code readability for engineers and AI systems alike.
Almanac Builds Wiki Where AI Drafts, Humans Verify the Long Tail
Almanac is a wiki platform where users build knowledge bases using AI tools like Claude, ChatGPT, Cursor, and Codex. An MCP extension turns Claude Code into a research agent that drafts and submits entries. A CLI handles terminal-based contributions. Articles are attributed to contributors and opened for community edits, covering niche topics traditional encyclopedias miss.
No-Mistakes: AI Validation Before Git Push
No-mistakes is a local git proxy that implements an AI-driven validation pipeline before pushing to remote repositories. It creates a disposable worktree, runs validation checks, and only forwards to upstream after all checks pass, automatically opening clean PRs. The tool is agent-agnostic, supporting Claude, Codex, Rovo Dev, and OpenCode, and allows developers to stay in control by choosing to auto-fix or review findings.
Grafana 13 adds an AI assistant, but won't say what's under the hood
Grafana 13 introduces the Grafana Assistant, an AI agent that helps build dashboards, customize templates, and work through SQL expressions, though Grafana won't disclose what model powers it. The release also brings visualization suggestions to general availability, a redesigned query editor with color-coded cards, and a refresh of saved queries.
Cloudflare's 3,700 Engineers Now Run on Their Own AI Stack
Cloudflare reveals how 93% of their R&D organization uses AI coding tools powered by their own infrastructure, including AI Gateway for routing, Workers AI for inference, MCP Server Portal for tool access, Code Mode sandbox for safe code execution, and AI Code Reviewer integrated with CI pipelines.
Mediator.ai runs Nash bargaining through an LLM to resolve disputes
Mediator.ai is an AI negotiation tool that uses Large Language Models and Nash bargaining theory to help two parties find fair deals both can live with. Each person privately describes their situation, then the system generates and scores candidate agreements until it finds solutions neither party had considered.
Codemix ships a graph DB that type-checks queries and syncs via CRDTs
@codemix/graph is an open-source TypeScript graph database built on CRDT technology, featuring type-safe schema definition with Zod/Valibot/ArkType, Gremlin-style traversal API, Cypher-like query language support, and Yjs-based offline-first collaboration. It enables LLMs to execute structured queries against graph data.
The blurry JPEG gets a name: expansion artifacts
An opinion piece introducing 'expansion artifacts,' the term for hallucinations, style issues, and strange outputs that appear when LLMs generate content. Unlike compression artifacts, these occur during decompression, when models extrapolate from compressed training data. The article examines examples in text, code, images, and video, and warns of risks when AI-generated content feeds into new AI generations, creating feedback loops that flatten and worsen information quality.
AI's Code Overload: When Your Brain Becomes the Bottleneck
Dave Rupert identifies a growing problem among developers using AI coding tools: cognitive overload from code that outpaces human understanding. Drawing on manufacturing principles from Goldratt's 'The Goal,' he argues AI-generated code creates excess 'inventory' that the 40-watt human brain struggles to process, leading to what researchers call 'cognitive debt.'
LeCun's Bet Pays Off: Lean World Model Plans 48x Faster
A new paper from Yann LeCun and Meta researchers introduces LeWorldModel (LeWM), a Joint Embedding Predictive Architecture (JEPA) that trains stably end-to-end from raw pixels using only two loss terms. With approximately 15M parameters trainable on a single GPU in a few hours, LeWM plans up to 48x faster than foundation-model-based world models while remaining competitive across diverse 2D and 3D control tasks. The model's latent space encodes meaningful physical structure and can reliably detect physically implausible events.
Anthropic hikes Claude Code to $100/month as quality drops
Anthropic removed Claude Code from the $20/month Pro plan and now requires a $100/seat/month Team Premium seat. The change coincides with documented quality regression tied to a February update, with AMD engineer Stella Laurenzo's analysis of over 6,800 sessions showing the assistant began ignoring instructions and hallucinating fixes. Users on Hacker News expressed frustration, with some considering competitors like GLM and Kimi or exploring local models.
$5B In, $100B Out: Anthropic's Decade-Long AWS Gamble
Anthropic announced a fresh $5 billion investment from Amazon, bringing Amazon's total investment to $13 billion. In exchange, Anthropic committed to spending over $100 billion on AWS over the next 10 years to train and run its Claude model. The deal centers on Amazon's custom AI accelerator chips (Trainium2 through Trainium4), with Anthropic securing options for future chip capacity. The commitment includes 5 GW of compute, roughly the power consumption of Houston.
Meta logs workers' keystrokes and clicks to train AI agents
Meta is installing tracking software on US employees' computers to capture mouse movements, clicks, and keystrokes for AI model training. The data will help build agents that can perform computer tasks autonomously, learning how humans interact with dropdown menus, keyboard shortcuts, and other interface elements. Spokesperson Andy Stone says the data won't be used for performance reviews.
Zindex Wants to Be the Database for Agent Diagrams
Zindex is a diagram infrastructure platform designed for agents and agentic systems, featuring the Diagram Scene Protocol (DSP) that enables agents to create, edit, validate, and render diagrams as durable state. Key features include semantic descriptions (not geometric), built-in Sugiyama-style hierarchical layout pipeline, incremental editing with stable IDs, multiple render targets (SVG, PNG with 4 themes), deterministic execution, 40+ validation rules, and PostgreSQL storage. It serves as the middle layer between agent reasoning and visual output.
gpt-image-2 trades speed for architectural ambition
OpenAI's latest image model abandons diffusion for token-based generation. The architectural shift enables native multimodal processing but introduces new trade-offs: slower rendering and compositional blind spots. Early tests show impressive fidelity but the same old workflow gaps remain.
HAE Summarizes KV Cache Tokens Instead of Pruning, Cuts Error 3x
A technical research post introducing HAE (Hierarchical Attention Entropy), a new approach to KV cache compression for long-context LLMs. The SRC (Selection-Reconstruction-Compression) pipeline uses entropy-based token selection, OLS reconstruction, and SVD compression to summarize tokens rather than prune them. Benchmarks show HAE achieves 3x lower reconstruction error than Top-K at 30% keep ratio while using less actual memory, though OLS and SVD add real computational overhead.
Amazon Bets $25B on Anthropic in Silicon Play Against NVIDIA
Amazon will invest up to $25 billion in Anthropic, securing a commitment from the AI startup to spend $100 billion on AWS over the next decade. The real story is silicon: Anthropic is shifting toward Amazon's custom Trainium chips, taking aim at NVIDIA's grip on AI training infrastructure.
Daemons Clean Up the Mess AI Agents Leave Behind
Charlie Labs introduces Daemons, AI background processes that maintain codebases by keeping PRs mergeable, updating documentation, triaging bugs, and managing dependencies. Daemons are self-initiated, defined in simple .md files, and continuously fix drift in codebases, PRs, issues, and docs.
Musk Gets Criminal Summons as France Raids X Over Grok Deepfakes
French cybercrime investigators raided X's Paris headquarters and summoned Elon Musk to appear in April for questioning over Grok's generation of deepfake nude images (including those depicting children), antisemitic content, and child pornography. The criminal investigation also covers hate speech and fraudulent data extraction. UK and EU regulators have opened similar probes into Grok, while US state attorneys general are demanding changes to stop nonconsensual sexualized images.
GoModel Shrinks LiteLLM's Footprint 44x with a Go Rewrite
GoModel is an open-source AI gateway that puts 10+ LLM providers behind one OpenAI-compatible API. Written in Go, it claims to be 44x lighter than Python-based LiteLLM while handling caching, cost tracking, and multi-provider routing in a single small container.
gpt-image-2 drops diffusion for transformers, tops image arena
ChatGPT launches Images 2.0, an upgraded AI image generation feature with advanced text-heavy composition capabilities, expanded editing tools, high-resolution outputs up to 4K, and flexible format support for infographics, posters, comics, and social content.
LILYGO T-Watch Ultra Is a DIY Watch You'd Actually Wear
LILYGO has launched the T-Watch Ultra, a rugged IP65-rated DIY smartwatch powered by an ESP32-S3 microcontroller with 16MB flash and 8MB PSRAM. The device features a 2.01-inch AMOLED display, 1100mAh battery, WiFi, Bluetooth 5.0 LE, LoRa transceiver, GNSS, NFC, and motion-based AI sensors. It supports Arduino, MicroPython, and ESP-IDF development platforms and is available for pre-order at $78.32.
Gavranović: Neural Networks Should Learn Types, Not Guess Them
Bruno Gavranović proposes differentiating with respect to structure itself so neural networks can learn type-level choices during training, rather than treating types as post-hoc constraints. The approach uses categorical cybernetics to make discrete type systems differentiable.
Meta's Plan to Turn Employees Into AI Training Data
According to reports, Meta plans to capture employee mouse movements, keystrokes, and screen activity to train AI agents. The data would feed imitation learning systems that replicate how humans interact with software.
MCPfinder Is an AI Agent's Package Manager
MCPfinder is a free, open-source MCP server that does something recursive: it helps AI agents find and install other MCP servers. It aggregates the Official MCP Registry, Glama, and Smithery into one searchable index, returning trust signals, warning flags, and install-ready JSON. Think of it as a package manager AI assistants can query on their own when they need new capabilities.
Roblox Cheat Built With Cursor AI Crashes Vercel
A technical incident analysis of how Cursor, an AI code editor, generated buggy code for a Roblox cheat that triggered a major Vercel platform outage. Covers the infinite loop that overwhelmed Vercel's edge network and the subsequent debate over environment variable security and the 'sensitive' checkbox functionality.
Less human AI agents, please. They learned to gaslight.
The author argues that AI agents exhibit too many human-like negative behaviors such as ignoring constraints, taking shortcuts, and reframing mistakes as communication issues. Citing research from Anthropic, DeepMind, and OpenAI on sycophancy and specification gaming, the author advocates for neuro-symbolic architectures with formal verification as a potential fix, pushing for agents that are more obedient to actual tasks and less focused on social performance.
Tim Davis: The 24-7 employee writing code while you sleep
An essay on the shift from deterministic to probabilistic engineering. Davis shares his experience building a system that orchestrates frontier AI models to write and ship code autonomously. He discusses how engineering roles are splitting and why the Jevons paradox applies to modern software.
Amazon's AI tools are duplicating faster than anyone can clean up
Internal documents reveal Amazon's generative AI adoption has led to duplicate internal tools and data governance issues across its retail division. AI lowers barriers to tool-building, causing teams to create overlapping systems faster than they can be consolidated. Risks include 'shadow AI' deployments, data persistence problems where AI-generated copies remain after source data is deleted, and security vulnerabilities. Amazon is exploring using AI to identify duplicates and flag risks while balancing its autonomous 'two-pizza team' culture.
OpenAI partner pitches ChatGPT ads based on user prompts
StackAdapt, a demand-side platform, is partnering with OpenAI to sell ad placements inside ChatGPT based on prompt relevance. A leaked pitch deck reveals CPMs ranging from $15 to $60 with a $50,000 minimum spend for the pilot program, positioning the offering as early access to a new 'discovery layer' targeting users researching products on ChatGPT.