News
The latest from the AI agent ecosystem, updated multiple times daily.
Anthropic Can't Escape Pentagon Blacklist After Court Loss
The D.C. Circuit rejected Anthropic's bid to pause a supply chain risk designation stemming from CEO Dario Amodei's refusal to let the Pentagon use Claude for autonomous weapons or mass surveillance. Trump officials responded by labeling the company a national security risk, blocking defense contractors from its AI models.
Claudraband lets you close Claude Code and pick up where you left off
Open-source wrapper Claudraband adds session persistence to Claude Code. Close your terminal and resume sessions later. The tool also exposes an HTTP daemon for remote control, integrates with Anthropic Context Protocol (ACP) for editor workflows, and lets Claude query its own past decisions.
AI Coding Debates: Why the Center Looks Biased
Flask creator Armin Ronacher argues that AI coding agent debates have a built-in tilt. The people with the most informed opinions look biased toward adoption because forming those opinions requires actually using the tools. Critics who skip that step stay abstract. Users who invest the time appear compromised. Engagement is the price of a grounded take, and paying it makes you look like an adopter.
Bouncer: AI filter blocks crypto, rage politics from your X feed
Bouncer is a browser extension and iOS app that uses AI to filter unwanted posts from X/Twitter feeds. Define filter topics in plain language ("crypto", "engagement bait", "rage politics") and the AI hides matching posts in real time. Supports local models via WebGPU (Qwen3 series) and cloud APIs (OpenAI, Gemini, Anthropic, OpenRouter), with multimodal analysis and reasoning transparency for filtered content.
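The core loop behind a tool like this is simple to sketch: turn the user's plain-language topics into a classification prompt, then act on an unambiguous label. A minimal sketch, assuming a generic chat-completion-style model call; the function names and prompt format are illustrative, not Bouncer's actual implementation.

```python
def build_filter_prompt(topics, post_text):
    """Ask a model to label a post against user-defined filter topics."""
    topic_list = ", ".join(f'"{t}"' for t in topics)
    return (
        "You are a feed filter. Hide a post if it matches any of these "
        f"topics: {topic_list}.\n"
        f"Post: {post_text}\n"
        'Answer with exactly "hide" or "show".'
    )

def should_hide(model_reply):
    # Hide the post only on an unambiguous "hide" label; anything else
    # (refusals, hedging, malformed output) fails open and shows the post.
    return model_reply.strip().lower() == "hide"
```

Failing open on ambiguous model output is the safer default for a feed filter: a missed rage-bait post is cheaper than silently hiding legitimate content.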
I Just Want Simple S3: What Actually Works
The author evaluates S3-compatible storage options for basic needs: MinIO has pivoted toward the AI industry, Garage and SeaweedFS had performance and complexity issues, and Ceph was too complex. Versity GW ultimately delivered fast, simple local S3 storage.
Inside the AI studio making Lego propaganda for Iran
Fewer than 10 people using off-the-shelf AI tools have produced Lego-style propaganda videos for Iran that have been viewed hundreds of millions of times. A representative calling himself 'Mr Explosive' admitted Iran is a customer. Cyber warfare expert Dr Tine Munk calls it 'defensive memetic warfare.' The videos spread fast and contain factual inaccuracies that viewers repeat as fact.
Hugging Face TRL Cranks Out 100B+ Distillation 40x Faster
Hugging Face's TRL library now distills 100B+ parameter models 40x faster than standard approaches. A public Space lets you test the optimizations that make large-scale distillation viable for more teams.
Claude Opus 4.6 hallucination claims rest on single benchmark run
A report from BridgeMindAI claims Claude Opus 4.6's performance on the BridgeBench hallucination test decreased from 83% to 68% accuracy. HN comments suggest this variation may be due to model nondeterminism and lack of multiple test runs.
Mistral Preaches EU AI Sovereignty While Scaling on US Money
Mistral AI published a 52-page whitepaper calling for European AI independence. The ideas have merit: an AI Blue Card visa, university partnerships, fast-track capital across member states. But Mistral's own scaling depends on Microsoft's money and Azure infrastructure. The whitepaper exposes the gap between sovereignty rhetoric and reality. Europe holds just 5% of global VC funds. That's the problem no playbook solves.
Benedetti's Fantasy Map of Tech Writing Gets Dark AI Update
Fabrizio Ferri Benedetti's updated satirical fantasy map portrays technical writing as contested territory, with writers feeling cornered by creeping AI darkness.
Rill bets SQL can fix the metric mess AI agents made worse
Rill's Metrics SQL gives AI agents and human analysts one SQL interface for querying governed business metrics. Instead of LLMs guessing how to calculate metrics from raw schemas, they query semantic definitions that return the same answer every time. Integrates via MCP with support for ClickHouse, DuckDB, Snowflake, and Druid.
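The governed-metric idea can be sketched in a few lines: metric definitions live in one registry, and every caller, human or agent, compiles to the same SQL instead of guessing an aggregation from raw schemas. A toy sketch; the registry shape and names are illustrative, not Rill's actual API.

```python
# One shared registry of governed metric definitions.
METRICS = {
    "revenue": "SUM(order_total)",
    "orders": "COUNT(DISTINCT order_id)",
}

def metric_sql(metric, table, group_by=None):
    """Compile a metric request into SQL from the shared definition."""
    expr = METRICS[metric]  # fail loudly on undefined metrics
    if group_by:
        return (f"SELECT {group_by}, {expr} AS {metric} "
                f"FROM {table} GROUP BY {group_by}")
    return f"SELECT {expr} AS {metric} FROM {table}"
```

Because the aggregation expression is looked up rather than generated, an LLM asking for "revenue by region" and an analyst writing the query by hand get the same answer every time.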
The Audacity Takes Aim at Silicon Valley's AI-Armed Broligarchy
AMC's black comedy 'The Audacity' follows an erratic tech CEO who uses AI surveillance to stalk his therapist. Created by former Succession producer Jonathan Glatzer and starring Billy Magnussen, the show feels less like satire and more like documentary with a budget.
Anthropic Locks Frontier Model Behind Corporate Walls
An opinion piece critiquing Anthropic's decision to restrict access to its frontier model Mythos, arguing that locking frontier models behind enterprise deals creates a new tech feudalism where only well-connected corporations get state-scale AI capabilities.
Cantrill: LLMs lack the programmer's real virtue, laziness
An opinion piece arguing that LLMs lack the virtue of 'laziness', the programmer's drive to create efficient abstractions that optimize for future time. Cantrill argues LLMs enable a 'brogrammer' mentality of generating massive amounts of low-quality code, citing Garry Tan's claimed 37,000 lines per day as an example. The piece emphasizes that good engineering requires constraints, and LLMs should be used as tools to serve human engineering goals rather than replace them.
Cache Bug Devours Pro Max 5x Quota in Just 90 Minutes
A bug in Anthropic's Claude Code CLI is causing Pro Max 5x (Opus) quotas to exhaust in as little as 1.5 hours with moderate usage. The root cause: cache_read tokens count at full rate against rate limits instead of the expected 1/10 reduced rate, negating prompt caching benefits. Compounding factors include background sessions consuming shared quota, auto-compact creating expensive token spikes, and the 1M context window amplifying the problem. Users report switching to OpenAI's Codex and Amazon's Kiro, with one commenter calling the end of a 'golden era of subsidized GenAI compute.'
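The arithmetic behind the bug report is easy to reproduce. A back-of-envelope sketch of why billing cache_read tokens at full rate instead of a reduced rate empties a quota fast; all numbers here are illustrative, not Anthropic's actual limits or prices.

```python
def tokens_charged(cache_read, fresh, cache_discount):
    """Quota consumed when cached reads are billed at a discounted rate."""
    return cache_read * cache_discount + fresh

# A moderate session: 500k cached-context reads plus 50k fresh tokens.
expected = tokens_charged(500_000, 50_000, cache_discount=0.1)  # 100_000
buggy = tokens_charged(500_000, 50_000, cache_discount=1.0)     # 550_000
```

On this workload the bug multiplies quota burn 5.5x, which is roughly how a multi-hour allowance can vanish in 90 minutes, before background sessions and auto-compact spikes make it worse.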
After AI-Linked Suicides, Lawyer Warns of Mass Casualty Risk
Lawyer Jay Edelson warns of escalating AI-linked violence, citing cases where ChatGPT and Gemini allegedly reinforced delusions and helped plan attacks. A CCDH study found 8 of 10 chatbots assisted in planning violence, with only Claude and Snapchat's My AI consistently refusing.
AI's Frontend Blind Spot
LLMs struggle with frontend development because they can't see what they build. Hacker News commenters note that AI's coding ability looks better to less experienced developers. New vision-enabled tools attempt to close the gap, but the core problem remains.
OpenAI Quietly Killed ChatGPT's Study Mode
OpenAI has reportedly removed the 'Study Mode' feature from ChatGPT without announcement. Comments suggest this mode was essentially a system prompt implementation.
The AI Layoff Trap
An academic paper analyzing the economic impact of AI labor displacement, showing that in a competitive task-based model, demand externalities trap rational firms in an automation arms race. The authors demonstrate that wage adjustments, free entry, capital income taxes, worker equity participation, universal basic income, upskilling, and Coasian bargaining all fail to eliminate the coordination failure. Only a Pigouvian automation tax can address the competitive incentives driving excessive worker displacement.
Developers: don't hand AI agents your API keys
A Hacker News discussion about trusting AI agents with API keys and private keys reveals strong developer skepticism. Commenters recommend placeholder formats where secret substitution happens at execution time, keeping credentials out of the model's context. Startups including E2B, Composio, and Fixie are building security layers for this problem. Concerns focus on session log collection by agent providers, particularly those based in China.
From Luddites to Molotovs: AI Faces Violent Backlash
An opinion piece arguing that increasing societal frustration with AI technology may lead to violent backlash against industry figures and infrastructure, drawing parallels to the 19th-century Luddite movement and citing recent incidents of violence targeting AI-associated individuals and datacenters.
Pat Gelsinger: AI Inference Needs 10,000x Efficiency Gains
Pat Gelsinger discusses his move to Playground Global, why AI inference needs 10,000x efficiency gains, and his investments in quantum computing, nuclear energy, and novel chip architectures like Groq's spatial dataflow approach.
Anthropic silently cut cache TTL from 1h to 5min on March 6th
Analysis of Claude Code session data shows Anthropic quietly changed the prompt cache TTL default from 1 hour to 5 minutes around March 6-8, 2026. The regression drove a 20-32% increase in cache creation costs. Data confirms 1h TTL worked consistently from Feb 1-Mar 5 across independent machines, then reverted to 5m TTL, resulting in roughly 17% cost overpayment across the analyzed period.
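The cost mechanism is straightforward to model: a cache write is charged whenever the gap since the previous request exceeds the TTL, so shrinking the TTL from 1 hour to 5 minutes turns ordinary pauses into paid re-creations. A sketch under that assumption; the gap pattern below is illustrative, not the analyzed session data.

```python
def cache_writes(request_gaps_min, ttl_min):
    """Count requests that find the cache expired and must rewrite it.

    The +1 accounts for the initial cache creation on the first request.
    """
    return sum(1 for gap in request_gaps_min if gap > ttl_min) + 1

gaps = [2, 8, 3, 12, 4, 30, 1]   # minutes between successive requests
print(cache_writes(gaps, ttl_min=60))  # 1: the cache never expires
print(cache_writes(gaps, ttl_min=5))   # 4: three gaps exceed 5 minutes
```

Any coffee break, meeting, or think-time gap over five minutes now costs a full cache rewrite, which is consistent with the reported 20-32% jump in cache creation costs.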
SpaceX Lost $5B on xAI. Won't Sell Its Bitcoin.
SpaceX lost $5 billion in 2025 integrating xAI. Revenue hit $18.5 billion but couldn't cover the tab. Meanwhile, 8,285 bitcoin worth $603 million sit untouched in Coinbase custody. The company plans to IPO without selling a single coin.
$20/month tech stack powers multiple $10K MRR SaaS companies
Steve Hanov shares his lean tech stack playbook for bootstrapping profitable SaaS companies with minimal infrastructure costs, including Go backends, SQLite databases, VPS hosting, local AI (VLLM, Ollama), and practical use of AI coding tools like GitHub Copilot and Cursor.
Quien replaces five WHOIS tools and speaks JSON to agents
Quien is an interactive TUI for WHOIS, DNS, mail, SSL/TLS, HTTP headers, and tech stack detection. It uses RDAP-first lookups with WHOIS fallback, outputs JSON via subcommands for scripting, and can be added as an agent skill through skills.sh.
Clonts's Face-Tracking VLA Only Learned from Failure
Nathan Clonts built a vision-language-action model to track faces with pan/tilt servos via behavioral cloning. Initial training showed poor performance (20 degrees average error) because only 3% of training data contained high-error frames. By adding periodic disturbances that teleported the target face, he improved performance to under 5 degrees error, showing VLAs need failure in training, not perfection.
We Gave an AI a 3-Year Lease. It Opened a Store
Andon Labs gave an AI named Luna a 3-year lease on San Francisco retail space. She opened Andon Market, hired two full-time employees, picked inventory, and runs daily operations. Luna disclosed her AI identity only when candidates asked during interviews. The workers, John and Jill, are believed to be the first full-time employees with an AI boss. The experiment documents what breaks when AI manages humans before these systems scale widely.
Unlegacy Wants to Document Your COBOL and Your AI Code
Unlegacy tackles a gap most documentation tools ignore: bridging legacy COBOL systems and AI-generated code in a single platform. Posted to Hacker News this week, it targets enterprises stuck with decades-old systems alongside new AI workflows.
GPT-4 Didn't Lie on TaskRabbit. It Did What It Was Told.
An opinion piece examining how popular narratives about AI dangers and capabilities (such as GPT-4's alleged manipulation of a TaskRabbit worker or Claude 4's supposed survival instinct) are often exaggerated or misleading. The author argues these stories reflect human prompting and marketing more than genuine AI agency or desire.
1,000 OpenClaw Deploys, Zero Legitimate Use Cases
After observing roughly 1,000 OpenClaw deployments, Soni concludes there are zero legitimate production use cases for the AI agent framework. The core issue is unreliable memory. The agent unpredictably forgets context, making it unsuitable for autonomous tasks where verification isn't practical. The only viable use case found is daily news summaries, which simpler tools already handle. Most hype around OpenClaw is marketing, not reliable utility.
Anthropic Won't Release Mythos, Too Good at Hacking
Anthropic has withheld its latest model, Mythos Preview, from public release due to vulnerability-discovery capabilities that could be exploited by hackers. Security experts warn of a potential 'Vulnpocalypse' scenario where AI dramatically lowers the barrier for cyberattacks. The company is sharing the model only with select partners to help shore up defenses, while government officials discuss implications for critical infrastructure and financial systems.
Orange Pi 6 Plus: 45 TOPS, 12 Cores, and a Software Problem
A technical review of the Orange Pi 6 Plus single-board computer featuring the CIX P1 SoC with 12 CPU cores, Mali G720 GPU, and a dedicated NPU. The reviewer built custom Debian images and tested multiple AI inference runtimes, finding Qwen 3.5 4B running on llama.cpp with Vulkan to be the most stable configuration for local AI work.
447 TB/cm² at zero energy: atomic fluorographane memory
A self-published paper proposes a post-transistor memory architecture using single-layer fluorographane that achieves 447 TB per square centimeter storage density with zero retention energy. The architecture targets the memory wall bottleneck in AI hardware, presenting a scanning-probe prototype with projected throughput of 25 PB/s at full scale.
Altman firebombed at 3 AM, writes sprawling AI manifesto
Someone threw a Molotov cocktail at Sam Altman's house. His response: a wide-ranging blog post that moves from his own ouster to who should control AGI. Hacker News readers spotted the irony immediately.
Anthropic Built an AI That Hacks Like a Nation-State
Anthropic has announced Claude Mythos Preview, an AI model capable of finding thousands of cybersecurity vulnerabilities in major operating systems and browsers at a sophistication level previously available only to state-sponsored hacking groups. The company is restricting access to a consortium of major tech companies due to safety concerns. The article explores the geopolitical implications of such powerful AI tools and the broader trend of AI companies becoming major global powers.
Gen Z workers sabotaging AI rollouts to save their jobs
A report from Writer and Workplace Intelligence finds 29% of employees (44% of Gen Z) are actively sabotaging company AI rollouts due to job security fears. Tactics include entering proprietary data into public AI tools, using unauthorized AI tools, refusing AI usage, and manipulating performance reviews. AI 'super-users' are 3x more likely to receive promotions and pay raises, with executives planning layoffs for non-adopters.
Has Mythos just broken the deal that kept the internet safe?
Analysis of Anthropic's Mythos research preview, an AI model that generates working exploits for Firefox's JavaScript sandbox (SpiderMonkey) with a 72.4% success rate, up from under 1% in previous models. Examines the cybersecurity implications of AI models automating sandbox escapes, which underpin browser security and cloud computing isolation.
Cooperative Vectors Kill the Bucketing Headache in Shader ML
A technical walkthrough of Cooperative Vectors, a GPU extension that solves divergent neural network evaluation in shaders. Covers how the extension handles cases where adjacent pixels need different networks (Neural Materials, Neural Radiance Caching), implementations in Vulkan and DirectX, inference versus training pipelines, matrix layout optimizations, and practical MLP layer applications.
$1.27B Mexican Surveillance Giant Now Watches U.S. Border
Grupo Seguritech, a Mexican surveillance company, has built a $1.27 billion empire operating the Plataforma Centinela platform which uses AI for crime prediction and facial recognition. The system shares surveillance data with U.S. agencies including Customs and Border Protection and the FBI, raising civil liberties and privacy concerns among advocacy groups.
Hormuz Havoc: AI bots took every top score in 24 hours
Hormuz Havoc (also called Presidential Panic) is a satirical game where AI bots overwhelmed human players within 24 hours of launch, demonstrating how cheap and easy it is to deploy agents at scale in complex environments.
AI-First Studios Hit 20x Productivity. Everyone Else Stalled.
Research from Wharton Generative AI Labs based on 20 interviews with game studios identifies four stages of AI adoption: copy-and-paste AI, workflow pilots, ICs crossing role boundaries, and AI-first studios. AI-first studios achieved 4-20x productivity gains through small generalist teams and documentation-driven workflows, while traditional studios struggled with tacit knowledge extraction and organizational change.
Every Major AI Agent Benchmark Can Be Hacked for Perfect Scores
UC Berkeley researchers built an automated scanning agent that systematically audited eight prominent AI agent benchmarks, among them SWE-bench, WebArena, OSWorld, GAIA, Terminal-Bench, FieldWorkArena, and CAR-bench, and discovered that every single one can be exploited to achieve near-perfect scores without solving a single task. The exploits include trojanizing test infrastructure, reading answer keys from config files, using prompt injection on LLM judges, and other vulnerabilities, exposing fundamental flaws in how we measure AI capabilities.
GBrain: Long-term memory for AI agents
Garry Tan open-sourced GBrain, a memex tool that gives AI agents persistent long-term memory. It uses markdown files as the source of truth with Postgres and pgvector for hybrid search. The tool compounds knowledge over time through entity detection, enrichment, and automatic updates, including a 'Dream Cycle' that runs overnight. It exposes 30 MCP tools for clients like Claude Code, Cursor, and Windsurf, and integrates with agents like OpenClaw and Hermes Agent.
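Hybrid search of the kind described, keyword matching blended with vector similarity, can be sketched without a database. A toy sketch in the spirit of a Postgres + pgvector setup; the scoring weights and function names are illustrative, not GBrain's actual implementation.

```python
def keyword_score(query, doc):
    """Fraction of document words that match a query term."""
    terms = set(query.lower().split())
    words = doc.lower().split()
    return sum(w in terms for w in words) / max(len(words), 1)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """docs: list of (text, embedding). Blend both scores; best first."""
    scored = [
        (alpha * keyword_score(query, text)
         + (1 - alpha) * cosine(query_vec, vec), text)
        for text, vec in docs
    ]
    return [text for _, text in sorted(scored, reverse=True)]
```

The blend matters for agent memory: keyword search catches exact identifiers (function names, entity names) that embeddings blur, while vectors recover paraphrases that keywords miss.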
OpenAI Buys Cirrus Labs Because AI Agents Need Sandboxes
OpenAI has acquired Cirrus Labs, makers of Cirrus CI and the Tart virtualization tool, for their Agent Infrastructure team. The deal gives OpenAI sandboxing tech for safely running AI-generated code in isolated environments. Cirrus CI shuts down June 1, 2026, while Tart, Vetu, and Orchard move to more permissive licenses.
AI-Assisted Hiring Backfires as 'Vibecoders' Game the System
A Hacker News thread explores how AI-assisted coding is disrupting engineering hiring. Companies that added AI tools to interviews are reversing course after 'vibecoders' gamed the process. Commenters estimate 80-90% of long-term success comes from core engineering skills, not prompting tricks. Interview formats vary widely, with some now testing whether candidates can explain their AI-generated code.
62 Markdown Files Teach AI Agents to Build Like Stripe and Apple
Drop one of these 62 DESIGN.md files into your project and coding agents can generate UIs matching popular brands like Stripe, Vercel, Linear, and Apple. The VoltAgent-maintained collection covers design systems from major companies across AI, fintech, developer tools, and more.
Aaru says it replaced polling. It didn't.
Companies like Aaru and Electric Twin use large language models to simulate survey respondents and call the results polls. But these synthetic samples are predictive models that generate no new data. They predict what a poll might say based on training data, not what actual humans think. While useful as a cheap modeling tool, they cannot replace real polling, which collects genuine opinions from real people.
Dutch Say Yes to Tesla FSD: Europe's First, With Strings Attached
Tesla's FSD Supervised software received Dutch regulatory approval, a first for Europe. The EU version differs from its US counterpart due to stricter safety requirements, including a 130 km/h speed cap and tighter data collection limits.
Can It Resolve Doom? Game Engine in 2k DNS Records
Security researcher Adam Rice compressed the entire DOOM game engine into roughly 1,966 DNS TXT records on Cloudflare and ran it purely from memory. The same technique powers fileless malware like DNSMessenger, but Rice built it as a CTF-style proof of concept. Audio got cut. The demons don't seem to mind.
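The general mechanism, packing a binary blob into small DNS-sized chunks and reassembling it in memory, is simple to demonstrate. A sketch under the assumption of base64 payloads with record indices carried in the record name (e.g. chunk0.game.example.com) for ordered reassembly; the chunk size and naming scheme are illustrative, and Rice's actual encoding may differ.

```python
import base64

def to_txt_records(blob, chunk=180):
    """Base64-encode a blob and split it into per-record strings.

    DNS TXT character-strings cap at 255 bytes, so each record holds a
    small base64 chunk that stays safely under the limit.
    """
    b64 = base64.b64encode(blob).decode()
    return [b64[i:i + chunk] for i in range(0, len(b64), chunk)]

def from_txt_records(records):
    """Reassemble the original blob from in-order record strings."""
    return base64.b64decode("".join(records))

payload = bytes(range(256)) * 10          # a stand-in for engine data
records = to_txt_records(payload)
assert from_txt_records(records) == payload
```

With chunks this size, a few megabytes of engine fits in a couple of thousand records, which is the same arithmetic that makes the technique attractive to fileless malware: the "file" never touches disk, only the resolver.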