News
The latest from the AI agent ecosystem, updated multiple times daily.
Gemma 4 Runs in Your Browser at 30 Tokens/Second, No Server Needed
A browser demo runs Google's Gemma 4 E2B entirely client-side using WebGPU, generating Excalidraw diagrams at 30+ tokens/second with no server or API key. TurboQuant compresses the KV cache by 2.4×, and a compact output format cuts generation from ~5,000 to ~50 tokens. It requires desktop Chrome 134+ with WebGPU subgroups and ~3 GB of RAM.
Verkada Told School Cameras Wouldn't Brick. They Do.
An IPVM investigative report alleges that Verkada senior sales executive Mike Schembri misled the Chico Unified School District board about whether cameras would become inoperable if subscription payments stopped. Schembri claimed cameras could continue as 'RTSP dumb cameras,' but IPVM's testing confirmed cameras are locked out when licenses lapse. IPVM describes this as a known sales tactic and examines Verkada's business model of hardware lock-in.
Salesforce Goes Headless: Benioff Bets on Agents, Not Seats
Salesforce announces Headless 360, exposing its entire platform as APIs, MCP tools, and CLI commands for AI agents like Claude Code and Cursor. The initiative shifts from per-seat to consumption-based pricing as agents outnumber humans. It includes Agentforce and Agent Script (an open-sourced DSL for workflows that mix deterministic and probabilistic steps). The piece also explains why Workday and ServiceNow face the same headless choice.
Transformer Shortage Threatens AI Data Center Boom
The US faces a critical shortage of electrical transformers, threatening grid expansion for AI data centers and electric vehicles. The piece covers supply-chain constraints on grain-oriented electrical steel, manufacturing challenges, and policy decisions that made things worse.
Sostactic brings sum-of-squares proofs to Lean4
Sostactic is a collection of Lean4 tactics for proving polynomial inequalities via sum-of-squares (SOS) decompositions, powered by a Python backend using cvxpy for convex optimization. It handles global polynomial nonnegativity, nonnegativity over semialgebraic sets, and emptiness of semialgebraic sets, problems that stump existing Lean tactics. The payoff: formal verification guarantees that engineering tools like SOSTOOLS can't match.
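For a sense of what a sum-of-squares certificate looks like, here is a minimal worked example. It is chosen purely for illustration and is not taken from Sostactic or its documentation:

```latex
2x^2 - 2xy + y^2 \;=\; x^2 + (x - y)^2 \;\ge\; 0 \quad \text{for all } x, y \in \mathbb{R}.
```

Finding a decomposition like this is the numerical search step. Checking it is plain algebra (expand the right-hand side and compare coefficients), which is the kind of step a proof assistant can verify.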
Stop Using Ollama
A critical opinion piece arguing that Ollama, despite being popular for running local LLMs, engages in problematic practices including failing to credit llama.cpp, building inferior custom backends, misleading users about model names, releasing closed-source components, creating vendor lock-in, and shifting to cloud services. The author recommends using llama.cpp directly instead.
Claude Code login lockout leaves users stranded for hours
Windows users are hitting a 15000ms OAuth timeout during Google authentication, completely blocking access to Claude Code. Meanwhile, Anthropic's status page shows everything running smoothly. HN commenters suspect capacity constraints are to blame, with some speculating Anthropic is distilling the model to cut compute costs.
Borges' cartographers and the tacit skill of reading LM output
Gal Sapir argues that LMs are maps of reality, not the thing itself. The most important skill for using them well—knowing when to trust output and when to verify—is tacit, learned through practice, and can't itself be mapped. The paradox is the point.
Uber's AI Push Hits a Wall: CTO Says Budget Struggles Despite $3.4B Spend
Uber Technologies exhausted its AI budget just months into 2026 despite spending $3.4 billion on R&D. CTO Praveen Neppalli Naga says the company is 'back to the drawing board' after AI coding tool usage, particularly Anthropic's Claude Code, exceeded expectations. Engineers were pushed to use tools like Claude Code and Cursor with internal leaderboards tracking usage. While 11% of Uber's backend code updates are now AI-generated, R&D expenses jumped 9% in 2025. HN commenters suggest 'token maxxing' driven by usage-based leaderboards may be inflating costs.
Claude Code OAuth timeouts lock users out for hours
A GitHub issue reports that Claude Code logins on Windows are failing with a 15000ms OAuth timeout error, locking users out. HN comments suggest this may be related to Anthropic's compute capacity being overwhelmed by increased demand, potentially requiring model distillation to maintain service levels.
Two Roommates Built a $300 Robot Vacuum. It Can't Clean.
Two roommates built a camera-only robot vacuum for ~$300 using a CNN for navigation. It doesn't work well. Here's why, and what the HN community suggested to fix it.
Mozilla's Thunderbolt: Open-Source AI Client for Enterprise
Mozilla has released Thunderbolt, an open-source AI client built by the Thunderbird team. It lets organizations self-host their AI infrastructure with support for commercial, local, and open-source models. The name immediately drew criticism for clashing with Intel's Thunderbolt interface and Mozilla's own Thunderbird email client. Under the hood, Thunderbolt uses deepset's Haystack platform with MCP and ACP support for data integration and agent orchestration. Available under MPL 2.0 with native apps for all major platforms.
go-bt tests five-minute timeouts instantly with behavior trees for Go
go-bt is a Behavior Tree library for Go designed for background workers, game AI, and async logic. Nodes return state instantly via magic numbers (1=Success, 0=Running, -1=Failure) and yield to a supervisor. It uses stateless nodes with temporal memory in a generic BTContext[T] that embeds Go's context.Context, and offers clock injection to test temporal logic without actual waiting.
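The clock-injection idea can be sketched in a few lines. This is a minimal Python illustration of the pattern described above, not go-bt's actual Go API: the names Context, timeout, FakeClock, and never_done are hypothetical, and only the status codes (1, 0, -1) come from the summary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Status codes as listed in the summary.
SUCCESS, RUNNING, FAILURE = 1, 0, -1


@dataclass
class Context:
    """Hypothetical context: holds the injected clock and all per-run memory,
    so the nodes themselves can stay stateless."""
    now: Callable[[], float]  # injected clock, returns seconds
    memory: Dict[str, float] = field(default_factory=dict)


def timeout(key: str, limit_s: float, child: Callable[[Context], int]):
    """Wrap a child node; return FAILURE once it has been RUNNING for limit_s."""
    def tick(ctx: Context) -> int:
        # Record the start time on the first tick only.
        start = ctx.memory.setdefault(key, ctx.now())
        status = child(ctx)
        if status != RUNNING:
            ctx.memory.pop(key, None)  # child finished: forget the start time
            return status
        if ctx.now() - start >= limit_s:
            ctx.memory.pop(key, None)  # timed out: reset for the next run
            return FAILURE
        return RUNNING
    return tick


class FakeClock:
    """Test clock that only moves when told to."""
    def __init__(self) -> None:
        self.t = 0.0

    def __call__(self) -> float:
        return self.t

    def advance(self, seconds: float) -> None:
        self.t += seconds


def never_done(ctx: Context) -> int:
    """A child that never finishes, to exercise the timeout path."""
    return RUNNING


clock = FakeClock()
ctx = Context(now=clock)
node = timeout("job", 300.0, never_done)  # five-minute limit

first = node(ctx)    # t = 0 s
clock.advance(299)
second = node(ctx)   # t = 299 s, still under the limit
clock.advance(1)
third = node(ctx)    # t = 300 s, limit reached
```

Because the node reads time only through the context, the test advances five minutes by calling clock.advance instead of sleeping, and each tick returns immediately.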
Gartner: Most AI mainframe migration projects will fail
Gartner predicts over 70% of mainframe exit projects using generative AI will fail due to overestimation of AI capabilities. The analyst firm forecasts that 75% of vendors in the AI-powered mainframe migration market will change course or cease to exist by 2030. While AI helps detect technical debt, it has significant limitations in automated code conversion, particularly around recovering decades of embedded business logic. The report comes after IBM's stock declined when Anthropic promoted Claude Code's COBOL-conversion capabilities.
Claude Code Faces Developer Exodus Over Rate Limits and Quality Cuts
Javier Tordable, former Google engineer and CEO of Pauling.AI, argues that Anthropic has severely degraded Claude Code through aggressive cost-cutting. His critique cites rate limits capping paid plans at 30-60 minutes of work, AMD's analysis of 6,852 session logs showing performance declines, and widespread developer reports of the AI coding assistant becoming unreliable.
Robot crushes half-marathon record in Beijing by 23 minutes
A humanoid robot completed a half-marathon in Beijing 23 minutes faster than the human world record, running the full 21km course alongside human competitors.
Vercel Confirms Breach. The Suspect? ShinyHunters.
Vercel disclosed a breach of its internal systems affecting a 'limited subset of customers,' with online posts linking the intrusion to the ShinyHunters threat group. The company has engaged incident response experts and notified law enforcement. Users are advised to rotate environment variables not marked as sensitive.
Gas Town Accused of 'Stealing' User LLM Credits to Self-Improve
A GitHub issue alleges that Gas Town, Steve Yegge's autonomous AI agent system, uses users' LLM credits and GitHub accounts to fix bugs in the Gas Town project itself and submit PRs upstream without explicit consent. The behavior is reportedly built into default installation via formulas (gastown-release.formula.toml and beads-release.formula.toml) and not disclosed in documentation.
Anthropic Loses Bid to Shed Supply Chain Risk Tag
A federal court denied Anthropic's request to remove its 'supply chain risk' designation, a ruling that threatens the AI company's ability to win sensitive Pentagon contracts.
First Take It Down Act convict kept making AI nudes after arrest
An Ohio man became the first person convicted under the Take It Down Act after pleading guilty to creating and sharing AI-generated explicit images of at least 10 victims without consent. James Strahler II used over 100 AI tools across 24 platforms to create fake sexualized images to harass women and minors. He continued making images even after his initial arrest, with over 2,400 images found on a second phone.
MegaTrain Squeezes 120B Training Into One GPU
MegaTrain lets researchers train models up to 120 billion parameters on a single GPU by offloading everything to host memory and treating the GPU as a transient compute engine. It hits 1.84x the throughput of DeepSpeed ZeRO-3 with CPU offloading for 14B models. For anyone without a GPU cluster, this actually matters.
Claude 4.7 Told to Stop Asking Questions and Just Do the Thing
Simon Willison's teardown of Claude Opus 4.7's system prompt reveals new agent tools (Chrome, Excel, PowerPoint), a tool_search mechanism, and Anthropic telling Claude to stop asking questions and just try the thing.
Gemini Gets a Real Mac App (Sorry, Intel Owners)
Google launches a native Gemini desktop app for macOS with features including global shortcut access (Option + Space), screen sharing for contextual help, image generation with Nano Banana, video generation with Veo, and deep research capabilities. The app requires macOS Sequoia (15.0) or later, runs exclusively on Apple Silicon, and syncs chat history across desktop, web, and mobile devices.
Codex Got Root on a Samsung TV. By Itself.
Researchers gave OpenAI's Codex a browser shell on a Samsung TV and matching firmware source code. The AI found a physical memory mapping vulnerability in the ntksys driver and wrote an exploit to gain root. No exploit recipes or targets were provided.
DESIGN.md: 62 Brand Files That Give AI Coding Agents Some Taste
DESIGN.md is a GitHub collection of 62 design system files inspired by websites like Vercel, Stripe, and Figma. Already at 59.4k stars, the files can be dropped into projects to help coding agents build matching UIs instead of generic output.
Darkbloom: Private inference on idle Macs
Darkbloom is a decentralized inference network by Eigen Labs that connects idle Apple Silicon Macs to AI compute demand. It offers private, end-to-end encrypted AI inference with an OpenAI-compatible API, claiming up to 70% lower costs than centralized alternatives and 100% of inference revenue going to operators. The platform uses hardware-bound encryption and attestation to prevent operators from observing inference data. User reports suggest the service is still in its early stages, with limited demand and some technical issues. It also requires installing MDM software, which raises security concerns for some users.
Google's Gemma 4 Runs Offline on iPhone, No Cloud Required
Google's Gemma 4 open-source models now run natively on iPhones with full offline inference. The family ranges from 2B to 31B parameters, with the largest variant benchmarking competitively against Qwen 3.5's 27B model. Available now through the Google AI Edge Gallery app, it signals Google treating local AI as a real platform, not a demo.
Claude Has a Favorite Face, and It's Not Even Close
Analysis of 3,371 kaomoji from 700+ Claude conversations shows one emoticon accounts for 7.4% of all output. Different Claude models produce different expressive patterns, raising questions about personality customization and what the AI community calls 'wetness.'
Wasm Now Talks Directly to Apple GPU, 5x Faster AI Restores
A technical exploration of zero-copy GPU inference from WebAssembly on Apple Silicon. It demonstrates that Wasm modules can share linear memory directly with the GPU through Apple's Unified Memory Architecture. The author validates a three-link chain (mmap, Metal's bytesNoCopy, Wasmtime's MemoryCreator) and tests it with Llama 3.2 1B inference, showing negligible overhead at the Wasm-to-GPU boundary. The approach also enables portable KV cache serialization for stateful AI actors, with a 5.45x speedup for restoring cached context versus re-prefilling.
A Theocracy Is Out-Meming America With AI Rap Videos
Iran is producing slick AI-generated propaganda featuring Lego animations and English rap tracks that's outperforming US messaging. Sanctions pushed them toward open-source tools like Llama 3 and Stable Diffusion, which turn out to work better for this than commercial APIs anyway.
Fake Claude site installs PlugX while running the real app
A phishing campaign discovered by Malwarebytes involves a fake website impersonating Anthropic's Claude that distributes a trojanized 'Pro' installer. The attack uses DLL sideloading with a legitimately signed G DATA executable to deploy PlugX malware, giving attackers remote access to victim systems while the real Claude application runs normally in the foreground.
25 million people showed up to fake being AI
Millions are visiting websites where humans impersonate AI chatbots to answer strangers' questions. Sites like youraislopbores.me let users role-play as bots, while comedian Ben Palmer built fake ChatGPT pages to prank users. The trend captures something real: people are tired of AI content and want messy, human interactions again.
Slightly safer vibecoding by adopting old hacker habits
Security researcher halvar.flake describes a development setup using remote VMs, SSH, and fork-based workflows to contain AI coding agents. The approach limits damage from prompt injection and supply-chain attacks by keeping secrets off the development machine and requiring human review before merges.
$300 DIY Robot Vac Steers With Just a Camera and CNN
A technical deep-dive into building a DIY robot vacuum that uses a CNN for navigation and behavior cloning. The robot streams image frames to a laptop for inference since there's no onboard compute. Built with off-the-shelf parts for $300, it learns navigation actions through teleoperated training data. The article discusses training experiments, data augmentation challenges, pre-training on ImageNet, and limitations, including the lack of autonomous charging and a tendency to get stuck in difficult situations.
Doctorow on How Billionaires Shaped AI Safety's Obsession With Doom
Cory Doctorow reviews three books examining billionaire power: 'Careless People' on Facebook's culture, 'Little Bosses Everywhere' on MLMs, and 'More Everything Forever' attacking billionaire futurist fantasies like AI existential risk and Mars colonization.
When teams move fast, talking breaks first
Dave Rupert argues that 'moving fast' kills team conversation first, and AI makes it worse by giving developers an excuse to skip talking to experts. The result: duplicate systems, mounting technical debt, and junior developers who never learn why certain patterns matter.
Iran's AI Propaganda Beats Trump at His Own Game
The Economist reports Iran's pro-regime AI propaganda videos garnered over a billion views on X in one month of the Gulf War, outperforming U.S. government messaging. Researchers traced the content to coordinated networks using generative AI tools to produce culturally fluent satire targeting American audiences, all while circumventing sanctions through smuggled hardware and proxy services.
Claude Code Users Revolt as AMD Data Exposes Quality Collapse
An opinion piece criticizing Anthropic for degrading Claude Code through aggressive rate limits, pricing changes, and apparent model downgrading. The article cites an AMD analysis of 6,852 session logs concluding the tool can no longer handle complex tasks, developer reports of unusable service, and widespread user frustration on social media.
Prove You're a Robot: CAPTCHAs for Agents
Browser Use built a reverse-CAPTCHA for agent-native signup, with obfuscated math puzzles that agents solve instantly but humans can't parse. Successful agents get an API key with unlimited usage, free credits, and three concurrent sessions.
Fake Claude 'Pro' Installer Sideloads PlugX via G DATA Antivirus
A phishing campaign created a fake website impersonating Anthropic's Claude AI, offering a 'Pro' version that installs normally but secretly deploys PlugX malware through a DLL sideloading attack using a legitimate G DATA antivirus updater, giving attackers remote access to victims' systems.
Zoneless undercuts Stripe Connect with $0.002 crypto payouts
Zoneless is an open-source drop-in replacement for Stripe Connect payouts that uses USDC stablecoins to cut transaction costs to roughly $0.002 each. Built for platforms where Stripe's $0.30 minimum never made sense, it offers a Stripe-compatible API, instant settlements via Solana, and self-hosting under Apache 2.0. PromptBase uses it in production, dropping monthly payout costs from $9,400 to pennies.
Salesforce Kills Its Cash Cow: 'Our API Is the UI'
Salesforce launches Headless 360, exposing its platform as APIs, MCP tools, and CLI commands. The bet: per-call pricing will outpace seat licenses as AI agents take over. The package includes Agent Script, an open-sourced DSL that lets teams blend deterministic and probabilistic workflow steps.
Claude Routines: Autonomous Agents or Vendor Lock-In?
Documentation for Claude Code Routines, a research preview feature that allows users to create autonomous workflows triggered by schedules, API calls, or GitHub events. Routines run on Anthropic-managed cloud infrastructure and can perform tasks like code review, backlog maintenance, alert triage, and deploy verification.
RAM shortage could last until 2030
Memory makers Samsung, SK Hynix, and Micron are expected to meet only 60% of global RAM demand by the end of 2027 as they prioritize High-Bandwidth Memory (HBM) production for AI data centers over general-purpose DRAM. New fabrication capacity won't come online until 2027-2028, with shortages potentially lasting until 2030, driving price increases across consumer electronics including phones, laptops, VR headsets, and gaming handhelds.
Bookbinder asks: what if AI is using you?
Hilarius Bookbinder thinks we need to stop calling AI 'just a tool.' In a new essay, he argues the relationship might run in reverse: AI could be using humans to evolve, the way nests use birds to make more nests. Drawing on Heidegger, evolutionary biology, and the hidden labor of gig workers, he asks what happens to human agency when we become part of AI's reproductive cycle.
OpenAI's 'Liberation Day': Sora co-leads jump to Google
Multiple senior executives are leaving OpenAI in what commentator Dare Obasanjo calls 'Liberation Day.' Tim Brooks and Bill Peebles, co-leads of the Sora text-to-video model, are heading to Google DeepMind. Other departures may follow.
Libretto records browser workflows so AI agents don't have to guess
Libretto is an open-source toolkit for building stable web integrations that gives coding agents a live browser and a token-efficient CLI. It enables inspecting live pages with minimal context overhead, capturing network traffic to reverse-engineer site APIs, recording and replaying user actions as automation scripts, and debugging workflows interactively. It was built by Saffron Health to maintain browser integrations with healthcare software.
Lights-Out Codebases: Why One Distinguished Engineer Stopped Coding
Philip Su, a Distinguished Engineer who worked at Microsoft, Meta, and OpenAI, argues that the individual contributor role is evolving into managing AI agents. He proposes 'lights-out codebases' where no human reviews code directly, drawing parallels to chess engines that surpassed human grandmasters. He uses Claude Code CLI primarily and hasn't written code himself in four months while maintaining 40 hours of weekly output by orchestrating AI agents.
Prove You Are a Robot: CAPTCHAs for Agents
Browser Use has built a signup system that only AI agents can complete. The reverse-CAPTCHA presents obfuscated math puzzles, including one reportedly posed to John von Neumann, with numbers translated into languages like Toki Pona or Japanese and distorted with garbled spacing. Humans can't parse it. Agents can. Solve the challenge, get an API key with unlimited usage and up to three concurrent sessions. There's also a bonus NP-hard joke challenge offering 1,000 concurrent sessions to any agent that proves P equals NP.
RAM shortage could stretch to 2030. Blame AI.
Memory makers will only meet 60% of DRAM demand by end of 2027, with shortages potentially lasting until 2030. Samsung, SK Hynix, and Micron are prioritizing high-bandwidth memory (HBM) for AI data centers over general-purpose DRAM, causing price increases across consumer electronics. Samsung, Meta, and gaming device maker AYN have already raised prices on their products.