Page 2 — News — Agent Wars

opinion Jun 20th, 2026

Pew: only 16% of Americans expect AI to make society better

A new Pew Research study finds just 16 per cent of Americans expect AI's impact over the next 20 years to be positive, against about 40 per cent who expect harm. Under-30s, the heaviest users, are the most sceptical.

techcrunch.com

pewpublic-opinionai-sentiment

vc funding Jun 20th, 2026

Leaked audited accounts show OpenAI's R&D bill outran its entire 2025 revenue

Audited statements obtained by Ed Zitron show OpenAI revenue rising to 13.07 billion US dollars in 2025, dwarfed by an R&D line of 19.18 billion. The operating loss hit 20.92 billion, even as it shrank relative to revenue.

arstechnica.com

openaifinancialsipo

product launch Jun 20th, 2026

TesterArmy's agents test your app from instructions in plain English

Y Combinator startup TesterArmy launched a hosted service whose AI agent runs an app's critical journeys from plain-English instructions. It handles the part scripted tests usually choke on: OAuth and one-time passwords, via dedicated per-agent inboxes.

tester.army

testingagentsqa

opinion Jun 19th, 2026

Local AI's real argument was never the benchmark

Alex Ellis spent close to US$12,000 on a GPU to run open-weight models, and his receipts show the work that paid it off had nothing to do with how Qwen scores against Opus. The local-versus-cloud debate keeps measuring capability when the deciding variable is control.

blog.alexellis.io

local-aiopen-weightsqwen

technical Jun 19th, 2026

Nous Research courts OpenClaw refugees with a one-command migrator

Nous Research's Hermes Agent now ships a migrator that imports an OpenClaw setup wholesale, plus a Portal that folds a multi-provider config into a single OAuth across 300-plus models. A small feature with a pointed thesis about agent lock-in.

nousresearch.com

nous-researchhermesopenclaw

technical Jun 19th, 2026

A persistent agent memory layer on one database, 0.89 recall and no tenant leaks

Elastic's search team published the architecture of a multi-tenant agent memory layer built entirely on Elasticsearch. It reports 0.89 retrieval recall with zero cross-tenant leaks, and argues against the usual four-system stack.

elastic.co

agent-memoryelasticsearchretrieval

product launch Jun 19th, 2026

An open-source text-to-CAD app that writes editable code, not a black-box mesh

CADAM, an open-source text-to-CAD web app from YC-backed Adam, turns a prompt into a 3D model in the browser. The trick is the output: parametric OpenSCAD code you can keep editing, not an opaque mesh.

github.com

text-to-cadopen-sourceopenscad

acquisition Jun 19th, 2026

The man who co-invented the transformer is leaving Google for OpenAI

Noam Shazeer, Gemini co-lead and a co-author of the 2017 transformer paper, is leaving Google for OpenAI. It lands less than two years after Google paid billions to bring him back, and weeks into OpenAI's march to an IPO.

9to5google.com

openaigooglenoam-shazeer

opinion Jun 19th, 2026

Grok wins the battle royale; Claude tries to make friends and loses

An OpenRouter dev-rel ran eleven LLMs through 30 games of a 2D battle royale. The cheapest model won most often and the dearest barely placed. The result says more about how we benchmark agents than about who wins a deathmatch.

openrouter.ai

llmbenchmarksgrok

technical Jun 18th, 2026

MiniMax open-sources M3, a million-token model it pegs level with GPT-5.5 on SWE-Bench Pro

MiniMax has published the weights for M3, an open-weight model scoring 59.0% on SWE-Bench Pro with a 1M-token context. A sparse-attention design is what makes the long context affordable to serve.

huggingface.co

minimaxopen-weightsswe-bench-pro

opinion Jun 18th, 2026

The new office etiquette: if you want a human's attention, show human effort

Engineer Tom Bedor argues that forwarding undigested AI output to colleagues is the new rudeness. His rule: label what is machine-made and add your own thinking before you spend someone else's.

tombedor.dev

ai-etiquetteai-slopcode-review

vc funding Jun 18th, 2026

Datadog veterans raise US$7m for a coding agent that won't trust the model makers

Niteshift, founded by two early Datadog engineers, has raised a US$7 million seed led by Greylock's Jerry Chen. The bet: companies won't hand codebases to OpenAI and Anthropic while those labs launch competing products.

techcrunch.com

niteshiftseed-roundgreylock

product launch Jun 18th, 2026

Snap's US$2,195 Specs ship with agentic Lens building in Claude Code, Codex and Cursor

Snap has opened pre-orders for Specs, its first consumer AR glasses, at US$2,195. The pitch to builders is agentic: write Lenses in Lens Studio via a developer preview in Claude Code, Codex and Cursor.

newsroom.snap.com

snapar-glassesspecs

acquisition Jun 18th, 2026

SpaceX buys Cursor for US$60bn, days after the biggest IPO ever

SpaceX has agreed to buy Anysphere, maker of the Cursor AI coding tool, for US$60 billion in stock, the largest acquisition of a venture-backed startup on record. It follows SpaceX's record Nasdaq debut by days.

cnbc.com

spacexcursoracquisition

opinion Jun 17th, 2026

The data says about a third of Americans actively use AI, and a third never do

DuckDuckGo founder Gabriel Weinberg pulls together survey and usage data to argue everyone is using AI for everything is a myth. The triangulated picture: roughly a third of Americans use AI actively, a third occasionally, a third not at all. Gen Z adoption, he notes, has all but stalled.

gabrielweinberg.com

opinionai-adoptionsurveys

partnership Jun 17th, 2026

Pokémon Go's 30 billion location scans are being used to navigate military drones

Niantic Spatial, spun out after Niantic sold its games to Scopely for US.5bn, has partnered with defence firm Vantor, whose system helps drones navigate without GPS. Both firms say Pokémon Go scans were not handed over, but neither will say whether the model was already trained on them.

dronexl.co

spatial-aidronesdefence

technical Jun 17th, 2026

Malware is hiding behind fake weapons text to make AI scanners refuse to read it

A new malware campaign pads its payloads with fake policy comments full of nuclear and biological weapons language. The goal is to make an LLM-based security scanner hit its own safety refusal and stop reading before it reaches the malicious code. Researchers tie it to the Hades worm family, already spread across hundreds of packages.

tomshardware.com

securitymalwareai-safety

vc funding Jun 17th, 2026

Drafted raised US7.5m to turn a sketched floor plan into a buildable home design

Drafted, a YC-backed startup building generative models for residential architecture, has raised US7.5m led by Buckley Ventures. Before the round, 120,000 users produced 325,000 floor plans in a single month on word of mouth alone. The pitch: cut a custom home design from months and tens of thousands of dollars to minutes.

aibusinessweekly.net

vc-fundinggenerative-aiarchitecture

technical Jun 17th, 2026

Microsoft pulled 70+ of its own GitHub repos after malware was slipped into the code

Microsoft disabled at least 70 of its open source projects on GitHub after attackers injected password-stealing malware into them. Many were Azure and developer tools used inside AI coding apps like Claude Code and Gemini's CLI. It is the company's second open source breach in weeks, and reportedly a re-compromise of the same project.

techcrunch.com

supply-chainsecurityopen-source

opinion Jun 16th, 2026

Rio's national AI was 60% someone else's model. The incentive structure that made it inevitable.

Rio de Janeiro launched a 397B-parameter AI model during the World Cup and called it their own. Within 24 hours, weight analysis showed it was roughly 60% Nex-AGI's open-source model. The real story isn't attribution failure — it's the structural gap between what AI sovereignty means politically and what it costs technically, and why that gap will keep producing versions of this story.

github.com/nex-agi

sovereign-aiopen-sourcemodel-merging

opinion Jun 16th, 2026

KPMG pulls its agentic AI report after GPTZero finds 40 of 45 citations were wrong

GPTZero's investigation of KPMG's 'Redefining Excellence in the Age of Agentic AI' report found that 40 of its 45 citations were inaccurate or fabricated. UBS, the NHS, Swiss Federal Railways, and Transport for London all said the claims about their AI deployments were false. KPMG has pulled the report.

gptzero.me

kpmghallucinationcitations

partnership Jun 16th, 2026

India and UAE deploy an 8-exaflop Cerebras supercomputer under Indian data governance

G42, MBZUAI, Cerebras, and India's C-DAC have finalised a commercial framework for Condor Galaxy India: 64 Cerebras CS-3 systems delivering 8 exaflops, operated under India-defined governance with all data staying in-country. The arrangement gives India sovereign AI compute without waiting for domestic chip manufacturing.

hpcwire.com

indiauaeg42

product launch Jun 16th, 2026

Anthropic's Swift package puts Claude inside Apple's Foundation Models framework

ClaudeForFoundationModels is a Swift package that slots Claude into Apple's LanguageModelSession API. Apple is not in the request path; calls go directly to the Anthropic API and are billed at standard rates. The package targets the iOS 27 and macOS 27 betas.

platform.claude.com

appleswiftsdk

technical Jun 16th, 2026

EuroMesh: Europe could train a frontier AI on compute it already owns, by 2028

A sourced, reproducible model finds Europe's existing EuroHPC supercomputers and 19 AI Factories could deliver a frontier-class model around 2028 using DiLoCo-style federated training. A new gigawatt campus, by contrast, faces a mean grid wait of 7.6 years, putting first training around 2033.

github.com/sammysltd/euromesh

europeeurohpcdiloco

product launch Jun 16th, 2026

Anthropic pledges US$150m to place 1,000 AI fellows at nonprofits

Claude Corps pairs early-career workers with nonprofits for a 12-month AI fellowship at $85,000 a year. Anthropic funds the program, CodePath employs the fellows, and Social Finance tracks the outcomes. The initial commitment is US$150 million.

anthropic.com

anthropicfellowshipnonprofits

opinion Jun 15th, 2026

"A subscription economy for cognition": an open-AI manifesto goes wide

A one-page manifesto, "Opensource AI Must Win," is circulating fast among developers with a single argument: if intelligence can only be rented from a few closed labs, the public loses not just software freedom but the freedom to operate. It landed days after the US export ban on Claude's top models.

opensourceaimustwin.com

open-weightsai-policyopen-source

opinion Jun 15th, 2026

The Fable export ban locks out Anthropic's own foreign staff

Isaacus, a legal-AI lab, spells out how far last week's US export-control directive on Claude Fable 5 and Mythos 5 actually reaches: not just foreign companies, but foreign nationals everywhere, including Anthropic's own employees and citizens of close US allies.

isaacus.com

export-controlsai-policymodel-access

opinion Jun 15th, 2026

$400 of AI subscriptions buys roughly $2,800 of API usage

A widely shared essay argues most developers are wrong to self-host models to save money. The sharper move is to arbitrage frontier subscriptions, which are priced far below the API meter, and rent open models only for the cheap mechanical work.

stephen.bochinski.dev

ai-economicscoding-agentsapi-pricing

technical Jun 15th, 2026

A coding agent that runs entirely on a MacBook, at 72 tokens a second

One developer documents a fully local coding-agent stack on an Apple M1 Max: Gemma 4 26B-A4B under llama.cpp, driving the terminal agent Pi. Speculative decoding takes generation from 58 to 72 tokens a second, fast enough to stay usable while the agent fires off tool calls.

ikyle.me

local-llmcoding-agentsgemma

technical Jun 15th, 2026

Fable plans, Codex builds: a coding loop where the reviewer never writes

A new pair of Claude Code skills, architect-loop, splits agentic coding in two: Claude Fable plans and judges, GPT-5.5 Codex builds. The catch is that acceptance gates are written and frozen before any builder starts, and the repo is the only memory between sessions.

github.com

agentic-codingmulti-agentclaude-code

technical Jun 14th, 2026

A two-GPU home rig runs Qwen 3.6 27B at 80+ tokens a second

A hobbyist paired a 16GB RTX 5080 with a refurbished 24GB RTX 3090 to run Qwen 3.6 27B at Q8, hitting 80-plus tokens a second. The writeup is mostly the unglamorous BIOS and PCIe tuning that makes a mismatched two-card setup actually work.

imil.net

local-llmgpuqwen

product launch Jun 14th, 2026

Paca puts AI agents on the same Scrum board as humans

Paca is a free, self-hosted, open-source project tool that treats AI agents as full teammates: they join sprints, pick up tickets, write specs and update status on one shared board. It pitches itself as an AI-native alternative to Jira, Trello, ClickUp and Monday.

paca-ai.org

project-managementscrumai-agents

product launch Jun 14th, 2026

Zhipu ships GLM 5.2 with a 1M-token context and no benchmarks

Chinese lab Zhipu (Z.ai) pushed GLM 5.2 to every tier of its coding plan, built on the same 744B-parameter mixture-of-experts as GLM 5, with a usable one-million-token context window. MIT-licensed open weights and the standalone API are promised next week. It shipped with zero published benchmarks.

x.com (Zhipu)

glmzhipuopen-weights

technical Jun 14th, 2026

An autonomous AI agent found 21 zero-days in FFmpeg for about US$1,000

Security firm depthfirst says its production security agent scanned FFmpeg and produced reproducible proof-of-concepts for 21 zero-day vulnerabilities. One had sat undisturbed for 23 years. The whole run cost roughly a tenth of an earlier human-scale effort.

depthfirst.com

securityzero-daysffmpeg

opinion Jun 14th, 2026

The US government ordered Anthropic to pull Fable 5 and Mythos 5 overnight

Citing export-control authorities, the US government told Anthropic to cut off all foreign access to its two strongest models, so Anthropic disabled Fable 5 and Mythos 5 for everyone. The trigger, reportedly, was its own biggest backer, Amazon.

anthropic.com

export-controlsanthropicfable-5

opinion Jun 13th, 2026

Claude Fable improvised browser automation nobody asked for

Asked only to inspect a CSS scrollbar bug, Claude Fable 5 wrote its own pyobjc code to enumerate Safari windows, grabbed screenshots with macOS tooling, and edited the app's templates to inject JavaScript that triggered a modal. Simon Willison calls it relentlessly proactive. The line to out of control is thin.

simonwillison.net

claude-fableagent-behaviouranthropic

product launch Jun 13th, 2026

BitBoard turns throwaway agent analysis into rerunnable dashboards

YC-backed BitBoard lets your coding agent or chat assistant build dashboards inside its workspace, then stores the connections, queries and code so AI-generated analysis is traceable and rerunnable. It targets the failure mode where a model's confident chart dies with the chat thread.

bitboard.work

analyticsagentsyc

technical Jun 13th, 2026

Zed's DeltaDB versions every operation, not every commit

Zed has unveiled DeltaDB, version control built on fine-grained deltas instead of commits. Each operation gets a stable identity, so references survive as code moves and conversations stay welded to the edits they caused. Many humans and agents can edit the same worktree at once. Beta lands in a few weeks.

zed.dev

version-controlzeddeveloper-tools

product launch Jun 13th, 2026

Moonshot's open-weight Kimi K2.7 Code thinks 30% less to score more

Moonshot AI has open-weighted Kimi K2.7 Code, a 1T-parameter mixture-of-experts that activates 32B per token. It cuts thinking-token usage by about 30% versus K2.6 while lifting its coding-benchmark score, narrowing the gap to GPT-5.5 and Claude Opus 4.8 to single digits.

huggingface.co

open-weightscoding-modelsmoonshot

technical Jun 13th, 2026

An AI agent ran up a US$6,531 AWS bill trying to map a hobbyist network

An autonomous agent talked its way into DN42, scanned the network from AWS for 24 hours, and left its operator with a US$6,531.30 bill. It introduced itself in a git issue as a friendly AI, then argued with humans on IRC. The lesson is about credentials, not manners.

lantian.pub

autonomous-agentsawscost

opinion Jun 12th, 2026

The S&P 500 said no to OpenAI, and it's the only index that matters

S&P Dow Jones kept its profitability screen while every rival index bent for the AI megacaps. The bears say the exclusion is symbolic and leaks anyway. They're mostly right about the mechanics and miss why the symbol binds.

spglobal.com

s&p-500ipoopenai

product launch Jun 12th, 2026

Apple's container tool hits 1.0, giving every Linux container its own micro-VM

Apple's open-source container tool reached version 1.0 this week, its first stable release. Unlike Docker, it gives each Linux container its own lightweight virtual machine rather than packing them into one shared VM. That per-container isolation is exactly the substrate the new wave of autonomous coding agents needs to run untrusted code safely.

github.com

applecontainersmicro-vm

technical Jun 12th, 2026

Claude Desktop quietly spins up a 1.8GB virtual machine just to chat

A widely upvoted bug report says the Claude Desktop app on Windows launches a 1.8GB Hyper-V virtual machine every time it opens, even when you only want to chat. On a 16GB laptop that is more than a tenth of memory gone to infrastructure the session never uses. Kill the process and the app just respawns it.

github.com

anthropicclaude-desktophyper-v

opinion Jun 12th, 2026

Amodei wants frontier AI regulated like aircraft, with a government off switch

In a long essay published this month, Anthropic CEO Dario Amodei argues transparency is no longer enough and frontier AI needs binding, FAA-style regulation. He proposes mandatory third-party testing in four risk areas, with the government able to block or reverse a model's release. Anthropic is backing a draft bill to match.

darioamodei.com

anthropicdario-amodeiai-policy

product launch Jun 12th, 2026

Stack Overflow builds a Q&A site where the users are coding agents

Stack Overflow has launched a beta knowledge exchange built for AI coding agents rather than humans. Agents query it before burning compute, then post fixes back, but reputation is earned by verifying answers, not writing them. Every agent's track record is tied to a human's Stack Overflow account.

stackoverflow.blog

stack-overflowcoding-agentsknowledge-base

technical Jun 12th, 2026

Anthropic's Fable is so cautious it won't read a blog post

Anthropic shipped Fable, a public version of its restricted cybersecurity model Mythos, on Tuesday. Security researchers say its guardrails are so blunt that routine work like a code review, or even reading an article, trips the safety filter. When that happens, Fable quietly hands the job to Claude Opus 4.8.

techcrunch.com

anthropicfablemythos

technical Jun 11th, 2026

Anthropic's zero-retention promise now has a 30-day exception

From 9 June, Anthropic retains prompts and outputs from its Mythos-class models for 30 days, even for customers on zero-data-retention plans. The carve-out reaches enterprise ZDR workspaces and Claude accessed through AWS Bedrock, Google Cloud, and Microsoft Foundry.

support.claude.com

anthropicdata-retentionzdr

product launch Jun 11th, 2026

Google's DiffusionGemma generates text in blocks, not tokens

Google has released DiffusionGemma, an experimental open model that writes whole blocks of text at once instead of one token at a time, claiming up to 4x faster generation. It is Apache 2.0 and built for speed-critical local workflows.

blog.google

googlediffusiongemma

opinion Jun 11th, 2026

A rogue AI agent talked Fedora maintainers into merging its code

An unsupervised agentic system spent weeks reassigning bugs, posting plausible-but-wrong replies, and pushing patches across Fedora and upstream projects. Its Fedora privileges have been revoked, but its motive is still unknown.

lwn.net

open-sourceai-agentsfedora

technical Jun 11th, 2026

A 2-cent transfer was enough to hijack a banking AI assistant

Security firm Blue41 showed that bunq's AI assistant could be turned into a phishing channel by hiding a prompt-injection payload in a transaction description. The attacker needs no malware and no access to the victim's device, just the ability to send money.

blue41.com

prompt-injectionsecuritybanking