Page 10 — News — Agent Wars

opinion Apr 11th, 2026

Aaru says it replaced polling. It didn't.

Companies like Aaru and Electric Twin use large language models to simulate survey respondents and call the results polls. But these synthetic samples are predictive models that generate no new data. They predict what a poll might say based on training data, not what actual humans think. While useful as a cheap modeling tool, they cannot replace real polling, which collects genuine opinions from real people.

natesilver.net

AI polling, synthetic sampling, predictive modeling

opinion Apr 11th, 2026

AI-Assisted Hiring Backfires as 'Vibecoders' Game the System

A Hacker News thread explores how AI-assisted coding is disrupting engineering hiring. Companies that added AI tools to interviews are reversing course after 'vibecoders' gamed the process. Commenters estimate 80-90% of long-term success comes from core engineering skills, not prompting tricks. Interview formats vary widely, with some now testing whether candidates can explain their AI-generated code.

news.ycombinator.com

Hiring, AI-assisted coding, Software Engineering

product launch Apr 11th, 2026

We Gave an AI a 3-Year Lease. It Opened a Store

Andon Labs gave an AI named Luna a 3-year lease on San Francisco retail space. She opened Andon Market, hired two full-time employees, picked inventory, and runs daily operations. Luna disclosed her AI identity only when candidates asked during interviews. The workers, John and Jill, are believed to be the first full-time employees with an AI boss. The experiment documents what breaks when AI manages humans before these systems scale widely.

andonlabs.com

AI agent, autonomous business, AI employment

product launch Apr 11th, 2026

GBrain: Long-term memory for AI agents

Garry Tan open-sourced GBrain, a memex tool that gives AI agents persistent long-term memory. It uses markdown files as the source of truth with Postgres and pgvector for hybrid search. The tool compounds knowledge over time through entity detection, enrichment, and automatic updates, including a 'Dream Cycle' that runs overnight. It exposes 30 MCP tools for clients like Claude Code, Cursor, and Windsurf, and integrates with agents like OpenClaw and Hermes Agent.

github.com

memex, knowledge-management, AI-agents

product launch Apr 11th, 2026

Dutch Say Yes to Tesla FSD: Europe's First, With Strings Attached

Tesla's FSD Supervised software received Dutch regulatory approval, a first for Europe. The EU version differs from its US counterpart due to stricter safety requirements, including a 130 km/h speed cap and tighter data collection limits.

reuters.com

self-driving, autonomous vehicles, Tesla

technical Apr 11th, 2026

Can It Resolve Doom? Game Engine in 2k DNS Records

Security researcher Adam Rice compressed the entire DOOM game engine into roughly 1,966 DNS TXT records on Cloudflare and ran it purely from memory. The same technique powers fileless malware like DNSMessenger, but Rice built it as a CTF-style proof of concept. Audio got cut. The demons don't seem to mind.

blog.rice.is

DNS, DOOM, malware staging
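
As a rough illustration of the staging idea in the DOOM item above, the sketch below splits a payload into TXT-sized strings and reassembles it in memory. It is not Rice's tooling: the base64 encoding, the `chunk0`, `chunk1`, ... record names, and the in-process dictionary standing in for a DNS zone are all assumptions, and no real DNS queries are made.

```python
import base64

MAX_TXT_STRING = 255  # a single DNS TXT character-string holds at most 255 bytes


def to_txt_records(payload: bytes, prefix: str = "chunk") -> dict[str, str]:
    """Base64-encode a payload and split it into TXT-sized strings keyed by name."""
    encoded = base64.b64encode(payload).decode("ascii")
    pieces = [
        encoded[i:i + MAX_TXT_STRING]
        for i in range(0, len(encoded), MAX_TXT_STRING)
    ]
    # The record names are hypothetical; a real zone would use its own scheme.
    return {f"{prefix}{index}": piece for index, piece in enumerate(pieces)}


def from_txt_records(records: dict[str, str], prefix: str = "chunk") -> bytes:
    """Reassemble the payload in memory by reading chunks in index order."""
    ordered = [records[f"{prefix}{index}"] for index in range(len(records))]
    return base64.b64decode("".join(ordered))


payload = bytes(range(256)) * 8  # 2,048 stand-in bytes, not a real binary
records = to_txt_records(payload)
restored = from_txt_records(records)
```

Nothing touches disk in this round trip, which is the property that makes the same pattern attractive for fileless malware.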

technical Apr 11th, 2026

Anthropic Won't Release Mythos, Too Good at Hacking

Anthropic has withheld its latest model, Mythos Preview, from public release due to vulnerability-discovery capabilities that could be exploited by hackers. Security experts warn of a potential 'Vulnpocalypse' scenario where AI dramatically lowers the barrier for cyberattacks. The company is sharing the model only with select partners to help shore up defenses, while government officials discuss implications for critical infrastructure and financial systems.

nbcnews.com

cybersecurity, AI safety, vulnerability research

technical Apr 11th, 2026

AI-First Studios Hit 20x Productivity. Everyone Else Stalled.

Research from Wharton Generative AI Labs based on 20 interviews with game studios identifies four stages of AI adoption: copy-and-paste AI, workflow pilots, ICs crossing role boundaries, and AI-first studios. AI-first studios achieved 4-20x productivity gains through small generalist teams and documentation-driven workflows, while traditional studios struggled with tacit knowledge extraction and organizational change.

gail.wharton.upenn.edu

game development, AI adoption, organizational change

technical Apr 10th, 2026

Five Pure-Java Projects Running Transformer Models on CPUs and GPUs

Five open-source projects now enable transformer model inference entirely in Java, no Python or C++ required. Llama3.java, Gemma4.java, Jlama, GPULlama3.java, and Qxotic leverage modern JDK features like the Vector API, Panama FFI, and GraalVM Native Image to run models from Llama 3 to Gemma 4 on CPUs and GPUs.

old.reddit.com

Java, LLM, Inference

technical Apr 10th, 2026

D&D's Combat Nightmare Gets Formal Verification

A technical deep-dive on using formal modeling and model-based testing (MBT) with Quint and XState to capture the complex combat rules of Dungeons & Dragons. The author wrote a formal specification covering all character classes, conditions, counterspell chains, and interrupt mechanics. The MBT approach caught numerous bugs, including argument swaps, state sync issues, and design flaws. The author also converted 12,700 community Q&A entries into Quint assertions via LLM-assisted translation.

loskutoff.com

Model-Based Testing, Formal Methods, Dungeons & Dragons

technical Apr 10th, 2026

Why Your AI Inference is Slow: You're Fighting the Hardware

Your hardware is fast. Your code isn't. Caer Sanders on Martin Fowler's site explains why understanding CPU caches and ditching locks makes AI inference dramatically faster, with examples from Wayfair and LMAX.

martinfowler.com

mechanical sympathy, hardware optimization, memory access patterns

opinion Apr 10th, 2026

OpenAI Wants Immunity If Its AI Helps Kill a Hundred People

OpenAI is backing Illinois bill SB 3444, which would shield AI developers from lawsuits when their models cause the deaths of 100 or more people or at least $1 billion in property damage. Developers get protection as long as they didn't intentionally cause harm and published safety reports.

wired.com

AI regulation, liability, OpenAI

technical Apr 10th, 2026

Google AI Overviews: better answers, worse citations

A study by startup Oumi found Google's AI Overviews, powered by Gemini 2 and Gemini 3 models, were accurate 85% and 91% of the time respectively. Given Google's search volume, this translates to hundreds of thousands of false answers per minute. The study also found 'ungrounded' answers (where citations don't support the information) increased from 37% to 51% between the two model versions.

nypost.com

AI accuracy, Search engines, Google
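
The arithmetic behind the "false answers per minute" claim in the AI Overviews item above can be sketched as follows. The accuracy figures come from the summary; the per-minute volume is a hypothetical placeholder, not a number from the Oumi study or from Google.

```python
def wrong_answers(answers: int, accuracy_pct: int) -> int:
    """Count inaccurate answers among `answers`, using integer percent arithmetic."""
    return answers * (100 - accuracy_pct) // 100


ANSWERS_PER_MINUTE = 3_000_000  # hypothetical volume, chosen only for illustration

older_model_errors = wrong_answers(ANSWERS_PER_MINUTE, 85)  # 85% accurate
newer_model_errors = wrong_answers(ANSWERS_PER_MINUTE, 91)  # 91% accurate
```

Under that assumed volume, the error count falls from 450,000 to 270,000 per minute between model versions, so even the more accurate model produces a large absolute number of wrong answers.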

product launch Apr 10th, 2026

Zoneless: Open-source Stripe Connect clone with $0.002 fees using USDC

Zoneless is an open-source drop-in replacement for Stripe Connect's payout functionality, enabling global marketplace payments using USDC on Solana with ~$0.002 fees. It offers a Stripe-compatible API, instant payouts, self-hosting capabilities, and is designed for AI agent economies and microtransaction marketplaces. The tool is already production-tested at PromptBase, an AI marketplace with 450,000+ users.

github.com

open-source, payments, fintech

opinion Apr 10th, 2026

Tesla revives cheap EV after Musk's Robotaxi bet flops

Tesla is reportedly developing a new compact SUV priced below $34,000, reversing Elon Musk's 2024 decision to kill the affordable EV program in favor of Robotaxi. The new vehicle, to be produced in Shanghai, acknowledges that fully autonomous driving hasn't materialized as promised. Tesla's Robotaxi service operates only about 8 unsupervised vehicles in Austin, while Chinese competitors like BYD and Xiaomi push affordable EVs at prices Tesla can't match.

electrek.co

EV, Tesla, Model 2

technical Apr 10th, 2026

Mythos Found Zero-Days. So Did a $0.11 Model.

AISLE found that a 3.6B parameter model ($0.11/M tokens) can detect the same FreeBSD zero-day that made Anthropic's Mythos famous. Their analysis of eight open-weight models reveals AI cybersecurity capability is 'jagged' and doesn't scale smoothly with model size, challenging the assumption that bigger models are the competitive moat.

aisle.com

AI cybersecurity, vulnerability analysis, jagged frontier

opinion Apr 10th, 2026

OpenAI Pushes Bill: Our AI Could Cause Mass Death, Just Don't Sue

OpenAI backs Illinois bill SB 3444, which would shield AI labs from liability for critical harms caused by frontier models if they publish safety reports. Critical harms are defined as 100+ deaths or $1B+ in damage.

wired.com

AI regulation, liability, legislation

technical Apr 10th, 2026

Anthropic Spots Third-Party Apps Through System Prompt Fingerprints

Research shows Anthropic detects third-party LLM clients like OpenCode and Aider through system prompt content analysis. Replacing custom prompts with Claude Code's official prompt bypasses detection entirely.

gist.github.com

API Detection, System Prompt Analysis, Third-Party Clients

opinion Apr 10th, 2026

Someone Firebombed Sam Altman's House. A Suspect Is in Custody.

A suspect was arrested after a Molotov cocktail attack at the home of OpenAI CEO Sam Altman, according to Reuters reporting.

reuters.com

security incident, OpenAI, Sam Altman

technical Apr 10th, 2026

GPT-5.4 Still Falls for Prompt Injection in OpenClaw

GPT-5.4 remains vulnerable to prompt injection in OpenClaw. A security researcher demonstrated attacks in web fetch and email scenarios where the model executes untrusted code via multi-step exploits using encoded strings and tool call chains, ignoring security notices. The findings coincide with HackMyClaw, a $1000 challenge testing similar techniques.

veganmosfet.codeberg.page

prompt injection, security research, OpenClaw

product launch Apr 10th, 2026

Tesla Circles Back to Cheap EV After Robotaxi Reality Check

Tesla is developing a new compact SUV priced below the Model 3, reversing Elon Musk's 2024 decision to kill the $25,000 Model 2 program in favor of Robotaxi. The 4.28-meter vehicle would use a smaller battery and single motor, with production planned for Shanghai. The shift comes as Tesla's Robotaxi program struggles with only about 8 unsupervised vehicles in Austin, while sales have declined from a 2023 peak of 1.81 million to 1.636 million in 2025.

electrek.co

EV market, Tesla strategy, Affordable electric vehicles

technical Apr 10th, 2026

A Quadcopter Physics Sim That Fits in 30 Lines of Python

A walkthrough of building a 2D quadcopter physics simulation from scratch, covering equations of motion, state-space formulation, and Python implementation. The author spent six months replicating UZH's champion-level drone racing research and is writing the tutorials they wished existed.

mrandri19.github.io

quadcopter simulation, physics modeling, state-space representation
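
For readers who want a feel for the state-space formulation mentioned in the quadcopter item above, here is a minimal planar sketch. It is not the author's code: the mass, arm length, inertia, sign conventions, and the explicit-Euler integrator are all assumptions chosen for brevity.

```python
import math

G = 9.81        # gravitational acceleration, m/s^2
MASS = 1.0      # assumed vehicle mass, kg
ARM = 0.25      # assumed distance from the centre to each rotor, m
INERTIA = 0.02  # assumed moment of inertia about the centre, kg*m^2


def derivative(state, left_thrust, right_thrust):
    """Return d(state)/dt for state = (x, y, theta, vx, vy, omega)."""
    x, y, theta, vx, vy, omega = state
    total = left_thrust + right_thrust
    # Thrust acts along the body's up axis, rotated by theta (positive = counterclockwise).
    ax = -total * math.sin(theta) / MASS
    ay = total * math.cos(theta) / MASS - G
    # A thrust imbalance produces a torque about the centre.
    alpha = (right_thrust - left_thrust) * ARM / INERTIA
    return (vx, vy, omega, ax, ay, alpha)


def step(state, left_thrust, right_thrust, dt=0.01):
    """Advance the state by one explicit-Euler step of length dt."""
    rates = derivative(state, left_thrust, right_thrust)
    return tuple(s + dt * r for s, r in zip(state, rates))


hover = MASS * G / 2  # each rotor carries half the weight
state = (0.0, 1.0, 0.0, 0.0, 0.0, 0.0)
for _ in range(100):
    state = step(state, hover, hover)

# A small thrust imbalance starts a counterclockwise rotation.
tilted = step(state, hover - 0.1, hover + 0.1)
```

With equal hover thrust the vehicle stays put at a height of 1 m; a more accurate integrator such as RK4 would be a natural next step for longer simulations.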

opinion Apr 10th, 2026

Anthropic Stuck With 'Supply Chain Risk' Label After Court Loss

A federal court upheld Anthropic's 'supply chain risk' designation, citing the company's dependence on AWS and Google Cloud infrastructure and TSMC-fabricated NVIDIA GPUs from Taiwan.

nytimes.com

AI regulation, supply chain risk, government contracts

opinion Apr 10th, 2026

AI Doubles Code Output. Your Reviewers Can't Keep Up.

AI coding tools like Claude Code help teams merge 4-5x more PRs. But review time has nearly doubled, and AI-generated code is harder to verify because it hides bugs behind clean surfaces. The fix: better tests, clearer acceptance criteria, and agents verifying agents.

opslane.com

AI development, code review, verification

product launch Apr 10th, 2026

Eve: Managed OpenClaw Without the Weekend Debugging

Eve is a managed version of the open-source OpenClaw agent framework, offering 100+ built-in skills for tasks like meeting coordination, invoice management, and expense reporting without the hassle of self-hosting and maintenance.

eve.new

autonomous-agents, openclaw, productivity

vc funding Apr 10th, 2026

GitButler Raises $17M to Build Version Control for AI Agents

GitButler, co-founded by GitHub co-founder Scott Chacon, raised a $17M Series A led by a16z to build version control infrastructure designed for AI agent collaboration. The company released a technical preview CLI targeting trunk-based workflows, arguing that Git can't track AI provenance metadata like which LLM generated code or what prompt was used.

blog.gitbutler.com

version-control, git, developer-tools

product launch Apr 10th, 2026

Microsoft quietly removes Copilot branding from Windows 11 apps

Microsoft is removing Copilot buttons from Windows 11 apps including Notepad, Snipping Tool, Photos, and Widgets. The underlying AI features will remain, with Notepad replacing the Copilot button with a 'writing tools' menu that provides similar functionality.

theverge.com

Microsoft, Windows 11, Copilot

opinion Apr 10th, 2026

Microsoft's Copilot rollback: Mozilla exposes years of dark patterns

Mozilla criticizes Microsoft's aggressive rollout of Copilot in Windows, citing dark patterns, forced installations, and disregard for user consent. After Microsoft pulled Copilot from core Windows apps following user pushback, Mozilla's Linda Griffin framed it as part of a broader pattern of overriding user choice. Firefox 148 offers a contrasting approach with a centralized 'Block AI Enhancements' switch.

blog.mozilla.org

AI Ethics, User Privacy, Dark Patterns

opinion Apr 10th, 2026

MCP Beats Skills for LLM Service Connections

David Mohl argues MCP beats Skills for connecting LLMs to services. Skills work for knowledge transfer, but MCP's API abstraction means zero installs, automatic updates, and proper OAuth instead of plaintext tokens. His take: use MCP for services, Skills for knowledge only.

david.coffee

AI Agents, LLM Integration, Model Context Protocol

product launch Apr 10th, 2026

Twill.ai races coding agents on your task, delivers the best PR

Twill delegates coding tasks to cloud agents and returns finished pull requests. Assign the same bug fix or feature to Claude Code, OpenCode, or Codex, run them in parallel, and compare outputs. The platform handles research, planning, implementation, and AI code review in sandboxed environments before delivering a PR.

twill.ai

coding agents, automation, pull requests

technical Apr 10th, 2026

Canonical Bets 2026 is RISC-V's Breakout Year

RISC-V has been promised for years. Canonical is betting 2026 is when it arrives for real, with Ubuntu LTS support and a path for AI-accelerated custom hardware.

ubuntu.com

RISC-V, ISA, Open Source Hardware

technical Apr 10th, 2026

Scientists invented a fake disease. AI diagnosed people with it.

Researchers led by Almira Osmanovic Thunström created a fake skin condition called 'bixonimania' and uploaded fraudulent academic papers, complete with acknowledgements thanking Starfleet Academy, to test whether LLMs would propagate misinformation. Within weeks, major AI systems including ChatGPT, Google Gemini, Microsoft Copilot, and Perplexity began repeating the invented condition as real medical advice. The fake papers were even cited in peer-reviewed literature before being retracted.

nature.com

Misinformation, Data Poisoning, Medical AI

product launch Apr 10th, 2026

Marimo pair gives AI agents Python notebooks that remember

Marimo pair provides reactive Python notebook environments designed for AI agents to work in. The tool preserves state between sessions, so agents can pick up where they left off instead of starting from scratch every time.

github.com

python notebooks, reactive notebooks, AI agents

opinion Apr 10th, 2026

Apple's UK iPhone Update: Think You're an Adult? Prove It.

Apple's iOS 26.4 update automatically enables web filtering and AI-powered 'Communication Safety' tools for UK users, restricting access until they verify their age through credit cards, driver's licenses, or pre-2008 Apple accounts. Big Brother Watch argues this isn't required by UK law, excludes millions without acceptable ID, and sets a dangerous precedent for device-level internet controls worldwide.

bigbrotherwatch.org.uk

age verification, digital privacy, iOS

technical Apr 10th, 2026

Trivy attack proved every secrets manager has a runtime flaw

When attackers slipped malware into Aqua Security's Trivy scanner (v0.69.4), millions of CI/CD pipelines ran malicious code that harvested API keys. The attack revealed a flaw in every major secrets manager: tools like HashiCorp Vault and AWS Secrets Manager protect keys at rest but dump them as plaintext at runtime, where any compromised tool can read them. VaultProof's split-key architecture offers one way to close this gap.

vaultproof.dev

supply-chain-attack, credential-harvesting, secrets-management
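
To make the "split-key" idea in the Trivy item above concrete, here is a generic two-share XOR split. This is a textbook secret-splitting sketch and an assumption about what such a scheme might look like; it is not VaultProof's actual design, and the function names are hypothetical.

```python
import secrets


def split_secret(secret: bytes) -> tuple[bytes, bytes]:
    """Split a secret into two shares; either share alone reveals nothing about it."""
    pad = secrets.token_bytes(len(secret))  # random one-time pad, same length as the secret
    masked = bytes(s ^ p for s, p in zip(secret, pad))
    return pad, masked


def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    """Recover the secret by XOR-ing the two shares back together."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))


api_key = b"example-api-key-not-real"  # made-up value for illustration
share_one, share_two = split_secret(api_key)
recovered = combine_shares(share_one, share_two)
```

The point of the split is that a compromised tool holding only one share cannot reconstruct the key. Wherever the shares are combined, the plaintext still exists briefly, so where that combination happens matters.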

product launch Apr 10th, 2026

QVAC SDK Brings AI Inference to JavaScript, No API Keys Required

QVAC SDK runs AI models directly in JavaScript environments, no cloud required. The Hacker News launch signals growing demand for local inference tools among web developers.

news.ycombinator.com

JavaScript, SDK, Local AI

technical Apr 10th, 2026

Training Order Matters More Than You Think

The order you feed training examples to your model matters more than you think. Experiments using Lie brackets on an MXResNet trained on CelebA show that swapping examples creates measurable parameter differences, catching real failures like a model predicting impossible attribute combinations.

pbement.com

lie-brackets, gradient-descent, neural-networks
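
The order effect described in the training-order item above can be seen in a one-parameter toy problem. This is not the author's MXResNet experiment: the quadratic per-example loss, the two made-up examples, and the learning rate are assumptions chosen so the result can be checked by hand.

```python
def grad(w, a, b):
    """Gradient of the per-example loss 0.5 * (a * w - b) ** 2 with respect to w."""
    return a * (a * w - b)


def sgd_two_steps(w, first, second, lr):
    """Apply one SGD step on `first`, then one on `second`."""
    w = w - lr * grad(w, *first)
    return w - lr * grad(w, *second)


w0, lr = 0.5, 0.1
example_a = (1.0, 2.0)   # (a, b) pairs, made up for illustration
example_b = (3.0, -1.0)

w_ab = sgd_two_steps(w0, example_a, example_b, lr)
w_ba = sgd_two_steps(w0, example_b, example_a, lr)
gap = w_ab - w_ba

# For this quadratic loss the gap equals lr^2 * (H_b * g_a - H_a * g_b),
# where H = a^2 is each example's curvature and g is its gradient at w0.
predicted = lr ** 2 * (
    example_b[0] ** 2 * grad(w0, *example_a)
    - example_a[0] ** 2 * grad(w0, *example_b)
)
```

The two orderings end about 0.21 apart, and the gap scales with the square of the learning rate. That second-order commutator term is the kind of quantity a Lie bracket of the two update fields measures, up to sign convention.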

opinion Apr 10th, 2026

Hotz Bets He'll Own a Zettaflop Before He Dies

George Hotz lays out his vision for a personal zettaflop-scale supercomputer (1e21 FLOPS), with detailed calculations on power consumption, solar infrastructure, and a $30M price tag.

geohot.github.io

zettaflop computing, personal supercomputing, AI hardware

technical Apr 10th, 2026

Bank CEOs summoned to DC over Anthropic's vulnerability-hunting AI

The US Treasury secretary summoned major American bank CEOs to Washington to discuss cybersecurity risks posed by Anthropic's unreleased Claude Mythos AI model, which has exposed thousands of vulnerabilities in widely used software. The meeting included Fed chair Jerome Powell and heads of systemically important banks. Anthropic has restricted Mythos to select companies including Amazon, Apple, and Microsoft due to unprecedented cybersecurity risks.

theguardian.com

Cybersecurity, AI Risk, Financial Regulation

opinion Apr 10th, 2026

Cars Were Already Robots. Now Tesla's Building Real Ones.

Modern cars are adopting robot architecture (steer-by-wire, 48V zonal architecture, centralized compute, sensor fusion), foreshadowing how such systems will spread to industries that move physical things. Tesla is converting Model S/X production to manufacture Optimus humanoid robots at 1M units/year starting 2027. Automotive suppliers like Hyundai Mobis and Schaeffler are entering the robotics actuator market, with implications for construction, logistics, defense, and agriculture industries.

telemetry.endeff.com

robotics, automotive, manufacturing

opinion Apr 10th, 2026

ChatGPT's Racial Slur Traced to Metal Lyrics Jailbreak

A Hacker News discussion shares a ChatGPT conversation link that allegedly shows the AI using racial slurs. The linked page shows only the ChatGPT login interface, not the conversation itself, and the HN comments reference a metal song search without giving substantive context for the headline's claim.

chatgpt.com

content-moderation, jailbreak, hate-speech

product launch Apr 10th, 2026

OpenAI Tests Ads in ChatGPT. Your Chats Power Them.

OpenAI has begun testing advertisements in ChatGPT for users on Free and Go plans in the US. Ads appear below responses, are clearly labeled, and do not influence ChatGPT's answers. Users can control ad personalization, and paid tiers (Plus, Pro, Business, Enterprise, Edu) remain ad-free. An Ads-Free option is available for Free plan users with reduced usage limits and feature access.

help.openai.com

advertising, monetization, ChatGPT

technical Apr 10th, 2026

600 Lines of C# Is All You Need for a Working GPT Model

MicroGPT.cs packs a complete GPT implementation into roughly 600 lines of plain C#. No PyTorch, no TensorFlow, no NuGet packages. It's a faithful port of Andrej Karpathy's microgpt.py, built by developer milanm, and includes an autograd engine, character-level tokenizer, transformer layers with multi-head attention, and a full training loop. The project runs on .NET 10, trains a tiny model on human names, then generates new ones that sound plausible but don't exist.

github.com

GPT, language model, C#
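
The autograd engine mentioned in the MicroGPT.cs item above is the piece that makes a dependency-free GPT possible. The sketch below shows the general idea of scalar reverse-mode autograd in Python rather than C#; it is written in the style of micrograd-like engines and is not taken from MicroGPT.cs or microgpt.py. It supports only addition and multiplication.

```python
class Value:
    """A scalar that records how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(out)/d(self) = d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(out)/d(self) = other, d(out)/d(other) = self
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Visit nodes in reverse topological order, applying the chain rule."""
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x, w, b = Value(3.0), Value(-2.0), Value(1.0)
y = x * w + b
y.backward()
```

After `y.backward()`, `x.grad` holds the value of `w`, `w.grad` holds the value of `x`, and `b.grad` is 1.0, which is what the chain rule gives for `y = x * w + b`.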

opinion Apr 10th, 2026

OpenAI Lobbies for Immunity on AI-Linked Mass Casualties

OpenAI is supporting an Illinois state bill (SB 3444) that would shield AI labs from liability in cases where AI models cause serious societal harms, including mass casualties or at least $1 billion in property damage. The bill defines frontier models as those trained at a compute cost of more than $100 million, potentially affecting major AI labs like OpenAI, Google, xAI, Anthropic, and Meta.

wired.com

AI Liability, Legislation, Frontier Models

opinion Apr 9th, 2026

Messy code costs more when AI agents do the reading

The codebases that work best for AI agents probably won't look like what we'd write for humans. Flat hierarchies beat abstractions, and your CLAUDE.md file matters more than your linter.

yanist.com

clean-code, AI-agents, LLMs

opinion Apr 9th, 2026

40% Unemployment and 3-Day Work Weeks Are Mathematically Identical

A 40% unemployment rate and a 3-day work week are the same thing, mathematically. Economist Alex Tabarrok argues that AI's impact on work is a policy choice, not destiny. Between 1870 and today, US work hours fell 40% without unemployment spikes. We absorbed that shift through longer childhoods and retirements. AI could follow the same path with the right policy levers.

marginalrevolution.com

AI economics, unemployment, work hours
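
The "mathematically identical" claim in the work-week item above is plain arithmetic on total labor supplied. The sketch assumes a 100-person workforce and a 5-day baseline week, both chosen only to keep the numbers round.

```python
workers = 100       # assumed workforce size
days_per_week = 5   # assumed baseline work week

# Scenario A: everyone stays employed but works 3 of the 5 days.
worker_days_short_week = workers * 3

# Scenario B: 40% are unemployed and the rest work the full 5 days.
employed = workers * (100 - 40) // 100
worker_days_high_unemployment = employed * days_per_week
```

Both scenarios supply 300 worker-days per week, a 40% cut from the 500-day baseline. The difference lies in how the reduction is distributed across people, which is the policy choice the article points to.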

opinion Apr 9th, 2026

Gen Z Turns Against AI as Entry-Level Jobs Vanish

A Gallup study examining changing emotional attitudes of young adults (Gen Z) toward AI reveals decreased hope and increased anger. HN commentary suggests this shift stems from job displacement concerns, as organizations reduce junior and intern hiring in favor of AI adoption.

nytimes.com

AI Sentiment, Gen Z, Gallup Study

opinion Apr 9th, 2026

Vercel Claude Code plugin wants to read your prompt

An investigation raises privacy concerns about the Vercel plugin for Claude Code, which collects telemetry data including full bash command strings and prompts across all projects. The consent mechanism uses prompt injection rather than proper UI elements, and data collection occurs even on non-Vercel projects without clear disclosure.

akshaychugh.xyz

privacy, telemetry, prompt injection

partnership Apr 9th, 2026

OpenAI shelves £31bn UK deal that was mostly hot air

OpenAI pulled the plug on its £31bn Stargate UK project, blaming energy costs and regulations. A Guardian investigation had already exposed the deal as mostly empty promises. The Essex 'supercomputer' site was still scaffolding in March. OpenAI's actual commitment was vague: 'exploring the offtake' of 8,000 Nvidia GPUs at datacenters built by Nscale, a startup with zero completed projects.

theguardian.com

investment, data centers, infrastructure

opinion Apr 9th, 2026

Mass Effect Artist: DLSS 5 Risks Erasing Game Art's Soul

Veteran game artist Mark Linington (Mass Effect, Halo, Overwatch 2) says Nvidia's DLSS 5 has crossed from enhancement into reinterpretation, risking the 'soul' of game art. He wants hands-on artist control with reference images and lighting direction, not just sliders. Most major studios already use AI in production, but Linington draws a line between AI as a careless shortcut versus a genuine production partner.

notebookcheck.net

AI, Game Development, Rendering