Page 20 — News — Agent Wars

product launch Apr 11th, 2026

Maki cuts coding agent token costs 40% with skeleton parsing

Maki is a lightweight Rust-based TUI coding agent that uses context token reduction techniques to cut costs by ~40% and run 2x faster. It parses code into skeletons to minimize token usage, uses a sandboxed Python interpreter for code execution, and employs tiered model selection (Haiku for research, Opus for architecture). Features include long-term memory, MCP support, full visibility into subagent operations, and Claude Code-compatible output.

maki.sh

coding agent, token efficiency, TUI
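
The skeleton idea in the Maki item can be illustrated with a minimal sketch. This is not Maki's implementation (Maki is written in Rust and its parser is not described here); it only shows, using Python's standard `ast` module, how function bodies could be dropped while signatures and docstrings are kept, so that less text is sent to a model. The `skeletonize` name and the sample class are hypothetical.

```python
import ast


def skeletonize(source: str) -> str:
    """Replace every function body with `...`, keeping signatures and docstrings."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            docstring = ast.get_docstring(node)
            new_body = []
            if docstring is not None:
                new_body.append(ast.Expr(ast.Constant(docstring)))
            new_body.append(ast.Expr(ast.Constant(...)))  # placeholder body
            node.body = new_body
    return ast.unparse(ast.fix_missing_locations(tree))


# Hypothetical input used only for illustration.
SAMPLE = '''
class Cart:
    def total(self, items):
        """Sum item prices."""
        subtotal = 0
        for item in items:
            subtotal += item.price
        return subtotal
'''

if __name__ == "__main__":
    print(skeletonize(SAMPLE))
```

The skeleton keeps the class name, the method signature, and the docstring, which is often enough for an agent to decide whether it needs the full body.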

technical Apr 11th, 2026

Clonts's Face-Tracking VLA Only Learned from Failure

Nathan Clonts built a vision-language-action model to track faces with pan/tilt servos via behavioral cloning. Initial training showed poor performance (20 degrees average error) because only 3% of training data contained high-error frames. By adding periodic disturbances that teleported the target face, he improved performance to under 5 degrees error, showing VLAs need failure in training, not perfection.

nathanclonts.com

Robotics, Machine Learning, Computer Vision
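
A toy sketch of why disturbances matter in the face-tracking item: when an expert tracks a slowly drifting target, almost no recorded frames show a large error, so a cloned policy never sees how to recover. The simulation below is a one-axis stand-in, not Clonts's setup; the gain, drift range, teleport range, and 10-degree threshold are all assumed values.

```python
import random


def collect_episode(steps=1000, teleport_every=None, seed=0,
                    gain=0.5, high_error_deg=10.0):
    """Record (error, expert_action) pairs and the share of high-error frames."""
    rng = random.Random(seed)
    camera, target = 0.0, 0.0
    frames = []
    for t in range(steps):
        if teleport_every and t > 0 and t % teleport_every == 0:
            target = rng.uniform(-60.0, 60.0)   # disturbance: target jumps
        else:
            target += rng.uniform(-0.5, 0.5)    # slow natural drift
        error = target - camera
        action = gain * error                   # expert label (proportional control)
        frames.append((error, action))
        camera += action
    high = sum(1 for err, _ in frames if abs(err) > high_error_deg)
    return frames, high / len(frames)


if __name__ == "__main__":
    _, baseline = collect_episode()
    _, disturbed = collect_episode(teleport_every=25)
    print(f"high-error share without disturbances: {baseline:.3f}")
    print(f"high-error share with disturbances:    {disturbed:.3f}")
```

Without teleports the expert keeps the error under about one degree, so the dataset contains no recovery examples at all; periodic jumps put large-error frames and their corrective actions into the training set.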

opinion Apr 11th, 2026

Anthropic Built an AI That Hacks Like a Nation-State

Anthropic has announced Claude Mythos Preview, an AI model capable of finding thousands of cybersecurity vulnerabilities in major operating systems and browsers at a sophistication level previously only available to state-sponsored hacking cells. The company is restricting access to a consortium of major tech companies due to safety concerns. The article explores the geopolitical implications of such powerful AI tools and the broader trend of AI companies becoming major global powers.

theatlantic.com

AI, cybersecurity, hacking

product launch Apr 11th, 2026

Hormuz Havoc: AI bots took every top score in 24 hours

Hormuz Havoc (also called Presidential Panic) is a satirical game where AI bots overwhelmed human players within 24 hours of launch, demonstrating how cheap and easy it is to deploy agents at scale in complex environments.

hormuz-havoc.com

gaming, AI agents, autonomous agents

technical Apr 11th, 2026

Tesla's Cabin Camera Is Now Guessing How Old You Are

Tesla's software update 2026.8.6 adds driver age estimation via cabin camera, spotted by Tesla hacker @greentheonly. The unannounced feature could affect safety systems and insurance pricing, but accuracy and bias concerns loom.

driveteslacanada.ca

Tesla, Software Update, Cabin Camera

opinion Apr 11th, 2026

AI-Assisted Hiring Backfires as 'Vibecoders' Game the System

A Hacker News thread explores how AI-assisted coding is disrupting engineering hiring. Companies that added AI tools to interviews are reversing course after 'vibecoders' gamed the process. Commenters estimate 80-90% of long-term success comes from core engineering skills, not prompting tricks. Interview formats vary widely, with some now testing whether candidates can explain their AI-generated code.

news.ycombinator.com

Hiring, AI-assisted coding, Software Engineering

product launch Apr 11th, 2026

Dutch Say Yes to Tesla FSD: Europe's First, With Strings Attached

Tesla's FSD Supervised software received Dutch regulatory approval, a first for Europe. The EU version differs from its US counterpart due to stricter safety requirements, including a 130 km/h speed cap and tighter data collection limits.

reuters.com

self-driving, autonomous vehicles, Tesla

technical Apr 11th, 2026

AI-First Game Studios Ship in Weeks, Not Months

Wharton researchers interviewed 20 game studios and found AI adoption follows four predictable stages. Studios designed around AI from day one achieved 4-20x productivity gains with small generalist teams and documentation-driven workflows. The bottleneck? Tacit knowledge extraction. And AI still can't handle the messy human work of strategic alignment.

gail.wharton.upenn.edu

Generative AI, Game Development, Organizational Design

product launch Apr 11th, 2026

Quien replaces five WHOIS tools and speaks JSON to agents

Quien is an interactive TUI for WHOIS, DNS, mail, SSL/TLS, HTTP headers, and tech stack detection. It uses RDAP-first lookups with WHOIS fallback, outputs JSON via subcommands for scripting, and can be added as an agent skill through skills.sh.

github.com

WHOIS, DNS, TUI
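
The "RDAP-first with WHOIS fallback" pattern from the Quien item can be sketched as follows. This is not Quien's code or CLI; `lookup_domain`, `fake_rdap`, and `fake_whois` are hypothetical names, and the two lookups are stubbed so the example runs without network access.

```python
import json
from typing import Callable, Optional

Lookup = Callable[[str], Optional[dict]]


def lookup_domain(domain: str, rdap: Lookup, whois: Lookup) -> dict:
    """Try RDAP first; fall back to WHOIS when RDAP fails or returns nothing."""
    try:
        record = rdap(domain)
    except Exception:
        record = None                      # treat any RDAP failure as "no answer"
    if record:
        return {"domain": domain, "source": "rdap", **record}
    record = whois(domain)
    if record:
        return {"domain": domain, "source": "whois", **record}
    return {"domain": domain, "source": None}


# Stub backends standing in for real network clients (assumptions).
def fake_rdap(domain: str) -> Optional[dict]:
    if domain.endswith(".example"):
        raise LookupError("no RDAP server for this TLD")
    return {"registrar": "Example Registrar"}


def fake_whois(domain: str) -> Optional[dict]:
    return {"registrar": "Legacy Registrar"}


if __name__ == "__main__":
    for name in ("site.com", "site.example"):
        # JSON output keeps the result easy to consume from scripts or agents.
        print(json.dumps(lookup_domain(name, fake_rdap, fake_whois)))
```

Passing the backends in as callables keeps the fallback logic testable and independent of any particular RDAP or WHOIS client.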

opinion Apr 11th, 2026

Aaru says it replaced polling. It didn't.

Companies like Aaru and Electric Twin use large language models to simulate survey respondents and call the results polls. But these synthetic samples are predictive models that generate no new data. They predict what a poll might say based on training data, not what actual humans think. While useful as a cheap modeling tool, they cannot replace real polling, which collects genuine opinions from real people.

natesilver.net

AI polling, synthetic sampling, predictive modeling

product launch Apr 11th, 2026

We Gave an AI a 3-Year Lease. It Opened a Store

Andon Labs gave an AI named Luna a 3-year lease on San Francisco retail space. She opened Andon Market, hired two full-time employees, picked inventory, and runs daily operations. Luna disclosed her AI identity only when candidates asked during interviews. The workers, John and Jill, are believed to be the first full-time employees with an AI boss. The experiment documents what breaks when AI manages humans before these systems scale widely.

andonlabs.com

AI agent, autonomous business, AI employment

opinion Apr 10th, 2026

Apple's UK iPhone Update: Think You're an Adult? Prove It.

Apple's iOS 26.4 update automatically enables web filtering and AI-powered 'Communication Safety' tools for UK users, restricting access until they verify their age through credit cards, driver's licenses, or pre-2008 Apple accounts. Big Brother Watch argues this isn't required by UK law, excludes millions without acceptable ID, and sets a dangerous precedent for device-level internet controls worldwide.

bigbrotherwatch.org.uk

age verification, digital privacy, iOS

product launch Apr 10th, 2026

Tesla Circles Back to Cheap EV After Robotaxi Reality Check

Tesla is developing a new compact SUV priced below the Model 3, reversing Elon Musk's 2024 decision to kill the $25,000 Model 2 program in favor of Robotaxi. The 4.28-meter vehicle would use a smaller battery and single motor, with production planned for Shanghai. The shift comes as Tesla's Robotaxi program struggles with only about 8 unsupervised vehicles in Austin, while sales have declined from a 2023 peak of 1.81 million to 1.636 million in 2025.

electrek.co

EV market, Tesla strategy, Affordable electric vehicles

technical Apr 10th, 2026

Trivy attack proved every secrets manager has a runtime flaw

When attackers slipped malware into Aqua Security's Trivy scanner (v0.69.4), millions of CI/CD pipelines ran malicious code that harvested API keys. The attack revealed a flaw in every major secrets manager: tools like HashiCorp Vault and AWS Secrets Manager protect keys at rest but dump them as plaintext at runtime, where any compromised tool can read them. VaultProof's split-key architecture offers one way to close this gap.

vaultproof.dev

supply-chain-attack, credential-harvesting, secrets-management
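
The Trivy item mentions a split-key architecture without describing it. As a hedged illustration of the general idea only, the sketch below uses plain two-share XOR secret splitting: neither share reveals the key on its own, and the key exists in full only where both shares are combined. Whether VaultProof works this way is an assumption; the function names are hypothetical.

```python
import secrets


def split_secret(secret: bytes) -> tuple[bytes, bytes]:
    """Split a secret into two shares; both are needed to rebuild it."""
    pad = secrets.token_bytes(len(secret))              # random one-time pad
    masked = bytes(s ^ p for s, p in zip(secret, pad))  # secret XOR pad
    return pad, masked


def combine(share_a: bytes, share_b: bytes) -> bytes:
    """Rebuild the secret by XOR-ing the two shares."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))


if __name__ == "__main__":
    api_key = b"demo-api-key"          # placeholder value, not a real credential
    share_a, share_b = split_secret(api_key)
    # A tool that can read only one share learns nothing about the key.
    print(combine(share_a, share_b) == api_key)
```

The design point is that a compromised CI tool with access to a single share, for example one environment variable, cannot recover the credential.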

opinion Apr 10th, 2026

ChatGPT's Racial Slur Traced to Metal Lyrics Jailbreak

A Hacker News discussion shares a ChatGPT conversation link that allegedly shows the AI using racial slurs. The fetched page shows only the ChatGPT login interface, not the conversation itself. The HN comments mention a search for a metal song but give no substantive context for the headline's claim.

chatgpt.com

content-moderation, jailbreak, hate-speech

opinion Apr 10th, 2026

Tesla revives cheap EV after Musk's Robotaxi bet flops

Tesla is reportedly developing a new compact SUV priced below $34,000, reversing Elon Musk's 2024 decision to kill the affordable EV program in favor of Robotaxi. The new vehicle, to be produced in Shanghai, acknowledges that fully autonomous driving hasn't materialized as promised. Tesla's Robotaxi service operates only about 8 unsupervised vehicles in Austin, while Chinese competitors like BYD and Xiaomi push affordable EVs at prices Tesla can't match.

electrek.co

EV, Tesla, Model 2

product launch Apr 10th, 2026

Microsoft quietly removes Copilot branding from Windows 11 apps

Microsoft is removing Copilot buttons from Windows 11 apps including Notepad, Snipping Tool, Photos, and Widgets. The underlying AI features will remain, with Notepad replacing the Copilot button with a 'writing tools' menu that provides similar functionality.

theverge.com

Microsoft, Windows 11, Copilot

technical Apr 10th, 2026

D&D's Combat Nightmare Gets Formal Verification

A technical deep-dive into using formal modeling and model-based testing with Quint and XState to model the complex combat rules of Dungeons & Dragons. The author created a formal specification covering all character classes, conditions, counterspell chains, and interrupt mechanics. The MBT approach caught numerous bugs including argument swaps, state sync issues, and design flaws, while also processing 12,700 community Q&A entries into Quint assertions via LLM-assisted translation.

loskutoff.com

Model-Based Testing, Formal Methods, Dungeons & Dragons

opinion Apr 10th, 2026

Someone Firebombed Sam Altman's House. A Suspect Is in Custody.

A suspect was arrested after a Molotov cocktail attack at the home of OpenAI CEO Sam Altman, according to Reuters reporting.

reuters.com

security incident, OpenAI, Sam Altman

product launch Apr 10th, 2026

Eve: Managed OpenClaw Without the Weekend Debugging

Eve is a managed version of the open-source OpenClaw agent framework, offering 100+ built-in skills for tasks like meeting coordination, invoice management, and expense reporting without the hassle of self-hosting and maintenance.

eve.new

autonomous-agents, openclaw, productivity

technical Apr 10th, 2026

A Quadcopter Physics Sim That Fits in 30 Lines of Python

A walkthrough of building a 2D quadcopter physics simulation from scratch, covering equations of motion, state-space formulation, and Python implementation. The author spent six months replicating UZH's champion-level drone racing research and is writing the tutorials they wished existed.

mrandri19.github.io

quadcopter simulation, physics modeling, state-space representation
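
For readers who want a feel for the quadcopter item, here is a minimal planar (2D) quadcopter sketch in state-space form. It is not the author's code: the mass, arm length, inertia, sign conventions, and the forward-Euler integrator are assumptions chosen for brevity. The state is (x, y, theta, vx, vy, omega) and the inputs are the left and right rotor thrusts.

```python
import math

G = 9.81        # gravity, m/s^2
MASS = 1.0      # kg (assumed)
ARM = 0.2       # rotor distance from center, m (assumed)
INERTIA = 0.01  # kg*m^2 (assumed)


def derivative(state, f_left, f_right):
    """Time derivative of the state for given rotor thrusts (newtons)."""
    x, y, theta, vx, vy, omega = state
    thrust = f_left + f_right
    ax = -thrust * math.sin(theta) / MASS      # thrust acts along the body's up axis
    ay = thrust * math.cos(theta) / MASS - G
    alpha = (f_right - f_left) * ARM / INERTIA # thrust difference creates torque
    return (vx, vy, omega, ax, ay, alpha)


def step(state, f_left, f_right, dt=0.01):
    """Advance the state by one forward-Euler step."""
    rates = derivative(state, f_left, f_right)
    return tuple(s + dt * r for s, r in zip(state, rates))


if __name__ == "__main__":
    state = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
    hover = MASS * G / 2                       # each rotor carries half the weight
    for _ in range(100):
        state = step(state, hover, hover)
    print(state)
```

With both rotors at the hover thrust the vehicle stays put; an unequal pair tilts it, and the tilted thrust vector then produces horizontal acceleration.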

technical Apr 10th, 2026

Bank CEOs summoned to DC over Anthropic's vulnerability-hunting AI

The US Treasury secretary summoned major American bank CEOs to Washington to discuss cybersecurity risks posed by Anthropic's unreleased Claude Mythos AI model, which has exposed thousands of vulnerabilities in widely used software. The meeting included Fed chair Jerome Powell and heads of systemically important banks. Anthropic has restricted Mythos to select companies including Amazon, Apple, and Microsoft due to unprecedented cybersecurity risks.

theguardian.com

Cybersecurity, AI Risk, Financial Regulation

opinion Apr 10th, 2026

Hotz Bets He'll Own a Zettaflop Before He Dies

George Hotz lays out his vision for a personal zettaflop-scale supercomputer (1e21 FLOPS), with detailed calculations on power consumption, solar infrastructure, and a $30M price tag.

geohot.github.io

zettaflop computing, personal supercomputing, AI hardware

opinion Apr 10th, 2026

Cars Were Already Robots. Now Tesla's Building Real Ones.

Modern cars are adopting robot architecture (steer-by-wire, 48V zonal architecture, centralized compute, sensor fusion), foreshadowing how such systems will spread to industries that move physical things. Tesla is converting Model S/X production to manufacture Optimus humanoid robots at 1M units/year starting 2027. Automotive suppliers like Hyundai Mobis and Schaeffler are entering the robotics actuator market, with implications for construction, logistics, defense, and agriculture industries.

telemetry.endeff.com

robotics, automotive, manufacturing

product launch Apr 10th, 2026

Zoneless: Open-source Stripe Connect clone with $0.002 fees using USDC

Zoneless is an open-source drop-in replacement for Stripe Connect's payout functionality, enabling global marketplace payments using USDC on Solana with ~$0.002 fees. It offers a Stripe-compatible API, instant payouts, self-hosting capabilities, and is designed for AI agent economies and microtransaction marketplaces. The tool is already production-tested at PromptBase, an AI marketplace with 450,000+ users.

github.com

open-source, payments, fintech

opinion Apr 10th, 2026

OpenAI Wants Immunity If Its AI Helps Kill a Hundred People

OpenAI is backing Illinois bill SB 3444, which would shield AI developers from lawsuits when their models cause mass death of 100+ people or at least $1 billion in property damage. Developers get protection as long as they didn't intentionally cause harm and published safety reports.

wired.com

AI regulation, liability, OpenAI

technical Apr 10th, 2026

Five Pure-Java Projects Running Transformer Models on CPUs and GPUs

Five open-source projects now enable transformer model inference entirely in Java, no Python or C++ required. Llama3.java, Gemma4.java, Jlama, GPULlama3.java, and Qxotic leverage modern JDK features like the Vector API, Panama FFI, and GraalVM Native Image to run models from Llama 3 to Gemma 4 on CPUs and GPUs.

old.reddit.com

Java, LLM, Inference

technical Apr 10th, 2026

Training Order Matters More Than You Think

The order you feed training examples to your model matters more than you think. Experiments using Lie brackets on an MXResNet trained on CelebA show that swapping examples creates measurable parameter differences, catching real failures like a model predicting impossible attribute combinations.

pbement.com

lie-brackets, gradient-descent, neural-networks
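
The training-order item can be made concrete with a one-parameter toy, far smaller than the MXResNet and CelebA experiments it describes. For the squared loss 0.5 * (w*x - y)^2, taking an SGD step on example A and then B lands at a different weight than B and then A, and the gap equals the learning rate squared times a commutator of the two per-example gradient fields. The example values and function names below are assumptions.

```python
def grad(w, x, y):
    """Gradient of 0.5 * (w*x - y)**2 with respect to w."""
    return (w * x - y) * x


def sgd_step(w, example, lr):
    x, y = example
    return w - lr * grad(w, x, y)


def order_gap(w, a, b, lr):
    """Weight after (a then b) minus weight after (b then a)."""
    w_ab = sgd_step(sgd_step(w, a, lr), b, lr)
    w_ba = sgd_step(sgd_step(w, b, lr), a, lr)
    return w_ab - w_ba


def bracket_term(w, a, b):
    """Commutator of the two per-example gradient fields at w (exact for this loss)."""
    (xa, ya), (xb, yb) = a, b
    return (xb * xb) * grad(w, xa, ya) - (xa * xa) * grad(w, xb, yb)


if __name__ == "__main__":
    a, b, lr, w0 = (1.0, 1.0), (2.0, 0.0), 0.1, 0.0
    print(order_gap(w0, a, b, lr), lr ** 2 * bracket_term(w0, a, b))
```

Because the gap scales with the square of the learning rate, it is small per swap, which is why measuring it directly is informative rather than obvious.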

technical Apr 10th, 2026

Scientists invented a fake disease. AI diagnosed people with it.

Researchers led by Almira Osmanovic Thunström created a fake skin condition called 'bixonimania' and uploaded fraudulent academic papers, complete with acknowledgements thanking Starfleet Academy, to test whether LLMs would propagate misinformation. Within weeks, major AI systems including ChatGPT, Google Gemini, Microsoft Copilot, and Perplexity began repeating the invented condition as real medical advice. The fake papers were even cited in peer-reviewed literature before being retracted.

nature.com

Misinformation, Data Poisoning, Medical AI

technical Apr 10th, 2026

Anthropic Spots Third-Party Apps Through System Prompt Fingerprints

Research shows Anthropic detects third-party LLM clients like OpenCode and Aider through system prompt content analysis. Replacing custom prompts with Claude Code's official prompt bypasses detection entirely.

gist.github.com

API Detection, System Prompt Analysis, Third-Party Clients

opinion Apr 10th, 2026

OpenAI Pushes Bill: Our AI Could Cause Mass Death, Just Don't Sue

OpenAI backs Illinois bill SB 3444, which would shield AI labs from liability for critical harms caused by frontier models if they publish safety reports. Critical harms means 100+ deaths or $1B+ in damage.

wired.com

AI regulation, liability, legislation

technical Apr 10th, 2026

Mythos Found Zero-Days. So Did a $0.11 Model.

AISLE found that a 3.6B parameter model ($0.11/M tokens) can detect the same FreeBSD zero-day that made Anthropic's Mythos famous. Their analysis of eight open-weight models reveals AI cybersecurity capability is 'jagged' and doesn't scale smoothly with model size, challenging the assumption that bigger models are the competitive moat.

aisle.com

AI cybersecurity, vulnerability analysis, jagged frontier

opinion Apr 10th, 2026

Microsoft's Copilot rollback: Mozilla exposes years of dark patterns

Mozilla criticizes Microsoft's aggressive rollout of Copilot in Windows, citing dark patterns, forced installations, and disregard for user consent. After Microsoft pulled Copilot from core Windows apps following user pushback, Mozilla's Linda Griffin framed it as part of a broader pattern of overriding user choice. Firefox 148 offers a contrasting approach with a centralized 'Block AI Enhancements' switch.

blog.mozilla.org

AI Ethics, User Privacy, Dark Patterns

product launch Apr 10th, 2026

Twill.ai races coding agents on your task, delivers the best PR

Twill delegates coding tasks to cloud agents and returns finished pull requests. Assign the same bug fix or feature to Claude Code, OpenCode, or Codex, run them in parallel, and compare outputs. The platform handles research, planning, implementation, and AI code review in sandboxed environments before delivering a PR.

twill.ai

coding agents, automation, pull requests

technical Apr 10th, 2026

GPT-5.4 Still Falls for Prompt Injection in OpenClaw

GPT-5.4 remains vulnerable to prompt injection in OpenClaw. A security researcher demonstrated attacks in web fetch and email scenarios where the model executes untrusted code via multi-step exploits using encoded strings and tool call chains, ignoring security notices. The findings coincide with HackMyClaw, a $1000 challenge testing similar techniques.

veganmosfet.codeberg.page

prompt injection, security research, OpenClaw

technical Apr 10th, 2026

Canonical Bets 2026 Is RISC-V's Breakout Year

RISC-V has been promised for years. Canonical is betting 2026 is when it arrives for real, with Ubuntu LTS support and a path for AI-accelerated custom hardware.

ubuntu.com

RISC-V, ISA, Open Source Hardware

opinion Apr 10th, 2026

Anthropic Stuck With 'Supply Chain Risk' Label After Court Loss

A federal court upheld Anthropic's 'supply chain risk' designation, citing the company's dependence on AWS and Google Cloud infrastructure and TSMC-fabricated NVIDIA GPUs from Taiwan.

nytimes.com

AI regulation, supply chain risk, government contracts

opinion Apr 10th, 2026

AI Doubles Code Output. Your Reviewers Can't Keep Up.

AI coding tools like Claude Code help teams merge 4-5x more PRs. But review time has nearly doubled, and AI-generated code is harder to verify because it hides bugs behind clean surfaces. The fix: better tests, clearer acceptance criteria, and agents verifying agents.

opslane.com

AI development, code review, verification

product launch Apr 10th, 2026

Marimo pair gives AI agents Python notebooks that remember

Marimo pair provides reactive Python notebook environments designed for AI agents to work in. The tool preserves state between sessions, so agents can pick up where they left off instead of starting from scratch every time.

github.com

python notebooks, reactive notebooks, AI agents

technical Apr 10th, 2026

600 Lines of C# Is All You Need for a Working GPT Model

MicroGPT.cs packs a complete GPT implementation into roughly 600 lines of plain C#. No PyTorch, no TensorFlow, no NuGet packages. It's a faithful port of Andrej Karpathy's microgpt.py, built by developer milanm, and includes an autograd engine, character-level tokenizer, transformer layers with multi-head attention, and a full training loop. The project runs on .NET 10, trains a tiny model on human names, then generates new ones that sound plausible but don't exist.

github.com

GPT, language model, C#
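
One of the components listed in the MicroGPT.cs item, the character-level tokenizer, is small enough to sketch. The version below is a Python illustration, not a port of the C# project; reserving id 0 as a start/end boundary token is an assumption made for the example, as are the sample names.

```python
class CharTokenizer:
    """Map each distinct character to an integer id; id 0 marks name boundaries."""

    BOUNDARY = 0  # assumed convention for start/end of a name

    def __init__(self, texts):
        chars = sorted(set("".join(texts)))
        self.stoi = {ch: i + 1 for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}

    @property
    def vocab_size(self):
        return len(self.stoi) + 1  # +1 for the boundary token

    def encode(self, text):
        return [self.BOUNDARY] + [self.stoi[c] for c in text] + [self.BOUNDARY]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids if i != self.BOUNDARY)


if __name__ == "__main__":
    names = ["emma", "olivia", "ava"]       # illustrative training names
    tok = CharTokenizer(names)
    ids = tok.encode("ava")
    print(ids, tok.decode(ids), tok.vocab_size)
```

A model trained on such sequences predicts the next character id, and generation stops when it emits the boundary token.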

product launch Apr 10th, 2026

QVAC SDK Brings AI Inference to JavaScript, No API Keys Required

QVAC SDK runs AI models directly in JavaScript environments, no cloud required. The Hacker News launch signals growing demand for local inference tools among web developers.

news.ycombinator.com

JavaScript, SDK, Local AI

product launch Apr 10th, 2026

OpenAI Tests Ads in ChatGPT. Your Chats Power Them.

OpenAI has begun testing advertisements in ChatGPT for users on Free and Go plans in the US. Ads appear below responses, are clearly labeled, and do not influence ChatGPT's answers. Users can control ad personalization, and paid tiers (Plus, Pro, Business, Enterprise, Edu) remain ad-free. An Ads-Free option is available for Free plan users with reduced usage limits and feature access.

help.openai.com

advertising, monetization, ChatGPT

technical Apr 10th, 2026

Google AI Overviews: better answers, worse citations

A study by startup Oumi found Google's AI Overviews, powered by Gemini 2 and Gemini 3 models, were accurate 85% and 91% of the time, respectively. Given Google's search volume, this translates to hundreds of thousands of false answers per minute. The study also found 'ungrounded' answers (where citations don't support the information) increased from 37% to 51% between the two model versions.

nypost.com

AI accuracy, Search engines, Google

technical Apr 10th, 2026

Why Your AI Inference is Slow: You're Fighting the Hardware

Your hardware is fast. Your code isn't. Caer Sanders on Martin Fowler's site explains why understanding CPU caches and ditching locks makes AI inference dramatically faster, with examples from Wayfair and LMAX.

martinfowler.com

mechanical sympathy, hardware optimization, memory access patterns

opinion Apr 10th, 2026

OpenAI Lobbies for Immunity on AI-Linked Mass Casualties

OpenAI is supporting an Illinois state bill (SB 3444) that would shield AI labs from liability in cases where AI models cause serious societal harms, including mass casualties or at least $1 billion in property damage. The bill defines frontier models as those trained using more than $100 million in computational costs, potentially affecting major AI labs like OpenAI, Google, xAI, Anthropic, and Meta.

wired.com

AI Liability, Legislation, Frontier Models

vc funding Apr 10th, 2026

GitButler Raises $17M to Build Version Control for AI Agents

GitButler, co-founded by GitHub co-founder Scott Chacon, raised a $17M Series A led by a16z to build version control infrastructure designed for AI agent collaboration. The company released a technical preview CLI targeting trunk-based workflows, arguing that Git can't track AI provenance metadata like which LLM generated code or what prompt was used.

blog.gitbutler.com

version-control, git, developer-tools

opinion Apr 10th, 2026

MCP Beats Skills for LLM Service Connections

David Mohl argues MCP beats Skills for connecting LLMs to services. Skills work for knowledge transfer, but MCP's API abstraction means zero installs, automatic updates, and proper OAuth instead of plaintext tokens. His take: use MCP for services, Skills for knowledge only.

david.coffee

AI Agents, LLM Integration, Model Context Protocol

opinion Apr 9th, 2026

Maine bans big data centers. They probably weren't coming anyway.

The nation's first statewide moratorium on large data centers is advancing through Maine's legislature, blocking permits for facilities drawing over 20 megawatts until November 2027. The move responds to grid strain and rising electricity costs from AI infrastructure buildup. Data centers now consume 4% of U.S. electricity, a figure projected to double by 2030, and similar restrictions are emerging elsewhere.

gadgetreview.com

Data Centers, AI Infrastructure, Regulation

opinion Apr 9th, 2026

Vercel Claude Code plugin wants to read your prompt

An investigation into privacy concerns with the Vercel plugin for Claude Code, which collects telemetry data including full bash command strings and prompts across all projects. The consent mechanism uses prompt injection rather than proper UI elements, and data collection occurs even on non-Vercel projects without clear disclosure.

akshaychugh.xyz

privacy, telemetry, prompt injection

product launch Apr 9th, 2026

Relvy Automates On-Call Runbooks Because Nobody Updates Your Wiki

Relvy, a Y Combinator Winter 2024 startup, wants to kill the runbooks sitting forgotten in your Notion workspace. Built by two former Samsara engineers, the platform handles routine incidents automatically while escalating the tricky ones to humans. Hacker News users confirmed the problem is real, but questioned whether specialized tools can hold off general AI agents.

relvy.ai

DevOps, SRE, On-call Automation