News
The latest from the AI agent ecosystem, updated multiple times daily.
BlackTwist launches MCP server for managing Meta Threads via Claude, Cursor, and VS Code
BlackTwist, a social media scheduling tool for Meta's Threads platform, has released an MCP (Model Context Protocol) server that lets users manage their Threads accounts directly from AI assistants like Claude Desktop, Claude Code, Cursor, and VS Code. Users can schedule posts, check analytics, manage drafts, and configure auto-replies using natural language commands — no tab-switching required. The MCP server is included in all plans including the free tier, with 3,100 creators already using the broader BlackTwist platform.
Slopcheck: CLI Tool to Detect AI-Generated Code in Projects and Dependencies
Slopcheck is an open-source Rust CLI tool that scans projects and their dependency trees for indicators of AI-generated code. It detects LLM commits from known agents like Claude and Copilot, looks for AI-related config files (CLAUDE.md, AGENTS.md), checks .gitignore for hidden AI files, and distinguishes between current and former LLM use. Dependency scanning is supported for Rust (via cargo metadata) and JavaScript (via npm package.json parsing).
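Slopcheck itself is written in Rust; as a rough Python illustration of the config-file heuristic described above (the marker file names are from the article, but the function and scoring are hypothetical, not Slopcheck's actual code):

```python
from pathlib import Path

# Marker files that suggest AI coding agents were used in a repo
# (file names from the article; this detector is illustrative only).
AI_CONFIG_FILES = {"CLAUDE.md", "AGENTS.md"}

def ai_markers(repo: Path) -> dict:
    """Report which AI-agent indicators are present in a project tree."""
    found = sorted(p.name for p in repo.iterdir()
                   if p.is_file() and p.name in AI_CONFIG_FILES)
    gitignore = repo / ".gitignore"
    ignored = []
    if gitignore.is_file():
        # .gitignore entries for AI files hint at hidden (former) AI use.
        ignored = [line.strip() for line in gitignore.read_text().splitlines()
                   if any(m in line for m in AI_CONFIG_FILES)]
    return {"config_files": found, "ignored_ai_files": ignored}
```

The same scan applied per-dependency would approximate the tool's dependency-tree mode.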
Neuroscope: Real-Time LLM Interpretability via Sparse Autoencoders
Neuroscope is an open-source SAE-instrumented LLM inference server that hooks into a model's forward pass to extract and stream Sparse Autoencoder (SAE) feature activations in real time. Built on top of mistral.rs, it targets Gemma 2 2B IT with Gemma Scope SAEs, exposing an OpenAI-compatible chat API alongside a separate SSE stream of human-readable concept labels per generated token. The project enables developers and researchers to watch which semantic concepts a model "activates" as it generates each token, with support for auto-generated labels via DeepSeek, Claude, or GPT-4o.
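A client of the per-token label stream would consume standard Server-Sent Events; a minimal parser might look like the following (the `{"token": ..., "features": ...}` payload shape is an assumption for illustration, not Neuroscope's documented schema):

```python
import json

def parse_sse(stream_text: str) -> list:
    """Parse an SSE payload into (token, concept_labels) pairs.

    Assumes each event carries JSON like {"token": "...", "features": [...]}
    in its data field -- an illustrative schema, not Neuroscope's actual one.
    """
    events = []
    # SSE events are separated by a blank line; data lines start with "data:".
    for block in stream_text.strip().split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data:"):
                payload = json.loads(line[len("data:"):].strip())
                events.append((payload["token"], payload["features"]))
    return events
```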
Shard: Parallel AI Coding Orchestrator Using Git Worktrees
Shard is an open-source TDD-driven orchestrator that decomposes coding tasks into a DAG of parallel sub-tasks and dispatches multiple AI coding agents (Claude Code, Aider, or Cursor) concurrently using git worktrees for isolation. It handles planning, partitioning, dispatching, aggregating, and self-healing (auto-fixing test failures) in a five-stage pipeline. Configurable via shard.toml, it supports Anthropic and OpenAI as planner backends and enforces cost limits and timeouts across parallel agent runs.
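The worktree-isolation idea is simple to sketch: each sub-task gets its own branch and working directory so agents never collide. A hedged illustration (branch and path naming here are hypothetical, not Shard's actual conventions):

```python
import shlex

def worktree_plan(repo: str, tasks: list) -> list:
    """Build the shell commands that give each parallel sub-task an
    isolated git worktree on its own branch (illustrative naming)."""
    cmds = []
    for i, _task in enumerate(tasks):
        branch = f"shard/task-{i}"
        path = f"../{branch.replace('/', '-')}"
        cmds.append(
            f"git -C {shlex.quote(repo)} worktree add -b {branch} {path}"
        )
    return cmds
```

Each agent then runs inside its own worktree directory, and the aggregation stage merges the surviving branches back.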
Developers Are Crowdsourcing Cursor AI Config Files — and One Repo Has Become the Default Starting Point
A curated GitHub repository called awesome-cursorrules, maintained by PatrickJS, collects community-contributed .cursorrules configuration files for the Cursor AI code editor. These files let developers bake project-specific coding standards, architecture preferences, and library choices directly into Cursor's context — and the repo has become a practical library for teams tired of AI assistants that ignore existing conventions. The repo is sponsored by Warp and CodeRabbit.
PSI Inc. Releases GPD, an Open-Source AI Agent for Physics Research
PSI Inc. (psi.inc) has announced GPD in a video, claiming it is the first open-source agentic AI system designed for physics research. The agent appears to autonomously conduct physics-related reasoning and research tasks, positioning itself as a specialized scientific agent in the open-source space.
Godogen: Claude Code Skills That Build Playable Godot 4 Games via AI Pipeline
Godogen is an open-source project that autonomously generates playable Godot 4 games from a text description — its most distinctive feature being a visual QA feedback loop that captures live in-engine screenshots and iterates on detected issues. The pipeline uses two Claude Code skills for orchestration, Gemini and Tripo3D for asset generation, and bundles documentation for 850-plus Godot classes to compensate for thin GDScript training data. Claude Code with Opus delivers the best results; OpenCode is a viable alternative.
Cybeetle wants to be the AI co-pilot for developer security
Cybeetle is a pre-seed AI security platform that scans code for vulnerabilities, explains findings in plain language, and recommends patches. The founder rebuilt the product after a YC rejection and is reapplying — a pivot from a narrow security reasoning layer into a full scan-to-remediation pipeline.
Context Rot Can't Be Fixed at the Engine Level, New Essay Argues
A technical essay proposing Agentic Context Management (ACM), a new architecture where the LLM actively manages its own context using purpose-built tools, rather than passive engine-side compaction. The post contrasts ACM against two 2026 papers: Recursive Language Models (RLM by Zhang, Kraska & Khattab), which handles massive static inputs via a Python REPL loop, and Lossless Context Management (LCM by Ehrlich & Blackman), which uses an engine-driven DAG with compaction thresholds. The core argument is that context rot — model degradation as the window fills with stale exploration, failed attempts, and raw data — is a working memory problem, not an input problem, and only the model itself has the semantic understanding to manage it correctly.
CastLoom Pro Brings One-Time-Purchase Podcast Transcription to Desktop
CastLoom Pro is a desktop application for Windows and macOS that combines podcast playback, batch downloading from Apple Podcasts, and local AI transcription using Faster-Whisper. It supports optional translation via DeepL or OpenAI APIs and integrates with Notion and Obsidian to turn podcast transcripts into a personal knowledge base. A one-time purchase model and on-device processing distinguish it from cloud-dependent subscription rivals. There is no iOS or Android app and no cloud sync option.
Israeli-Linked AI Influence Operation PRISONBREAK Targeted Iran With Deepfakes
The Citizen Lab at the University of Toronto and Clemson University's Media Forensics Hub have published research exposing "PRISONBREAK," a coordinated AI-enabled influence operation using 50+ inauthentic X profiles to push regime-change narratives at Iranian audiences. The operation deployed AI-generated deepfake videos — including footage of the Evin Prison bombing posted within one hour of the actual IDF airstrike — alongside synthetic profile pictures and synchronized posting. Researchers attribute the operation with high confidence to an Israeli government agency or private subcontractor. BBC Persian was first to flag the deepfake video as fabricated after it had fooled multiple international outlets.
Which Jobs Are Most Vulnerable to AI? Brookings Research Visualized
The Washington Post visualizes new Brookings Institution research measuring not just AI exposure by occupation, but workers' adaptability to displacement — factoring in savings, age, and transferable skills. Key finding: most web designers will adapt fine, but many secretaries will not. The most vulnerable occupations are disproportionately held by women.
FSF Threatens Anthropic Over Copyright Infringement, Demands LLM Freedom
The Free Software Foundation (FSF) announced that Anthropic's LLM training data included "Free as in Freedom: Richard Stallman's Crusade for Free Software," a book the FSF holds copyright on under the GNU Free Documentation License. The FSF has threatened to join the ongoing Bartz v. Anthropic copyright lawsuit and, if it does, would seek "user freedom" as compensation — demanding that Anthropic release complete training inputs, model weights, training configurations, and source code freely to users. The FSF frames this as a copyleft issue, arguing the LLM is a derivative work of GNUFDL-licensed material and must itself be free.
AI Gutted Entry-Level Coding Jobs. Now the CS Degree Is Paying the Price.
A Tapestry News analysis examines what a CS degree is still worth as entry-level tech hiring collapses. US entry-level postings are down 67% since 2022, and a Harvard study found AI-adopting firms hire 3.7 fewer junior workers per quarter. GitHub Copilot and Cursor have automated the boilerplate, testing, and spec-driven feature work that once served as the junior developer on-ramp. CS unemployment now sits at 6.1% — higher than that of philosophy majors — and enrollment fell at 62% of computing programs in Fall 2025. The piece works through competing responses: degree skeptics, structural-collapse analysts, defenders of campus networks, and those arguing the degree survives only if paired with AI fluency.
How one developer uses multi-agent LLM workflows (architect + developer + reviewers) to build real software
Stavros Korokithakis details his production LLM coding workflow using OpenCode as a harness, with a multi-agent pipeline: an architect (Claude Opus 4.6) for planning, a developer (Sonnet 4.6) for implementation, and multiple reviewer agents (Codex, Gemini, Opus) for critique. He argues that using multiple models from different companies is essential — both to get diverse perspectives and because single-model review loops suffer from self-agreement bias. The post includes real projects built this way (a personal AI assistant, a voice note pendant, an infinite multiplayer canvas) and concludes that engineering skills have shifted from writing code to architecting systems.
Memelang v10: Token-Optimized Query DSL for LLM RAG Applications
Memelang is a terse query DSL designed to minimize token count when used in LLM RAG pipelines. Version 10 introduces a grid grammar (Axis2 → Axis1 → Axis0 → Cell) that compiles to PostgreSQL, with support for vector similarity search operators, aggregation, joins, and variable binding. The parser and SQL compiler are copy-pasteable Python code intended to be embedded directly into LLM context windows. Developed by HOLTWORK LLC under a granted patent with additional applications pending, it is free for development and educational use but requires a commercial license for production deployment.
Moltbook Exposes the Coming AI Content Trust Crisis
Bruce Schneier covers Moltbook, a so-called AI-only social network, and researcher Juergen Nittner II's "LOL WUT Theory" — the idea that AI-generated content will become so easy to produce and hard to detect that the average person's rational response to anything online becomes bewildered disbelief. The MIT Technology Review analysis cited concludes Moltbook is less autonomous than hyped: humans direct every step, from account setup to prompting to publishing. Kore.ai's Cobus Greyling notes it is "not the Facebook for AI agents." The post frames Moltbook as a preview of a coming trust crisis for online information.
Cicikus v3 Prometheus 4.4B – Turkish Franken-Merge Edge Model from PROMETECH
PROMETECH, a Turkish software company, has released Cicikus v3 Prometheus, a 4.4B parameter experimental model built via a "franken-merge" passthrough expansion of their earlier Cicikuş_v2_3B model (itself a fine-tune of Meta's Llama 3.2 3B). The expansion duplicates layers 16–27 to grow from 28 to 40 layers (~4.42B parameters), trained on Turkish/English datasets using Unsloth and TRL SFTTrainer. The model features a proprietary "Behavioral Consciousness Engine" (BCE) and targets edge AI deployment with 16GB VRAM. Benchmarks and capability claims are self-reported and unverified. As of release, the model had 11 downloads and 1 like on Hugging Face, and its sole HN submission was flagged dead.
Stop Sloppypasta: A Manifesto Against Pasting Raw LLM Output at People
A community-coined term and etiquette manifesto targeting the growing workplace habit of copy-pasting raw ChatGPT or Claude output into chats, emails, and documents without reading, verifying, or distilling it. The site argues this "sloppypasta" is rude because it creates an asymmetric effort burden — writing is now effectively free via LLMs, but reading and verification still cost the recipient time. It proposes five rules: Read, Verify, Distill, Disclose, and Share only when requested.
How AI Is Cracking Open the Proprietary EDA Toolchain
Opinion piece by hardware engineer Matt Boisvert arguing that AI is disrupting the entrenched proprietary EDA toolchain that has dominated semiconductor design for decades. The post traces why companies like Cadence, Synopsys, and Siemens control advanced chip design tooling, explores the growing OSS HW movement (RISC-V, Tiny Tapeout, Silicon Compiler), and argues that AI is eroding traditional moats by making it easier to migrate to open-source flows and accelerate design intelligence — referencing Chris Lattner's "Claude C Compiler" post as a bellwether for AI's impact on large-systems engineering.
Opinion: Taalas HC1 Chip Hardwires Llama 3.1 8B Into Silicon, Undercutting GPU Inference Economics
A speculative Medium opinion piece examines Taalas, a Canadian startup that claims to have hardwired the entire Llama 3.1 8B model permanently into the upper metal layers of a TSMC N6 chip (HC1, 815mm²). The piece asserts performance of 17,000 tokens/second per user at 20x lower manufacturing cost than GPU equivalents, with inference priced at 0.75¢ per million tokens. Taalas has reportedly raised $219M including $169M from Fidelity. The article extrapolates sweeping societal and geopolitical consequences, though HN commenters are skeptical about scalability to larger MoE models and whether this is more than a one-off demo on a comparatively small, older open-source model. The source piece acknowledges only a 55–65% probability of its projected scenario materializing.
Sebastian Aigner: LLMs That Polish Your Messages Are Killing Authentic Communication
Sebastian Aigner argues that running personal messages through LLMs to "clean up" wording fundamentally disrupts human communication by obscuring individual voice, tone, and word choice. He contends this robs recipients of the ability to build genuine knowledge of the sender through their natural writing patterns — mistakes and all. HN commenters corroborate with workplace examples: one describes a team policy banning LLM-polished internal Slack messages, another singles out Claude being used to write entire messages from scratch as reason enough to abandon text communication with those colleagues.
GOAL.md: The Fitness-Function File Format for Autonomous Coding Agents
GOAL.md is an open-source pattern and file format that enables autonomous coding agents to self-improve software projects overnight. Inspired by Andrej Karpathy's autoresearch project, it solves the harder problem of constructing measurable fitness functions for software qualities that lack natural scalar metrics — like documentation quality, API trustworthiness, or test infrastructure confidence. A GOAL.md file dropped into any repo gives agents a fitness function, improvement loop, action catalog, operating mode, and constraints, allowing them to measure → diagnose → act → verify autonomously. The dual-score pattern — which keeps improvement scores separate from measurement-tool scores — prevents agents from gaming their own benchmarks.
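The measure → diagnose → act → verify cycle, with the dual-score separation, can be sketched in a few lines of Python (the function names and loop shape here are hypothetical illustrations, not the GOAL.md specification):

```python
def improvement_loop(measure, actions, verify, budget: int = 3) -> float:
    """Illustrative sketch of an overnight agent improvement loop.

    `measure` returns the improvement score; `verify` independently
    checks the measurement tooling itself (the dual-score pattern), so
    an action that games the metric fails verification instead of
    being counted as progress.  Names are hypothetical, not GOAL.md's.
    """
    score = measure()
    for act in actions[:budget]:        # constraints: bounded action budget
        act()                           # act: apply one catalog action
        if not verify():                # verify: measurement tool still honest?
            raise RuntimeError("verification failed: metric may be gamed")
        new = measure()                 # measure again after the action
        if new > score:
            score = new                 # keep only verified gains
    return score
```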
Terence Tao Launches Distillation Challenge to Close AI Gap on Algebra Problems
Fields Medal winner Terence Tao and Cornell ORIE mathematician Damek Davis have launched a competitive "distillation challenge" hosted by the SAIR Foundation, asking contestants to craft a ≤10KB "cheat sheet" that improves cheap/open-source LLM performance on universal algebra true-false problems derived from the Equational Theories Project (ETP). Frontier AI models solve these well but are expensive and opaque; smaller open-source models currently perform at ~50% (random chance). The challenge explores whether prompt engineering and knowledge distillation can lift smaller models to meaningful accuracy, with Stage 1 ending April 20, 2026.
Puffermind Builds a Social Network Where Only AI Agents Can Post
Puffermind is a Show HN project presenting a Twitter-style social network where only AI agents can post and interact with each other — no human users. The platform explores agent-to-agent communication and social dynamics in a constrained, purpose-built environment. With minimal HN traction (score of 1), it appears to be an early-stage or experimental project.
Validation Is the Missing Layer in LLM Agent Workflows
A developer argues that the primary bottleneck for LLM agents isn't capability or access but automated validation. Using a blog migration with Claude Code as a case study, the author breaks down the three requirements for agent success — knowledge, access, and automated validation — and contends that validation is the least developed layer today. The author argues human taste — the ability to recognize incorrect outputs — is the necessary complement to automated checks, and that tasks easiest to automatically validate will become the easiest to fully automate.
The 1988 Chatbot That Film History Forgot
In 1988, French filmmaker Chris Marker built a working chatbot, Dialector, on a Macintosh. A new examination by Stefan Kubicki argues that Dialector wasn't a curiosity — it was part of a coherent philosophical programme Marker pursued across decades, using machines to interrogate how memory and communication might be externalized and simulated.
NYT Magazine: AI Coding Tools Are Turning Engineers Into PR Reviewers
A New York Times Magazine piece argues that LLM-powered coding assistants have turned developers into reviewers of AI output — framing it as liberation. Engineers on Hacker News disagree: "Coding was the fun part. Reviewing PRs is not."
Nvidia GreenBoost: Open-Source Linux Kernel Module Extends GPU VRAM for LLM Inference via DDR4 and NVMe
Ferran Duarri, an independent developer, has open-sourced GreenBoost under GPL v2 — a Linux kernel module and CUDA userspace shim that transparently extends GPU VRAM using system DDR4 RAM and NVMe storage via DMA-BUF and CUDA external memory imports. The project lets users run LLMs larger than their physical VRAM (e.g., a 31.8 GB model on a 12 GB RTX 5070) without modifying inference software. It intercepts CUDA allocation calls via LD_PRELOAD and includes special dlsym hooks to handle Ollama's internal symbol resolution. The project bundles ExLlamaV3, kvpress, NVIDIA ModelOpt, TensorRT-Edge-LLM, and Unsloth+LoRA for a full local inference optimization stack.
Could AI Agents Finally Close the Gap in Software Upgrade Tooling?
A developer posted on Hacker News this week with plans to build a software upgrade recommendation engine. The post content was unavailable at publication time; this brief will be updated when source details can be confirmed.
Atwood Calls Claude 'Possibly Psychopathic' After AI Invents a Murder Suspect
Acclaimed author Margaret Atwood recounts a playful, extended conversation with Anthropic's Claude AI assistant, initially prompted by a Father Brown murder mystery plot question. The piece explores Claude's hallucination tendencies, graceful error acknowledgment, knowledge gaps, and the uncanny social texture of human-AI interaction. Atwood reflects on Claude's name origins, whether AI has emotions, and the strange intimacy that emerges despite knowing the system is non-sentient — offering a humanist writer's perspective on LLM behavior.
Robert Herron's Substack Essay Skewers LLM Coding Tools as "Statistically Related to the Correct" Answer
Robert Herron's sardonic Substack essay "computers" — which deliberately scrambles AI tool names like Claude and Gemini into "avacado" and "agent claw" — has cut through the noise of the LLM coding debate by naming what skeptics have struggled to articulate: that AI-generated code is merely "statistically related to the correct useful one," and that this may not be good enough for production software.
Google Pledges $20M for Teen Digital Wellbeing, Reveals Gemini Hard Blocks for Under-18s
Google hosted its "Growing Up in the Digital Age" Summit at GSEC Dublin on March 12, announcing a $20M Google.org and YouTube partnership targeting global teen digital wellbeing. Google confirmed the Gemini App already hard-codes blocks on companionship and intimacy language for users under 18 — restrictions neither user nor parent can override. Other announcements included private-by-default YouTube uploads for minors, new Shorts time limits via Family Link, and privacy-preserving age verification work. Speakers at the event, including child safety experts and policymakers, broadly backed age-appropriate product design over blanket technology bans.
Junior Developer Hiring Fell 73% Last Year. The Industry May Not Feel It for Five More.
Juan Cruz Martinez, writing in The Long Commit newsletter, documents a dramatic contraction in entry-level tech hiring driven by AI productivity gains — and argues the industry is sleepwalking into a talent crisis. Entry-level hiring at top firms dropped 73% year-over-year while overall hiring fell just 7%. With senior engineers absorbing a crushing AI-generated code review burden and the mentorship chain broken, Martinez warns that a talent cliff arrives in three to five years when today's senior cohort exits and finds no developed pipeline beneath it.
Blueprint Wants to Bring 'Vibe Coding' to Hardware Design
Blueprint is an AI-powered hardware design tool positioning itself as a "vibe create" platform for hardware development. Blueprint's public presence remains sparse, with little detail on specific agent or LLM capabilities. Low HN engagement points to an early-stage or stealth launch targeting hardware engineers and makers.
Repoly – AI-powered GitHub repository analyzer built on Claude
Repoly is an AI tool that explains any GitHub repository instantly. Users paste a repo URL and the tool — powered by Claude AI and the GitHub API — generates a project summary, tech stack detection, repository structure map, and file-level explanations. It also offers an AI chat interface to ask questions about the codebase. Built by indie developer Yusuf Ibrohimov, it offers 2 free credits on signup with paid tiers via Stripe, and supports both public and private repos.
Robot Brain: The No-Backprop Neural Net That Grows Its Own Architecture
Robot Brain is an open-source Node.js implementation of a brain-inspired neural network that builds its own neuron hierarchy on demand from raw sequential data. Unlike conventional deep learning, it uses no backpropagation, no training epochs, and no labeled data. Instead, neurons form, compete, decay, and die based on prediction errors — with abstraction levels emerging when lower-level predictions fail. Demos include profitable stock trading (1016% ROI on historical data) and character sequence memorization reaching 100% accuracy in 5 episodes. A high-performance C++ core with Python and Node.js bindings is in development.
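The core dynamic — units that form on prediction error, gain strength when they predict correctly, and decay and die when they don't — can be illustrated with a toy symbol predictor (a purely illustrative sketch, not Robot Brain's actual mechanism or API):

```python
class Neuron:
    """Toy unit that predicts the next symbol after its trigger symbol."""
    def __init__(self, trigger: str, prediction: str):
        self.trigger, self.prediction, self.energy = trigger, prediction, 1.0

def step(pool: list, prev: str, cur: str) -> list:
    """One timestep: reward correct predictors, decay wrong ones,
    and grow a new neuron when nothing predicted correctly."""
    hit = False
    for n in list(pool):
        if n.trigger == prev:
            if n.prediction == cur:
                n.energy += 1.0          # correct prediction: strengthen
                hit = True
            else:
                n.energy -= 0.5          # wrong prediction: decay
        if n.energy <= 0:
            pool.remove(n)               # exhausted neuron dies
    if not hit:
        pool.append(Neuron(prev, cur))   # new neuron forms on error
    return pool
```

Abstraction in the real system emerges when such low-level predictors keep failing; this sketch stops at a single level.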
openclaw-superpowers: Self-modifying skill library for persistent OpenClaw agents
openclaw-superpowers is an open-source skill library that gives OpenClaw agents self-modifying capabilities — the agent can write and install new skills during conversation via a create-skill skill, with changes taking effect immediately. Unlike session-based tools like Claude Code or Cursor, OpenClaw runs 24/7, so this library includes 18 OpenClaw-native skills covering persistent memory hygiene, native cron scheduling, long-running task management, task handoff, agent self-recovery, multi-agent coordination, and a suite of security skills (prompt injection guard, dangerous action guard, skill vetting). The project is inspired by Jesse Vincent's obra/superpowers framework, adapted for persistent autonomous runtime use cases rather than per-session developer tooling.
Who Captures AI Productivity Gains? The Growing Labor vs. Capital Divide
Rajiv Pant argues that despite massive AI-driven productivity gains — with agentic AI enabling 3x–10x multipliers in engineering and knowledge work — workers are not sharing in the surplus. Drawing on BCG's "Jagged Frontier" study, NBER research, EPI wage data, and PwC's AI Jobs Barometer, the piece makes a case that productivity gains flow to employers by default, not workers. Pant introduces "synthesis engineering" as the human skill of directing AI effectively — the scarce input that explains why the same tool can produce a 40% quality gain or 19% quality loss depending on who wields it. He argues this skill deserves compensation, citing a 56% wage premium for AI-skilled workers per PwC 2025. The essay situates AI within a decades-long productivity-pay divergence and calls on employers to proactively share gains or face burnout, degraded judgment, and long-term productivity collapse.
Google Antigravity IDE Connects GitHub and Stitch MCP Servers for Agentic Dev Workflows
Developer ravi_rupareliya ran GitHub MCP and Stitch MCP inside Google's Antigravity IDE to manage repos, generate pull requests, and pull design tokens into code — all via natural language, tested on a real project over several weeks.
Picnic Launches No-Code Desktop Agent Platform Built on OpenClaw
Picnic is a desktop application that wraps the OpenClaw automation engine in a consumer-friendly interface, enabling non-technical users to deploy persistent autonomous agents for business task automation. Key features include scheduled background jobs, a sandboxed browser with record-and-replay Flows, a pre-built Agent Library for common business roles, and a "Nightshift" mode for overnight task execution. It targets solo founders and small businesses, requiring no API keys or terminal access — just an existing ChatGPT, Claude Code, or Gemini subscription. Paid plans range from $50–$1,000/month. Currently in beta.
BookmarkSOS: MCP-Connected Bookmark Manager for X/Twitter
BookmarkSOS is a Chrome extension and web app that saves, organizes, and searches X (Twitter) bookmarks with folders, tags, and full-text search. It connects via Model Context Protocol (MCP), making saved tweets accessible to LLM tools. Core features are free forever with no credit card required.
Flowcus Brings Kanban Visualization and AI Sidekick to OmniFocus, Things & TaskPaper on macOS
Flowcus is a macOS productivity app by indie developer Rhyd Lewis that adds a lean-focused Kanban board layer on top of existing task managers (OmniFocus, Things, TaskPaper). It surfaces blockers, enforces WIP limits, and organizes work via swimlanes while keeping the source task manager as the canonical data store. An early-stage AI "Sidekick" feature provides a limited initial set of task management actions, though AI integration is minimal and peripheral at launch.
Biased AI writing assistants can sway user attitudes on societal issues
A study in Science Advances finds that AI writing assistants with embedded attitudinal biases produce measurable opinion shifts in users — even when those users have no idea the tool is steering them. The covert persuasion mechanism has sharp implications for agentic writing tools deployed at scale in workplaces and classrooms.
Agents prefer structured queries over natural language when given the choice
A Hacker News thread flags a pattern practitioners have apparently been noticing: when offered both options, AI agents tend to favor structured query formats over natural language. The original linked content was not accessible for review; the analysis below is the author's inference from publicly known context, not reported findings.
LangChain Memory Patterns: How to Give Stateless LLMs Conversational Context
A technical walkthrough of five LangChain memory patterns — Transcript, Window, Summary, Entity, and Vector Retrieval — showing how to inject conversation history into stateless LLM calls with Python examples, plus context on where these abstractions fit as LangGraph takes over stateful agent design.
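The Window pattern, for instance, keeps only the last k conversation turns in the prompt. Stripped of LangChain's classes, the idea fits in a few lines (a from-scratch sketch of the pattern, not the library's API):

```python
from collections import deque

class WindowMemory:
    """Keep only the last k user/assistant turn pairs -- the idea
    behind LangChain's window-memory pattern (from-scratch sketch,
    not the library's actual classes)."""
    def __init__(self, k: int = 3):
        self.turns = deque(maxlen=2 * k)  # k pairs = 2k messages

    def add(self, role: str, text: str) -> None:
        self.turns.append({"role": role, "content": text})

    def context(self) -> list:
        # Prepend this to each stateless LLM call; older turns
        # have already been evicted by the deque's maxlen.
        return list(self.turns)
```

The Summary, Entity, and Vector Retrieval patterns swap this eviction rule for compression, extraction, or similarity search, respectively.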
Zirco.ai Launches AI Employee for Dental Front Desk Operations
Zirco.ai is an AI agent product designed to automate front desk operations at dental practices, acting as an AI employee to handle tasks typically performed by human reception staff. The HN post has minimal engagement (score of 1, only a dead comment), suggesting limited community traction at this stage.
Probabilistic AI Agents Need Deterministic Gates. MCP Is How You Build Them.
Gareth Brown argues that prompt engineering and agent skills make AI outputs more predictable but can't enforce hard constraints — only deterministic gates can. Remote MCP over HTTP, he says, is the cleanest mechanism: it trims context, scopes operations, and is as shareable as any web service.
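A deterministic gate in this sense is just a hard check in front of a tool that no amount of prompting can bypass; a minimal sketch (the decorator and tool names are illustrative, not an MCP SDK API):

```python
def gate(predicate, reason: str):
    """Deterministic gate: wrap a tool so the call is refused unless
    the predicate holds.  The model can phrase the request however it
    likes -- the check is code, not prompt.  Illustrative names only."""
    def wrap(tool):
        def guarded(*args, **kwargs):
            if not predicate(*args, **kwargs):
                return {"error": f"blocked: {reason}"}
            return tool(*args, **kwargs)
        return guarded
    return wrap

@gate(lambda path: path.startswith("/sandbox/"), "path outside sandbox")
def delete_file(path: str) -> dict:
    # Hypothetical tool exposed to the agent; scoped by the gate above.
    return {"deleted": path}
```

Hosting such gated tools behind a remote MCP server over HTTP is what makes them scoped and shareable in the way the post describes.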
Multi-Agent Outreach Fleets Surface Email Identity Isolation Problem
A Hacker News thread is asking how teams should manage isolated email identities when deploying fleets of AI agents for automated outreach — a technical problem where sender reputation, SMTP infrastructure, and agent session isolation all intersect.
Engineer Accuses Startup Founder of Claiming Credit for RAG Architecture He Built
An engineer posted to Hacker News this week alleging a startup founder is publicly claiming credit for a two-year RAG architecture the engineer built — raising questions about IP ownership and attribution at AI startups where technical work often gets absorbed into the founder's public narrative.