Anthropic re-ran Project Fetch, the experiment in which employees who are not robotics experts try to operate an off-the-shelf robotic quadruped. This time Claude Opus 4.7 ran the tasks with no human in the loop and was about 20 times faster than the fastest human team was a year ago.
The Frontier Red Team frames a recurring arc: first models help humans, then humans help models, then models largely do things themselves. Anthropic says it watched the same pattern play out in cybersecurity. Last August, the older Opus 4.1 could not even work out how to connect to the robot on its own.
The caveat matters as much as the headline. The latest models still botch the literally physical part: precisely moving a beach ball. And none of the tasks touches low-level actuation policy. So this is agents wiring up and driving existing robot software far faster, not agents solving robotic control. The trend line, not the beach ball, is the story.