Developer Matt Keeter has published a detailed account of using Anthropic's Claude Code to autonomously produce a first-draft x86-64 assembly backend for raven-uxn, his high-performance emulator of the Uxn virtual stack machine. The task involved porting approximately 2,000 lines of hand-written ARM64 assembly — work Keeter had originally deferred as "a challenge for someone else" after completing the ARM64 backend in late 2024. When no one took him up on it, he turned to Claude Code as an experiment in AI-assisted low-level systems programming, running the agent on a disposable Oxide Computer VM.
The agent operated largely without human intervention across three distinct phases: getting the codebase to compile by resolving assembly syntax issues, fixing failing unit tests opcode by opcode, and finally running a fuzz harness to catch correctness bugs the unit tests had missed. Keeter's direct involvement amounted to roughly 15 to 20 minutes of hands-on time over a few hours of wall-clock work, mostly spent noticing when the agent was idle, waiting for permission to execute a new command. The total cost came to approximately $29 on an enterprise plan. During debugging, the agent autonomously generated GDB scripts to probe interpreter state, a pattern Keeter, reading its verbose internal reasoning traces, described as "debugging like a goldfish with logorrhea."
The quality of the generated assembly was, by Keeter's assessment, middling. The agent confused caller and callee register save conventions, over-relied on the eax register, and avoided 8-bit and 16-bit operations without clear reason. Despite these deficiencies, the implementation passed both unit tests and fuzz testing — giving Keeter a working foundation to refine. After human cleanup, the x86-64 backend achieved roughly a 2 to 2.5 times speedup over the Rust implementation. That result matters for raven-uxn's intended deployment targets: x86-64 Linux servers and Oxide Computer hardware, where an ARM64-only backend was a practical obstacle.
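For readers unfamiliar with the convention the agent mixed up: under the System V AMD64 ABI used on x86-64 Linux (raven-uxn's stated target), caller-saved registers like rax may be clobbered freely but do not survive calls, while callee-saved registers like rbx must be spilled and restored by any function that uses them. A minimal reference sketch, with an illustrative helper name (`is_callee_saved`) that is not from raven-uxn:

```rust
// Quick reference for the System V AMD64 calling convention.
// A backend that clobbers a callee-saved register without
// push/pop'ing it silently corrupts its caller's state.

fn is_callee_saved(reg: &str) -> bool {
    // Callee-saved (non-volatile) general-purpose registers.
    matches!(reg, "rbx" | "rbp" | "rsp" | "r12" | "r13" | "r14" | "r15")
}

fn main() {
    // rax is caller-saved: free to overwrite, but not preserved
    // across any call the generated code makes.
    assert!(!is_callee_saved("rax"));
    // rbx must be saved and restored by code that uses it.
    assert!(is_callee_saved("rbx"));
    println!("convention check ok");
}
```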
Keeter attributes the agent's success to two structural factors: a comprehensive test suite and fuzz harness that enabled automated feedback loops, and the <a href="/news/2026-03-14-llm-code-translation-harnesses-economics">translation-flavored nature of the task</a>, since a complete ARM64 reference implementation made the work far more tractable than writing from a high-level specification would have been. His post is one of the more technically grounded published accounts of where agentic coding tools add real value — getting a bounded, well-tested problem from nothing to a working draft — and where human expertise is still required to clean up the result.
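The feedback loop Keeter credits can be sketched as differential testing: run the reference implementation and the ported backend on the same randomized input and assert the resulting machine states agree. The following is an illustrative harness, not raven-uxn's actual code; `Vm`, `step_reference`, and `step_ported` are made-up stand-ins, and a real fuzzer would supply the inputs rather than the simple linear congruential generator used here:

```rust
// Differential fuzz sketch: the reference interpreter is the oracle
// against which a ported backend is checked, state for state.

#[derive(Clone, Debug, PartialEq)]
struct Vm {
    stack: Vec<u8>,
    pc: u16,
}

// Reference semantics for a toy ADD opcode: pop two bytes, push sum.
fn step_reference(mut vm: Vm) -> Vm {
    let b = vm.stack.pop().unwrap();
    let a = vm.stack.pop().unwrap();
    vm.stack.push(a.wrapping_add(b));
    vm.pc = vm.pc.wrapping_add(1);
    vm
}

// Stand-in for the assembly backend; in a real harness this would
// call into the generated machine code instead.
fn step_ported(vm: Vm) -> Vm {
    step_reference(vm)
}

fn main() {
    // Cheap pseudo-random driver in place of real fuzzer input.
    let mut seed: u32 = 0x2545_F491;
    for _ in 0..1_000 {
        seed = seed.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
        let a = (seed >> 8) as u8;
        let b = (seed >> 16) as u8;
        let vm = Vm { stack: vec![a, b], pc: 0 };
        let expected = step_reference(vm.clone());
        let actual = step_ported(vm);
        assert_eq!(expected, actual, "divergence on inputs {a}, {b}");
    }
    println!("1000 cases agree");
}
```

The key property is that any opcode-level divergence surfaces as a concrete failing input, which is exactly the kind of tight, automatable signal an agent can iterate against without human supervision.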