100 Hours of Vibecoding: The Real Gap Between Prototype and Production

Mac Budkowski spent 100 hours building Cryptosaurus, a Farcaster mini-app that generates dinosaur-styled NFT profile pictures using AI image generation, and then documented every place the work piled up. The post-mortem ran on his Kanfa newsletter in March 2026 as a direct response to viral "built an app in 30 minutes with AI" claims. Budkowski isn't a skeptic — he was an early Claude adopter and built an initial Cryptosaurus prototype in under an hour using Claude in Plan Mode. The remaining 99 hours are the story.

UI and UX work through LLM prompting was slow. Budkowski notes that Figma would have been faster for the interface work. The bigger time sink was image generation: getting consistent, usable results across a wide range of real-world profile pictures took more than 200 prompt iterations across Claude, Gemini, and OpenAI's Codex. The final output was a 274-line prompt.ts file plus a supplementary Gemini script with programmatic guardrails to catch edge cases that prompt instructions alone couldn't constrain. No single model produced acceptable results across all inputs, creating an orchestration problem that prototype demos never surface.
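
The article does not publish its guardrail code, so the following is only a minimal sketch of the general pattern it describes: validate each generated image with programmatic checks and fall back to another model when the checks fail. Every name here (GeneratedImage, ImageGenerator, Guardrail, isSquare, notTiny, generateWithFallback) and every threshold is a hypothetical stand-in, not Budkowski's implementation.

```typescript
// Hypothetical metadata a guardrail can inspect without looking at pixels.
interface GeneratedImage {
  model: string;   // which backend produced the image
  width: number;
  height: number;
  bytes: number;   // encoded file size
}

// A backend call (assumed shape). Real calls would wrap a vendor SDK.
type ImageGenerator = (prompt: string) => Promise<GeneratedImage>;

// A guardrail returns null on pass, or a human-readable reason on failure.
type Guardrail = (img: GeneratedImage) => string | null;

const isSquare: Guardrail = (img) =>
  img.width === img.height ? null : `not square: ${img.width}x${img.height}`;

const notTiny: Guardrail = (img) =>
  img.bytes > 1024 ? null : `suspiciously small file: ${img.bytes} bytes`;

// Collect every failed check so that rejections can be logged and studied.
function failures(img: GeneratedImage, guardrails: Guardrail[]): string[] {
  return guardrails
    .map((check) => check(img))
    .filter((reason): reason is string => reason !== null);
}

// Try each backend in order, retrying a few times, until one output passes.
async function generateWithFallback(
  prompt: string,
  generators: ImageGenerator[],
  guardrails: Guardrail[],
  attemptsPerModel = 2,
): Promise<GeneratedImage> {
  const rejected: string[] = [];
  for (const generate of generators) {
    for (let attempt = 1; attempt <= attemptsPerModel; attempt++) {
      const img = await generate(prompt);
      const problems = failures(img, guardrails);
      if (problems.length === 0) return img;
      rejected.push(`${img.model} attempt ${attempt}: ${problems.join("; ")}`);
    }
  }
  throw new Error(`no acceptable image:\n${rejected.join("\n")}`);
}
```

Checks like these only cover properties that can be measured mechanically. Whether an output actually looks right is a different matter, which is presumably why the prompt itself still needed hundreds of iterations.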

Infrastructure consumed still more time, all of it work that demos skip entirely. Budkowski set up AWS S3 and Lambda, deployed an NFT smart contract to Base Mainnet with onlyMinter protections, and secured key management through a Safe multisig. Two days went into stress-testing concurrent usage limits. None of it caught the nonce bug that broke concurrent payment transactions on launch day. The bug appeared only under real load, despite multiple rounds of LLM-assisted testing beforehand.
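
The article does not describe the mechanics of the nonce bug or how it was fixed. One common failure of this kind is that several concurrent requests each ask the node for the account's next nonce (for example via the eth_getTransactionCount JSON-RPC method), all receive the same value, and all but one transaction is then rejected. The sketch below shows one possible mitigation under that assumption: read the nonce once and hand out later values locally. NonceAllocator and FetchNonce are hypothetical names, and the sketch assumes a single process sending from a single account.

```typescript
// Hypothetical: reads the account's next unused nonce from an RPC node.
type FetchNonce = () => Promise<number>;

class NonceAllocator {
  private readonly fetchNonce: FetchNonce;
  // Resolves to the next nonce to hand out; null until the first allocation.
  private upcoming: Promise<number> | null = null;

  constructor(fetchNonce: FetchNonce) {
    this.fetchNonce = fetchNonce;
  }

  // Each caller gets a distinct nonce, however many calls are in flight.
  // The chain is read only once; every later value is derived locally.
  allocate(): Promise<number> {
    const mine = this.upcoming ?? this.fetchNonce();
    // The reassignment happens synchronously, so two callers can never
    // observe the same `upcoming` promise.
    this.upcoming = mine.then((n) => n + 1);
    // Callers see a failed fetch through their own promise; this handler
    // only keeps the internal chain from counting as an unhandled rejection.
    this.upcoming.catch(() => {});
    return mine;
  }

  // After a failed fetch or send, or a restart, re-read the chain next time.
  reset(): void {
    this.upcoming = null;
  }
}

// Demo with a stubbed fetch that always reports 7 as the next nonce.
const demoAllocator = new NonceAllocator(() => Promise.resolve(7));
const demoNonces = Promise.all([
  demoAllocator.allocate(),
  demoAllocator.allocate(),
  demoAllocator.allocate(),
]);
demoNonces.then((nonces) => console.log(nonces.join(",")));
```

A naive version that awaits the node inside every request would hand the stubbed value 7 to all three callers. That kind of collision is hard to reproduce in sequential tests and shows up mainly under concurrent load.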

There's a structural asymmetry buried in the project: code generation models produce inspectable outputs you can patch and test. Generative image APIs don't. No regression harness, no deterministic control, no structured composition constraints. That made the image pipeline a different class of engineering problem than the code — harder to automate, harder to validate, and invisible in the prototype.
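
To make the asymmetry concrete: a code test can assert one exact expected value, while a nondeterministic image pipeline can at best be scored on how often its outputs satisfy some measurable property. The sketch below illustrates that weaker, statistical kind of check. It does not come from the article; ImageSample, passRate, regressed, and the 0.05 tolerance are illustrative assumptions.

```typescript
// Hypothetical record of one generated output and its measurable properties.
interface ImageSample {
  inputId: string; // which source profile picture was used
  width: number;
  height: number;
}

type SampleCheck = (sample: ImageSample) => boolean;

// Fraction of samples that satisfy the check, between 0 and 1.
function passRate(samples: ImageSample[], check: SampleCheck): number {
  if (samples.length === 0) throw new Error("no samples to evaluate");
  return samples.filter(check).length / samples.length;
}

// Flag a regression only when the rate drops by more than the tolerance,
// because run-to-run noise is expected with nondeterministic outputs.
function regressed(baseline: number, current: number, tolerance = 0.05): boolean {
  return current < baseline - tolerance;
}

const squareOutput: SampleCheck = (s) => s.width === s.height;

const batch: ImageSample[] = [
  { inputId: "a", width: 512, height: 512 },
  { inputId: "b", width: 512, height: 512 },
  { inputId: "c", width: 512, height: 384 },
  { inputId: "d", width: 512, height: 512 },
];

const currentRate = passRate(batch, squareOutput); // 3 of 4 samples pass
console.log(currentRate, regressed(0.9, currentRate));
```

A check like this can say that a prompt change made things worse on average. It cannot pin down a single failing case the way a unit test can.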

Budkowski's bottom line: AI still delivered a 10 to 100 times speed improvement over building from scratch without it. He also deliberately avoided platforms like Replit and Lovable so that he could work through the infrastructure complexity directly. The harder question his project surfaces is whether the gap between prototype and production closes anytime soon. Until image generation APIs offer the same testability as code (deterministic outputs, regression testing, structured constraints), multi-modal pipelines will keep carrying production costs that 30-minute demos never show.