Notes from building with AI. What actually happens, not what the demos show.
My website has a chat that answers questions about my career. I never checked whether its answers were any good. So I built an eval pipeline. It caught the AI failing in four different ways, and it caught me once too.
A production website in one weekend, built with Claude Code. The honest version of what worked, what broke, and why every problem was a human problem.