Notizen aus der Praxis mit KI. Was wirklich passiert, nicht was die Demos zeigen.
My website has a chat that answers questions about my career. I never checked whether its answers were any good. So I built an eval pipeline. It caught the AI failing in four different ways, and it caught me once too.
A production website in one weekend, built with Claude Code. The honest version of what worked, what broke, and why every problem was a human problem.