
Agents can run multiple rounds of review and verification before you look. But ship to humans without looking yourself, and something subtle always feels off — because fully specifying a system that feels right to a human is hard.
Noah Smith's piece, "Your future job will be to keep AI on task," has some interesting thoughts.
I've had a lot of luck letting different agents do multiple rounds of review and verification before I look at things. But I've regretted it whenever I didn't look at all before other humans used the system I was working on, even with clear acceptance criteria explicitly checked off.
There's always something in the details that is a little off, mostly because it's hard to completely specify a system that feels right to a human, at least while humans are the target user. There's a lot of software where agents are the target user, and that's a different story. Even then, the agents are still working on behalf of something humans want.