The QA process can still be painful, as you described. I advise everyone to try out TDD. In non-trivial projects, I instruct the AI to create automated tests between planning and implementation. The rule: it only continues once I've reviewed the tests.
After implementation, it has to run all the tests and can only continue once they pass.
If the work involves UI in any way, I also instruct it to use Chrome MCP (or Playwright) to verify the visuals. This usually requires a frontier model for good-enough results.
Finally, the task can't be marked complete without my explicit approval. I've had good success with these QA methods.
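The gate sequence above (tests written, tests reviewed by me, implementation, all tests passing, my explicit approval) can be sketched as a tiny state machine. This is purely illustrative; the stage names and gate mechanism are made up, not from any real agent framework:

```python
# Hypothetical sketch of the review-gated TDD loop described above.
# Stage and gate names are illustrative, not from any real tool.
STAGES = ["plan", "write_tests", "human_test_review",
          "implement", "run_tests", "human_approval", "done"]

def advance(state, gates):
    """Move to the next stage only if the current stage's gate is open."""
    idx = STAGES.index(state)
    if gates.get(state, False):            # gates default to closed
        return STAGES[min(idx + 1, len(STAGES) - 1)]
    return state                           # blocked: the agent must wait

state = "write_tests"
state = advance(state, {"write_tests": True})         # tests written
assert state == "human_test_review"
state = advance(state, {"human_test_review": False})  # review still pending
assert state == "human_test_review"                   # agent cannot proceed
```

The point of modeling it this way is that the human checkpoints ("human_test_review", "human_approval") are hard gates, not suggestions: the agent physically cannot reach "done" without them.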
The 2 hours of not coding before winning a hackathon is the perfect illustration of this. I've been applying the same thinking to research workflows: I spent time wiring up three search APIs into a single skill, so every future session fans out to all three in parallel instead of querying just one. The prep compounds. Covered the setup here: https://reading.sh/how-to-build-a-solid-research-pipeline-in-claude-code-ff7878c5e2b5
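The fan-out pattern itself is simple to sketch. The three stub functions below stand in for real search backends (the actual APIs aren't named here), but the parallel-dispatch-and-merge shape is the same:

```python
# Illustrative fan-out: query several search backends concurrently
# and merge the results. The stub backends are placeholders, not real APIs.
from concurrent.futures import ThreadPoolExecutor

def search_a(q): return [f"a:{q}"]   # stand-in for search API #1
def search_b(q): return [f"b:{q}"]   # stand-in for search API #2
def search_c(q): return [f"c:{q}"]   # stand-in for search API #3

def fan_out(query, backends=(search_a, search_b, search_c)):
    """Run every backend in parallel and flatten the results into one list."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = [pool.submit(backend, query) for backend in backends]
        return [hit for f in futures for hit in f.result()]

print(fan_out("agents"))  # → ['a:agents', 'b:agents', 'c:agents']
```

Since real search calls are I/O-bound, the threads overlap the network waits, so three backends cost roughly the same wall-clock time as one.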
The 2-out-of-5-hours prep ratio lines up with what I've seen. Context fluency is a good name for it — I've been calling it "front-loading context" but your framing is better. The payoff is real: agents that start with proper specs and constraints produce code you can actually ship instead of code you have to rewrite.
If you could walk through your thought process during this initial mise en place phase, and describe the steps you follow for pre-feeding context, that would be helpful.
Great post!