Reading What the Agent Will Do — It Doesn't Always Understand
Yes, I was that guy thinking do as you think best would work. It doesn't. Here are the five plan mode claude habits that actually save me debugging time.
Yes, I was that guy thinking that just telling what I want and saying "do as you think best" would obviouse work. It didn't. By the end of week two I realyzed the plan mode claude shows you on screen is the only thing standing between you and a four-hour debugging session.
What plan mode claude actually shows you
Plan mode claude is the mode where the agent drafts a full plan — file edits, new files, deletions, test changes — and waits for approval before it touches anything. I used to skim the plan, hit approve, walk to the kitchen. I came back to a working feature about a third of the time. The other two thirds, the agent had quietly added a third util file, a wrapper around an existing wrapper, and a "for now" mock that survived three commits.
In the first two weeks of agentic coding I learned the agent is not your peer. It is a very fast, very literal collaborator. It follows instructions LITERALLY. It defaults to ADDITIVE behavior — adding a new file is cheap, deleting one feels risky, so it adds. It pattern-matches against training data. It does not model author intent. Plan mode claude is how you read that contract before you sign it.
A short glossary
- Plan mode
The Claude Code mode that drafts a full plan — files to create, edit, delete; tests to run — and waits for your explicit approval before executing.
- Agentic coding
An LLM autonomously writing and editing code across files, calling shell + file tools without asking line-by-line.
- LLM
Large Language Model — the trained model that produces the agent's output. Claude is the LLM here.
- CLI
Command-Line Interface. Claude Code is a CLI; you talk to it in a terminal, not a chat window.
- Tool use
When the LLM calls a defined tool (read_file, edit_file, run_shell) instead of returning text. See the Anthropic tool use docs and anthropic-sdk-python.
- Prompt
The instruction text you send. In agentic coding the prompt is the contract — vague prompt, additive output.
Five plan mode claude habits I actually use
Each habit costs about ten seconds at the prompt. Each saves me, on a bad day, two hours of debugging. These five replaced "do as you think best" after I watched the same failure modes repeat in the first two weeks of agentic coding.
Without the habit
- Name the file you want edited → New util file appears next to the old one
- Forbid additive scope → Wrapper-around-wrapper, three new helpers
- Specify the deletion → Old code stays as "deprecated, kept for compatibility"
- Pin the test contract → New tests added; failing tests skipped
- Name the library version → Pattern-matched against an older API
With the habit
- Name the file you want edited → Edit lands in the right file
- Forbid additive scope → One small change in place
- Specify the deletion → Old code actually gone
- Pin the test contract → Existing tests pass; new tests prove the change
- Name the library version → Code uses the API you actually have
Habit 1 — Name the file. Always.
Before:
add a debounce util to the search box
After:
edit src/components/search/SearchBox.tsx — replace the inline onChange handler with a 200ms debounce. do not create a new util file. use the existing useDebouncedValue hook in src/lib/hooks/useDebouncedValue.ts.
100% of the time the agent added a new src/lib/utils/debounce.ts when I left the file unnamed — even when a perfectly fine debounce hook was three folders away. The agent does not crawl the repo for prior art unless you tell it where to look.
Habit 2 — Forbid additive scope, on the prompt, in writing
do NOT add new files. do NOT add new exports. do NOT add a wrapper.
edit only the files I name. if you cannot do it without adding a file,
stop and tell me.
The "stop and tell me" line matters. Without an explicit stop condition, the agent routes around the constraint by inventing a "minimal helper" that grows over three turns. With it, plan mode claude returns a plan that says "I cannot complete this without adding a file" — and now you have a real conversation, not a surprise commit.
Habit 3 — Specify the deletion, line by line
after the new code is in place, DELETE the old handler in
src/components/search/SearchBox.tsx lines 42–58. delete the import
on line 4. the file should be 17 lines shorter when you are done.
Deletion is the most-skipped instruction in agentic coding. The agent will leave a // TODO: remove after migration comment that lives forever. Naming the line range and expected line count makes the deletion auditable in the plan diff before you approve.
Habit 4 — Pin the test contract
tests/search.test.ts must still pass unchanged.
do not add new tests. do not skip existing tests.
if a test fails after your edit, that is a bug in your edit, not
the test.
This block has saved me more debugging time than any other habit. Without it, the agent's reflex is to add three new tests and quietly mark a failing existing test as .skip because "it tests deprecated behavior." With it, the test contract is the gate.
Habit 5 — Name the library version
this project uses Next 16 (app router, react 19, server components by default).
do not use getServerSideProps. do not use pages router conventions.
if you are unsure which API version applies, stop and ask.
The model's training data contains every version of every library that ever shipped. Without a version pin, the agent pattern-matches against the most common training-data pattern, which is rarely the version you actually have. This habit fixed about half of my "why is it importing something that does not exist" moments.
Three mid-execution interventions
Plan mode claude reads the plan before execution. These three read what is happening during execution. Use them in the order below.
Stop on first surprise
If the running narration mentions a file you did not name, hit cancel. The cost of letting an unwanted edit land and reverting it is ten times the cost of stopping at the first sign of drift.
Ask for the diff before the commit
Between the edit and the test run, prompt: "show me the full unified diff of every file you changed, no commentary." Read it. The diff is the source of truth.
Force a single-file rollback when something feels off
"Revert your last edit to
src/components/search/SearchBox.tsxusinggit checkout. Do not edit any other file." The explicit single-file scope stops the agent from "fixing" the rollback.
Failure modes you will recognize
Three I see every day. Once you can name them, plan mode claude becomes readable in about thirty seconds per plan.
Premature abstraction
The agent extracts a one-line helper into a new file because helpers feel professional. The helper has one caller. Plan reads: "create
src/lib/format-search-query.ts." That is your cue to say "inline it."Generic fallback
The agent cannot find the exact API you asked for and substitutes something that compiles. The plan looks reasonable; the runtime explodes. The tell is a vague verb: "handle the response." Handle how.
Over-confident hedging
The agent writes "this should work" three times in one plan. Three hedges means the model is not confident, and a non-confident plan is the one most likely to ship the wrong thing. Treat hedging as a stop signal.
The mental model
The agent is the world's fastest junior engineer who has read every codebase ever written and has zero memory of yours. Every prompt is the entire onboarding. Plan mode claude is the pull-request review you do before the PR exists. The five habits turn vague prompts into reviewable contracts; the three interventions turn an opaque execution into a readable one.
References
- Anthropic — Tool use overview. https://docs.anthropic.com/claude/docs/tool-use
- Anthropic —
anthropic-sdk-python(tool use examples + reference). https://github.com/anthropics/anthropic-sdk-python
Vadim Sharapov is the founder of Loomaru — revenue recovery infrastructure for Shopify stores. If your ad platforms can't see 5–15% of your conversions, loomaru.com.
Want to know what your store's gap looks like, and what closing it would do to monthly revenue?