Most engineers write better specs for a junior dev than they do for the most capable code generator they've ever used. They'll spend twenty minutes crafting a Jira ticket with acceptance criteria, edge cases, and links to prior art - then turn around and type "refactor this function" into an LLM and wonder why the output is mid.

Prompting isn't a new skill. It's an old one you're already bad at.

The patterns that produce good results from language models are the same patterns that produce good results from any intelligent collaborator working without full context: clear intent, explicit scope, defined constraints, and a healthy distrust of confident-sounding answers. If you've ever written a design doc or briefed a contractor, you already know this discipline. You're just not applying it.

Lead With Why

"Refactor the token validation." Four words, zero context. Should the model optimize for readability or performance? Should it preserve the existing interface, improve it, handle edge cases the current code ignores, or all of the above?

Without why, every judgment call becomes a coin flip.

Start with the problem. "Our token validation is scattered across three middleware functions, which makes it impossible to update the expiry logic without touching all three. Consolidate it." Now the model has intent. It can make tradeoffs you didn't explicitly list, and it'll make them in the right direction.

Context is leverage. A sentence of motivation buys you paragraphs of better output.

Explore Before You Build

Your first prompt should be "understand X," not "build X."

A model that hasn't read the existing code will reinvent what's already there. It'll introduce a helper function that duplicates one you wrote six months ago; it'll pick a pattern that conflicts with your team's conventions; it'll solve the problem correctly in a codebase that doesn't exist.

Point the model at what exists first. "Read through the auth module and summarize the current token flow before proposing changes." Yes, this takes longer on the first prompt. The alternative is three rounds of correction because the model built something greenfield inside a brownfield project.

Pay the upfront cost. Context compounds; assumptions decay.

Scope Cuts Both Ways

Engineers are decent at saying what they want. They're terrible at saying what they don't want.

"Refactor the token validation" leaves the entire codebase as fair game. The model might restructure your session handling because it thought it was helping. It might rename exports that other modules depend on. Scope with only one edge defined is an invitation for well-intentioned damage.

Define both edges. "Refactor the token validation in auth/tokens.ts. Don't modify the session middleware or change any exported interfaces." Now the model has a fence.

This extends to anti-patterns. If there's an obvious-but-wrong approach the model might reach for, name it and close it in the prompt. "Do not introduce a global token cache - we tried that, it caused stale reads under concurrent refresh." You know the landmines in your codebase. The model doesn't. Spell them out or watch it step on one.

State What You Know

"Explain how connection pooling works in Postgres" gets you a primer that starts with what a connection is. "I run Postgres in production with PgBouncer in transaction mode - how should I handle prepared statements?" gets you the answer you actually need.

That's the same question at two different expertise levels, and the response surface changes completely.

Every unstated assumption is a coin flip on whether the model lands on the same one you did. Your experience level; your deployment context; your team's conventions - these aren't incidental details. They're constraints that shape what a useful answer looks like.

"I'm a senior engineer, skip the basics" isn't arrogance. It's a format spec that eliminates filler between you and the nuance you came for.

Specify the Shape

You know what you want the output to look like. The model doesn't, and it will guess.

Language, types, style, length - one sentence of format specification saves three rounds of follow-up. "Return TypeScript, use the existing AuthToken interface, keep the function pure." That's eleven words that prevent a response in JavaScript with inline side effects.

For non-negotiables, borrow from RFC conventions. MUST and MUST NOT aren't just emphatic - they're semantically precise. "The function MUST return a Result type. It MUST NOT throw exceptions." The model treats these differently than suggestions. So should you.

Ever read an IETF RFC? There's a reason that format has survived decades of ambiguity-hostile engineering. Steal from it.

Trust Nothing Clean

Last week I asked a model to design a retry mechanism for a webhook delivery system. The response was clean: exponential backoff, max attempts, dead letter queue. Looked production-ready.

It assumed idempotent receivers. Nowhere in my prompt or its response did anyone mention what happens when the receiver processes the same webhook twice with different side effects. The clean answer buried the hard problem.
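For concreteness, here's roughly what that clean answer looked like, as a hedged sketch with hypothetical names - and with the buried assumption promoted to a comment, which is exactly what the response never did.

```typescript
// Sketch of the "clean" retry answer. Names are illustrative.
async function deliverWithRetry(
  send: () => Promise<boolean>,          // resolves true on a 2xx from the receiver
  deadLetter: (attempts: number) => void, // invoked when all attempts are exhausted
  maxAttempts = 5,
  baseDelayMs = 100,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // BURIED ASSUMPTION: the receiver is idempotent. If a delivery
    // times out after the receiver already processed it, this retry
    // re-runs the receiver's side effects.
    if (await send()) return true;
    if (attempt < maxAttempts - 1) {
      const delayMs = baseDelayMs * 2 ** attempt; // exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  deadLetter(maxAttempts);
  return false;
}
```

Everything visible in the signature is handled; the failure mode lives entirely in the comment the model didn't write.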

LLMs optimize for plausible. The gaps live in what wasn't said: the race condition under concurrent writes; the edge case when the input is empty; the implicit assumption that the caller already validated.

When an answer feels too neat, ask what's missing. "What assumptions did you make?" is the single highest-value follow-up in your toolkit. It forces the model to surface the tradeoffs it buried under a clean interface.

You wouldn't ship a PR backed by zero tests. Don't ship an LLM answer you haven't stress-tested.

Iterate, Then Audit

Don't restart the conversation. The model holds context - use it.

When the output is wrong, say what's wrong and what to change. "The error handling is swallowing the original stack trace. Preserve it and re-throw with additional context." This is cheaper and more precise than re-prompting from scratch, because the model already has your intent, your constraints, and your codebase context loaded.

Once you've iterated to something that looks right, run one more pass. Ask the model to review its own output against your original constraints. "Check this implementation against the requirements I listed in my first message. Flag anything that's missing or inconsistent." Self-review isn't perfect, but it's cheap - and cheap verification catches expensive mistakes before they hit production.

Anchor to Your Codebase

A model without project context writes code for a project that doesn't exist.

Point it at your conventions before asking it to produce anything. Your CONTRIBUTING.md; your architecture docs; your existing module structure - these are the constraints that make generated code fit instead of float. "Read the project README and follow the patterns in the existing handlers" is the difference between code you can merge and code you have to rewrite.

This matters most after long sessions. Models lose the thread. When context compresses or resets, re-anchor. "Re-read the project conventions before continuing" costs five seconds and prevents drift that costs five rounds of correction.


If you take one thing from this: treat prompting with the same rigor you'd treat a spec for a contractor who's talented, fast, and has never seen your codebase.

  1. State the problem before the task.
  2. Show the model what exists before asking it to build.
  3. Define scope in both directions - what to touch and what not to.
  4. Name the anti-patterns you've already tried.
  5. Specify output shape like you're writing a contract.
  6. Distrust clean answers to messy problems.
  7. Iterate in context, then audit against your own constraints.

The model is exactly as good as the brief you give it. Write the brief you'd be embarrassed to hand a colleague, and you'll get the output you deserve. 🫡