Asking vs Delegating AI Agents 🧐

dev.to

Most developers use AI like a smarter Stack Overflow.

Type a question. Get an answer. Go do the work yourself.

That's fine, but it's the slow way 😩 There's a faster mode, and most people haven't switched to it yet.

The difference: asking vs. delegating

When you ask an AI:

"How do I write tests for my auth module?"

You get a nice explanation. Then you write the tests yourself. You're still doing the work 🥸

When you delegate to an AI agent:

"Write tests for /src/auth.py. Cover login, logout, and invalid token cases. Run them. If any fail, fix the code until they pass. Tell me what you changed."

The agent opens your files, writes the tests, runs them, reads the failures, fixes the code, and comes back to you with a working test suite.

You review the result. You didn't do the work.

That's the shift 🙂‍↔️ It sounds small. The time difference is huge.

How to write a good delegation

Every delegation that works has four parts. Think of it like giving a task to a new team member:

  • Goal: what should it produce?
  • Scope: which files or area of the codebase?
  • Success condition: how do we know it's done correctly?
  • Report back: tell me what you changed and why.
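To make the four parts concrete, here's a minimal sketch that stores them in a small Python structure and renders them as a prompt. The `Delegation` class and its field names are made up for illustration; they aren't part of any agent tool, and the example values borrow from the auth test delegation above.

```python
from dataclasses import dataclass


@dataclass
class Delegation:
    """The four parts of a delegation brief (hypothetical helper, names are illustrative)."""
    goal: str               # what should it produce?
    scope: str              # which files or area of the codebase?
    success_condition: str  # how do we know it's done correctly?
    report_back: str        # what should the agent tell you afterwards?

    def to_prompt(self) -> str:
        """Render the brief as plain text, one labelled line per part."""
        return "\n".join([
            f"Goal: {self.goal}",
            f"Scope: {self.scope}",
            f"Success condition: {self.success_condition}",
            f"Report back: {self.report_back}",
        ])


brief = Delegation(
    goal="Write tests covering login, logout, and invalid token cases.",
    scope="/src/auth.py",
    success_condition="All tests run and pass.",
    report_back="Tell me what you changed and why.",
)
print(brief.to_prompt())
```

If one of the four fields is hard to fill in, that's usually a sign the task isn't ready to hand over yet.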

Here's what that looks like in practice:

Debugging:

"Here's the error and the stack trace. Find the root cause, fix it, and explain what was broken."

Why this works: You're not asking what the error means. You're handing over the whole problem: find it, fix it, explain it 😎

Refactoring:

"Refactor this file. Max two levels of nesting. No single function longer than 30 lines. Update every call site in the codebase."

Why this works: The constraints are clear and checkable. The agent knows exactly when it's done 🧐
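"Checkable" can be taken literally. Here's a rough sketch of a checker for the two limits in that prompt, built on Python's standard `ast` module. Which statements count as a nesting level is an assumption on my part (`if`, `for`, `while`, `with`, `try`), and `check_source` is a hypothetical helper, not an existing linter.

```python
import ast

# Statement types that open a new nesting level (an assumption; adjust to taste).
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest nesting level reached inside `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        # An `elif` is stored as a lone `if` inside `orelse`; don't count it as deeper.
        is_elif = (
            isinstance(node, ast.If)
            and isinstance(child, ast.If)
            and node.orelse == [child]
        )
        opens_level = isinstance(child, NESTING_NODES) and not is_elif
        child_depth = depth + 1 if opens_level else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def check_source(source: str, max_depth: int = 2, max_lines: int = 30) -> list[str]:
    """List every function in `source` that breaks the length or nesting limit."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            depth = max_nesting(node)
            if length > max_lines:
                problems.append(f"{node.name}: {length} lines (max {max_lines})")
            if depth > max_depth:
                problems.append(f"{node.name}: nesting depth {depth} (max {max_depth})")
    return problems


SAMPLE = '''
def too_deep(items):
    for item in items:
        if item:
            for part in item:
                print(part)
'''

print(check_source(SAMPLE))  # ['too_deep: nesting depth 3 (max 2)']
```

A check like this is something you can run on the agent's output, or hand to the agent as part of the success condition.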

Database migration:

"Write a migration script for this schema change. Make it idempotent. Run it against a local test database and confirm it succeeds."

Why this works: You gave it a way to verify its own work before coming back to you 🤔
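If "idempotent" feels abstract, here's a small sketch of what it means in practice, using Python's built-in `sqlite3` and an in-memory database as the local test database. The `users` table and the `last_login` column are invented for the example. SQLite has no `ADD COLUMN IF NOT EXISTS`, so the script checks the existing columns first.

```python
import sqlite3


def column_names(conn: sqlite3.Connection, table: str) -> set[str]:
    """Return the column names of `table`."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def migrate(conn: sqlite3.Connection) -> None:
    """Add the (hypothetical) last_login column, but only if it's missing."""
    if "last_login" not in column_names(conn, "users"):
        conn.execute("ALTER TABLE users ADD COLUMN last_login TEXT")
    conn.commit()


# Throwaway in-memory database standing in for the local test database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

migrate(conn)
migrate(conn)  # running it a second time changes nothing and raises no error

print(sorted(column_names(conn, "users")))  # ['email', 'id', 'last_login']
```

Running the migration twice is the cheapest proof that it's safe to re-run.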

PR review:

"Read this PR diff. Find anything that could fail in production. Write the tests I missed."

Why this works: Two goals, both specific. The agent can do both end to end 🥵

Data pipeline:

"This pipeline has no tests. Write a test suite that checks: schema consistency at each step, no data leakage between train and validation sets, and correct handling of null values."

Why this works: A boring job that always gets skipped. Now it's a 5-minute task 🥱
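For a feel of what those three checks could look like, here's a minimal sketch in plain Python. It assumes each pipeline step produces a list of dicts and that every row has an `id` key; both are assumptions for the example, and a real pipeline would more likely use a dataframe library.

```python
def check_schema(rows, expected_columns):
    """Return the indexes of rows whose keys differ from the expected columns."""
    expected = set(expected_columns)
    return [i for i, row in enumerate(rows) if set(row) != expected]


def find_leakage(train_rows, validation_rows, key="id"):
    """Return keys that appear in both the train and validation sets."""
    train_keys = {row[key] for row in train_rows}
    validation_keys = {row[key] for row in validation_rows}
    return sorted(train_keys & validation_keys)


def count_nulls(rows):
    """Count None values per column so null handling can be asserted on."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                counts[column] = counts.get(column, 0) + 1
    return counts


# Made-up sample data: id 2 sits in both sets, and one age is missing.
train = [{"id": 1, "age": 34}, {"id": 2, "age": None}]
validation = [{"id": 2, "age": 51}, {"id": 3, "age": 29}]

print(check_schema(train, ["id", "age"]))  # []
print(find_leakage(train, validation))     # [2]
print(count_nulls(train))                  # {'age': 1}
```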

Before you open your IDE:

"Here's the Jira ticket. Find the relevant files in the repo. Give me a summary of what needs to change before I write a single line."

Why this works: You start coding with context instead of spending 20 minutes doing archaeology 🥱

How to check what the agent gives you

Agents are fast. They're also wrong sometimes, and not randomly: they fail in predictable ways. Here's how to catch those failures without spending 40 minutes re-reviewing everything.

Check 1: Did it actually solve the problem?

Run the code. Don't just read it.

Agents can write code that looks completely correct and fails in a specific edge case. The only way to find out is to execute it. If there are tests, run them. If there aren't, run the thing manually. Or let the agent write the test code.

Reading code feels like reviewing. Actually running it is reviewing.
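Here's a tiny made-up example of code that reads fine and still breaks. Both functions are invented for illustration; the point is only that the empty-list case shows up when you run it, not when you read it.

```python
def average(values):
    """Looks right when you read it."""
    return sum(values) / len(values)


def fails_on(func, argument):
    """Call func(argument); return the exception name, or None if it ran cleanly."""
    try:
        func(argument)
    except Exception as error:
        return type(error).__name__
    return None


print(fails_on(average, [2, 4, 6]))  # None
print(fails_on(average, []))         # ZeroDivisionError
```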

Check 2: Does it fit your codebase?

The agent doesn't know your team's conventions. It doesn't know why that module is structured that way, or that you decided last quarter never to use that pattern again. Technically correct code can still be wrong for your situation.

Scan the output for anything that feels off: an unusual pattern, a library you don't use, an approach that doesn't match how the rest of the codebase works. That's the thing only you can catch.

Check 3: Did it change anything you didn't ask about?

Check which files it touched. Read the diff like you'd read a PR from a junior developer.

Agents sometimes make "helpful" changes: cleaning up something nearby, or refactoring a function that wasn't in scope. Sometimes that's great. Sometimes it breaks something else. Know what it touched before you accept it.
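The scope check is easy to script. A minimal sketch, assuming you already have the list of changed files (for example from `git diff --name-only`) and a list of paths you actually asked about. The `out_of_scope` helper and the file names are hypothetical.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files, allowed_paths):
    """Return the changed files that are not inside any allowed file or directory."""
    allowed = [PurePosixPath(p) for p in allowed_paths]
    flagged = []
    for name in changed_files:
        path = PurePosixPath(name)
        # In scope if it is an allowed file itself or lives under an allowed directory.
        in_scope = any(path == a or a in path.parents for a in allowed)
        if not in_scope:
            flagged.append(name)
    return flagged


# Made-up list of files the agent touched.
changed = ["src/auth.py", "tests/test_auth.py", "src/billing/invoice.py"]

print(out_of_scope(changed, ["src/auth.py", "tests"]))  # ['src/billing/invoice.py']
```

Anything it flags isn't automatically wrong. It's just the part of the diff to read first.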

The evaluation matrix

| What to check | The question to ask | Red flag |
| --- | --- | --- |
| Goal | Did it actually solve the problem? | Tests pass but logic is wrong |
| Fit | Would your team merge this as-is? | Right code, wrong conventions |
| Scope | Did it only touch what you asked? | Changed files you didn't mention |

That review takes 5–10 minutes on most tasks. Not 40.

The mindset shift in one sentence

Your job changes from doing the work to defining the goal and reviewing the result.

You bring judgment and context. The agent brings speed and tirelessness.

What do you think? 🤔 Drop it in the comments

