Working with Claude on Real Projects: From Prompt to Pull Request
The AI-in-development hype says you'll write 80% of your code with prompts. The reality for senior devs who use it well is more nuanced: AI adds value when you know exactly which step of the workflow to plug it into, and how to verify what it produces.
This post is the workflow that actually works. Not the one being sold — the one being used.
The most common mistake: prompt → copy → commit
The workflow that doesn't work:
- Write a big prompt describing the entire feature.
- Copy the generated code.
- Commit.
The problem isn't the AI — it's the context size and lack of verification. A prompt describing a complete feature produces code that works in isolation but doesn't fit the rest of the project: different naming, abstractions that don't align, uncovered edge cases, dependencies you'd already solved a different way.
The workflow that works is granular and verified.
Step 1: planning — Claude as a rubber duck with memory
Before writing code, use Claude to think through the problem out loud. Not to get the solution — to get the right questions.
"I need to implement a push notification system for mobile users.
Stack is Spring Boot + PostgreSQL + Firebase. What questions should
I answer before writing a single line of code?"
The answer gives you a decision checklist: sync or queued notifications? What happens if Firebase fails? How do you handle multiple devices per user? Retry policy?
You answer those questions yourself. With that context, the code you generate afterward will be much more coherent.
Step 2: generate by blocks, not by full feature
The rule: one prompt = one unit of code you can read and verify in 5 minutes.
Instead of:
"Implement the complete push notification system"
You do:
"Implement the DeviceToken entity with these fields: userId, token, platform
(ANDROID|IOS), createdAt, lastUsedAt. JPA + PostgreSQL. The project uses
Hibernate and our BaseEntity already has id and timestamps — extend from it."
Each block has a bounded scope, is verified quickly, and integrates with real project context.
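For the entity prompt above, the output might look roughly like the sketch below. It's illustrative, not a real project's code: JPA annotations are shown as comments so the sketch stands alone, and the BaseEntity mentioned in the prompt is assumed rather than shown.

```java
import java.time.Instant;

// Rough sketch of what the entity-block prompt might produce.
// @Entity and @Enumerated(EnumType.STRING) omitted so this compiles
// standalone; the real class would extend the project's BaseEntity.
class DeviceToken /* extends BaseEntity */ {

    enum Platform { ANDROID, IOS }

    private Long userId;
    private String token;          // the Firebase device token
    private Platform platform;     // stored as a string in the real mapping
    private Instant createdAt = Instant.now();
    private Instant lastUsedAt = Instant.now();

    DeviceToken(Long userId, String token, Platform platform) {
        this.userId = userId;
        this.token = token;
        this.platform = platform;
    }

    // Refreshes lastUsedAt when an existing token is re-registered.
    void markUsed() {
        this.lastUsedAt = Instant.now();
    }

    Long getUserId() { return userId; }
    String getToken() { return token; }
    Platform getPlatform() { return platform; }
    Instant getCreatedAt() { return createdAt; }
    Instant getLastUsedAt() { return lastUsedAt; }
}
```

A block this size is exactly the "verifiable in 5 minutes" unit: you can check every field and method against the prompt before moving on.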
Natural order:
- Entities and DTOs
- Repositories and interfaces
- Service logic
- Controller / endpoint
- Tests
Never in reverse — without entities, the service will invent the data structure.
Step 3: give real context, don't describe context
The most frequent prompt mistake is describing code instead of showing it. Describing generates interpretations. Showing generates consistency.
Instead of:
"The project uses repositories that extend JpaRepository"
You paste:
// This is our repository pattern:
public interface UserRepository extends JpaRepository<User, Long> {
Optional<User> findByEmail(String email);
List<User> findByStatusAndCreatedAtAfter(UserStatus status, Instant since);
}
// Implement DeviceTokenRepository following this exact same pattern.
Claude will replicate the exact style: naming, types, method conventions. It won't invent something different.
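The replicated output tends to look like the sketch below. The method names here are illustrative guesses, and the interfaces are minimal stand-ins so the sketch compiles on its own; in the real project they come from Spring Data JPA and the domain model.

```java
import java.util.List;
import java.util.Optional;

// Minimal stand-ins so this sketch compiles standalone; the real ones
// come from Spring Data JPA and the project's domain model.
interface JpaRepository<T, ID> {}
class DeviceToken {}
enum Platform { ANDROID, IOS }

// What pasting the UserRepository pattern tends to yield: the same
// derived-query naming and the same Optional-for-one / List-for-many
// convention. Exact method names are illustrative, not prescribed.
interface DeviceTokenRepository extends JpaRepository<DeviceToken, Long> {
    Optional<DeviceToken> findByToken(String token);
    List<DeviceToken> findByUserIdAndPlatform(Long userId, Platform platform);
}
```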
Step 4: iterative review — the diff is your tool
Each generated block goes through the same process before being integrated:
- Read the full diff. Not "scan it" — read it. Actively looking for errors, not validating that it "looks fine."
- Verify the contract. Are method names, return types, and parameters consistent with what the rest of the code will call?
- Look for the unrequested. Did the LLM add something you didn't ask for? Every unsolicited addition is a risk.
- Compile and run existing tests. If something broke, understand why before patching it.
If there's anything in the diff you don't fully understand, don't merge it yet. Ask Claude to explain that specific fragment.
Step 5: tests — describe first, generate after
AI-generated tests without guidance are shallow: happy path only, mocks that don't verify what matters, coverage numbers without real coverage.
The workflow that works:
First, you define the cases to cover:
"For DeviceTokenService.register(), I need tests for:
1. Successful registration of a new device
2. Update of lastUsedAt if the token already exists
3. Rejection if userId doesn't exist in the database
4. Correct handling when Firebase returns a 401 error"
Then you ask for the implementation of those specific tests. The AI implements — you defined what matters.
Result: tests with real semantics, not empty coverage.
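As a sketch of what case 2 looks like when implemented, here is a standalone version using a hand-rolled in-memory fake instead of Mockito, so it runs on its own. The service and repository are simplified stand-ins for illustration, not the real project classes.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Sketch of test case 2 above: "update lastUsedAt if the token already
// exists". Everything here is a simplified stand-in for illustration.
class DeviceTokenServiceTestSketch {

    // Simplified token record.
    static class DeviceToken {
        final Long userId;
        final String token;
        Instant lastUsedAt;
        DeviceToken(Long userId, String token, Instant lastUsedAt) {
            this.userId = userId; this.token = token; this.lastUsedAt = lastUsedAt;
        }
    }

    // In-memory fake playing the repository role.
    static class FakeRepo {
        final Map<String, DeviceToken> byToken = new HashMap<>();
        DeviceToken findByToken(String token) { return byToken.get(token); }
        void save(DeviceToken t) { byToken.put(t.token, t); }
    }

    // Minimal register() with only the behavior under test: an existing
    // token gets its lastUsedAt refreshed instead of a duplicate row.
    static class DeviceTokenService {
        final FakeRepo repo;
        DeviceTokenService(FakeRepo repo) { this.repo = repo; }
        DeviceToken register(Long userId, String token) {
            DeviceToken existing = repo.findByToken(token);
            if (existing != null) {
                existing.lastUsedAt = Instant.now();
                return existing;
            }
            DeviceToken fresh = new DeviceToken(userId, token, Instant.now());
            repo.save(fresh);
            return fresh;
        }
    }

    // Test case 2: re-registering an existing token updates lastUsedAt
    // and does not create a duplicate entry.
    static boolean reRegisterUpdatesLastUsedAt() {
        FakeRepo repo = new FakeRepo();
        Instant old = Instant.now().minusSeconds(3600);
        repo.save(new DeviceToken(1L, "tok-1", old));

        DeviceToken result = new DeviceTokenService(repo).register(1L, "tok-1");

        return result.lastUsedAt.isAfter(old)   // timestamp refreshed
            && repo.byToken.size() == 1;        // no duplicate row
    }
}
```

Note what the test asserts: the semantics you defined (timestamp refreshed, no duplicate), not just "the method ran without throwing."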
Step 6: documentation — the one place where 100% AI is fine
Docstrings, class comments, module READMEs, OpenAPI endpoint descriptions: this is where AI adds value without risk. The code is already verified — the documentation describes it.
"Generate the Javadoc for these methods. The audience is another team
developer who doesn't know this module. Include edge cases in @throws."
You don't review the code — you verify that the documentation accurately describes what the code does. That's verifiable in seconds.
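A hedged example of the kind of Javadoc that prompt might produce for a register() method. The method body is a trivial stand-in, and the documented throws clause matches only what the stand-in actually checks:

```java
// Illustrative sketch only: the body is a trivial stand-in, and the
// Javadoc documents the stand-in's actual behavior.
class DeviceTokenDocSketch {

    /**
     * Registers a device token for push notifications, or refreshes an
     * existing one.
     *
     * <p>If the token is already registered, its {@code lastUsedAt}
     * timestamp is updated instead of creating a duplicate entry. One
     * user may have several active tokens, one per device.
     *
     * @param userId id of the owning user; must not be null
     * @param token  the Firebase device token; must not be blank
     * @return the stored token string, never null
     * @throws IllegalArgumentException if {@code userId} is null or
     *                                  {@code token} is null or blank
     */
    static String register(Long userId, String token) {
        if (userId == null || token == null || token.isBlank()) {
            throw new IllegalArgumentException("userId and token required");
        }
        return token; // stand-in body; the real logic lives in the service
    }
}
```

Checking this is exactly the fast verification the step describes: does each @param, @return, and @throws line match what the method actually does?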
The final PR
A well-assembled PR with this workflow has:
- Atomic commits per block (entity, repository, service, controller, tests).
- Each commit passed tests before moving to the next.
- The PR description explains design decisions, not just what the code does.
- No mixed changes — the PR does one thing.
The difference from a "copy everything" generated PR: the reviewer can read commit by commit, understand the reasoning, and ask specific questions. They don't have to digest 800 lines of opaque context.
What AI doesn't replace
- Understanding the domain: AI generates code for the domain you describe. If the description is vague, the code is vague.
- Design decisions: when to use events vs synchronous calls, how to model the domain, what goes in which layer. That requires understanding the complete system.
- Reading the diff with judgment: if you don't know what to look for in the diff, you can't detect the error.
AI is a speed tool for the dev who already knows what they're building. It's not a replacement for knowing what to build.
Conclusion
The workflow that works isn't "big prompt → copy → commit." It's granular, verified, and contextualized: planning first, generation by blocks, real context in prompts, review of every diff, tests with semantics defined by the dev.
With that workflow, AI genuinely accelerates. Without it, it introduces debt you pay double later.