Agent Builders Are Changing How I Ship Code — Here's My Actual Workflow
Six months ago I was maintaining a 7-package monorepo alone — Express.js API, Next.js frontend, SLED admin panel, 3 SmartSync microservices, CI/CD across 24+ GitHub Actions workflows. The kind of system that normally needs a team.
I'm still the sole engineer. But the way I work has changed completely.
The Shift: From AI Assistant to AI Agent Team
Using an LLM as an autocomplete tool is table stakes at this point. What's different now is treating AI as a team of specialists — each agent scoped to a domain, pre-loaded with context about your architecture, and capable of completing multi-step tasks autonomously.
Claude Code makes this concrete. Instead of a generic chat interface, you define agents and skills as markdown files that Claude loads when invoked. The agent knows your codebase conventions, your file structure, your patterns — before you type a single word.
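For readers who haven't seen one, an agent definition is just a markdown file with YAML frontmatter. A minimal sketch — the file location and frontmatter fields follow Claude Code's subagent convention, but the contents here are illustrative, not my actual agent:

```markdown
<!-- .claude/agents/dut_devops-orchestrator.md (illustrative) -->
---
name: dut_devops-orchestrator
description: Use for GitHub Actions, Komodo, and Railway deployment changes.
---

You are the DevOps specialist for this monorepo.
- Workflows live in .github/workflows/.
- Deployments run through Komodo and Railway.
- Always explain the rollout impact of any workflow change before writing YAML.
```

The body becomes the agent's system prompt, which is where the "pre-loaded context" lives.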
What I Actually Built
For GovChime, I have a set of custom agents covering the domains I work in most:
- dut_ai-generation-expert — knows the full LLM generation pipeline: BullMQ queue, SSE events, Zod schemas, all 7 generation stages. When I'm debugging a generation issue, this agent has full context without me re-explaining the architecture.
- dut_devops-orchestrator — understands our GitHub Actions pipeline, Komodo deployment setup, and Railway services. I describe a deployment change; it writes the workflow YAML and explains the rollout impact.
- dut_design-review — runs automated design reviews using Playwright, checking responsive behaviour, contrast ratios, and component consistency against our style guide.
- Custom skills like /create-project and /create-blog-post — slash commands that follow a defined multi-step process: gather context, ask clarifying questions, generate output, verify the build passes.
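A custom slash command is also just a markdown file. A sketch of what /create-blog-post might look like — the path and the $ARGUMENTS placeholder follow Claude Code's slash-command convention; the steps are illustrative:

```markdown
<!-- .claude/commands/create-blog-post.md (illustrative) -->
Create a new blog post on the topic in $ARGUMENTS.

1. Gather context: read existing posts for tone and frontmatter shape.
2. Ask clarifying questions before writing anything.
3. Generate the post following the site's conventions.
4. Run the build and confirm it passes before reporting done.
```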
The TDD Loop That Makes It Work
Raw AI-generated code is only as good as the validation layer you put around it. My workflow is deliberately structured:
1. Write the test spec first — before asking Claude to implement anything, I write what the correct behaviour looks like. This is non-negotiable.
2. Implement with the agent — the agent writes code against the spec, referencing the existing codebase patterns via MCP integrations (filesystem, database, GitHub).
3. Run tests, iterate — if tests fail, feed the output back into the agent. Most issues resolve in 1-2 iterations.
4. Review the diff — I read every diff before it merges. The agent isn't autonomous in production; I'm still the engineer making the final call.
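In miniature, the first three steps look like this. The function and cases are hypothetical, not from the GovChime codebase — the point is the ordering: the spec exists before the implementation does.

```typescript
// Step 1: the spec, written first. formatAwardValue is an illustrative helper.
const cases: Array<[number, string]> = [
  [0, "$0"],
  [1_500, "$1.5K"],
  [70_000_000, "$70.0M"],
];

// Step 2: the implementation the agent writes against the spec.
function formatAwardValue(amount: number): string {
  if (amount >= 1_000_000) return `$${(amount / 1_000_000).toFixed(1)}M`;
  if (amount >= 1_000) return `$${(amount / 1_000).toFixed(1)}K`;
  return `$${amount}`;
}

// Step 3: run the spec; any failure output goes back to the agent verbatim.
for (const [input, expected] of cases) {
  const actual = formatAwardValue(input);
  if (actual !== expected) {
    throw new Error(`formatAwardValue(${input}): expected ${expected}, got ${actual}`);
  }
}
```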
This loop consistently produces code I'm comfortable shipping. The agent handles the boilerplate and pattern matching; I handle the architectural judgment.
MCP: The Part Most Developers Skip
Model Context Protocol lets Claude agents connect to live systems — not just static files. At GovChime, I use MCP integrations so agents can query the actual database schema, read live GitHub Actions status, and inspect the running Railway deployment.
The difference between an agent that reads a schema file and one that can run `SELECT * FROM information_schema.tables` is significant. The live connection means the agent's suggestions are grounded in current reality, not a cached snapshot from when you last updated your docs.
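Wiring this up is a config file, not code. A sketch of a project-level MCP server entry — the shape follows Claude Code's .mcp.json convention, but the server package and connection string here are placeholders, not my actual setup:

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://localhost:5432/app_db"
      ]
    }
  }
}
```

Once the server is registered, every agent in the project can issue read-only schema queries through it.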
What This Looks Like in Practice
A recent example: I needed to add a new materialized view to ClickHouse for a dashboard feature, update the TypeScript types, add the API endpoint, and wire it to the frontend. That's work that would have taken half a day of context switching.
With the devops + backend agents and TDD loop: I wrote the test for the expected API response, described the feature to the agent, reviewed 3 iterations of generated code, and shipped in under 2 hours — including tests passing in CI.
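The "test for the expected API response" from that example amounts to pinning down the response shape before any endpoint exists. A sketch — the endpoint's row shape and every field name here are illustrative, and the hand-rolled validator stands in for a real schema library:

```typescript
// Hypothetical row shape for the new dashboard endpoint.
interface DashboardRow {
  agency: string;
  totalAwards: number;
  totalValue: number;
}

// Minimal runtime check standing in for a schema validator.
function isDashboardRow(value: unknown): value is DashboardRow {
  if (typeof value !== "object" || value === null) return false;
  const row = value as Record<string, unknown>;
  return (
    typeof row.agency === "string" &&
    typeof row.totalAwards === "number" &&
    typeof row.totalValue === "number"
  );
}

// The spec: whatever the endpoint returns must be an array of valid rows.
function assertDashboardResponse(body: unknown): DashboardRow[] {
  if (!Array.isArray(body) || !body.every(isDashboardRow)) {
    throw new Error("dashboard response does not match the agreed shape");
  }
  return body;
}
```

Each of the 3 iterations was the agent's implementation being re-run against this fixed target.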
Key Takeaways
- Agents work because of context, not capability. A generic AI assistant is useful. An agent pre-loaded with your architecture, conventions, and domain knowledge is a force multiplier.
- TDD is what makes AI-generated code trustworthy. Without tests you wrote first, you can't verify the agent understood the requirement correctly.
- MCP integrations close the feedback loop. Agents connected to live systems give better suggestions than ones reasoning from static documentation.
- You're still the engineer. The goal isn't to remove judgment from the process — it's to eliminate the parts that don't require judgment, so you can apply it where it matters.