Why I Moved AI Out of NestJS and Into a Dedicated Python LangGraph Service
When your AI pipeline lives inside your main backend, everything feels fine — until it doesn't. I hit that wall on MealPlan AI, an AI-powered meal planning platform where NestJS handled both business logic and LLM orchestration.
The AI SDK chain was a black box. No observability into token costs. No crash recovery mid-generation. No way to validate LLM outputs before persisting them. When a 7-day meal plan generation failed on day 5, the entire thing restarted from scratch.
Something had to change.
The Architecture Split
I separated the system into two services with clear responsibilities:
- NestJS — business logic, auth, payments, job orchestration via BullMQ
- Python FastAPI — all LLM orchestration, validation, and AI-specific tooling
Why Python? LangGraph, LangChain, and the broader AI ecosystem are Python-first. Fighting that with TypeScript wrappers added complexity without adding value.
The 5-Node LangGraph StateGraph
The core of the Python service is a LangGraph StateGraph with five nodes:
- prepare_context — loads dietary restrictions, participant profiles, calorie targets, and retrieves relevant recipes via RAG (2.2M recipes from RecipeNLG, 80K foods from USDA)
- generate_day — calls the LLM with structured prompts including diversity history and calorie distribution targets
- validate_day — two-layer validation: programmatic restriction checking first, optional LLM self-validation second
- emit_day — streams a DAY_COMPLETED SSE event so NestJS can persist immediately via JSONB atomic append
- update_history — tracks dishes, ingredients, and cuisines to enforce diversity across the full meal plan
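The flow through the five nodes can be sketched with plain functions over a shared state dict. This is a framework-free illustration, not the actual service code: the field names (context, draft, days, history) and the sample data are hypothetical, and the real implementation wires these as LangGraph nodes.

```python
# Minimal sketch of the five-node flow as plain functions mutating a shared
# state dict. Field names and sample data are illustrative only; the real
# service registers these as LangGraph StateGraph nodes.

def prepare_context(state):
    # In production: load restrictions, profiles, calorie targets, RAG recipes.
    state["context"] = {"restrictions": ["vegetarian"], "calorie_target": 2000}
    return state

def generate_day(state, day):
    # In production: an LLM call with diversity history in the prompt.
    state["draft"] = {"day": day, "meals": ["lentil curry", "salad", "soup"]}
    return state

def validate_day(state):
    # Layer 1: programmatic restriction check (layer 2 is LLM self-validation).
    banned = {"chicken", "beef"}
    state["valid"] = not any(b in m for m in state["draft"]["meals"] for b in banned)
    return state

def emit_day(state):
    # In production: stream a DAY_COMPLETED SSE event so NestJS can persist.
    state.setdefault("days", []).append(state["draft"])
    return state

def update_history(state):
    # Track dishes to enforce diversity across the full plan.
    state.setdefault("history", set()).update(state["draft"]["meals"])
    return state

def run_plan(num_days):
    state = {}
    prepare_context(state)
    for day in range(1, num_days + 1):
        generate_day(state, day)
        validate_day(state)
        if state["valid"]:
            emit_day(state)
            update_history(state)
    return state

plan = run_plan(7)
```

In the real graph, a failed validation routes back to generate_day with the violations injected into the prompt rather than silently skipping the day.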
AsyncPostgresSaver checkpointing lets us resume from exactly where we stopped — no wasted LLM calls.
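The resume behavior can be illustrated with a toy in-memory checkpointer. The real service uses LangGraph's AsyncPostgresSaver keyed by thread; the dict-based saver and the crash simulation below exist only to demonstrate the pattern of skipping already-generated days.

```python
# Toy checkpointer illustrating resume-from-last-completed-day.
# Production uses AsyncPostgresSaver; this in-memory dict only shows the
# pattern: state is saved after each day, so a restart skips finished work.

class MemoryCheckpointer:
    def __init__(self):
        self.store = {}

    def save(self, thread_id, state):
        self.store[thread_id] = dict(state)

    def load(self, thread_id):
        return dict(self.store.get(thread_id, {"days": []}))

def generate_day(day):
    if day == 5:
        raise RuntimeError("LLM provider timeout")  # simulated mid-plan crash
    return {"day": day, "meals": ["breakfast", "lunch", "dinner"]}

def run_plan(saver, thread_id, num_days, crash=True):
    state = saver.load(thread_id)
    start = len(state["days"]) + 1  # resume exactly where we stopped
    llm_calls = 0
    for day in range(start, num_days + 1):
        llm_calls += 1
        try:
            payload = generate_day(day) if crash else {"day": day, "meals": []}
        except RuntimeError:
            saver.save(thread_id, state)  # checkpoint survives the crash
            return state, llm_calls
        state["days"].append(payload)
        saver.save(thread_id, state)
    return state, llm_calls

saver = MemoryCheckpointer()
state, first_calls = run_plan(saver, "plan-1", 7)                # fails on day 5
state, resume_calls = run_plan(saver, "plan-1", 7, crash=False)  # resumes at day 5
```

The resumed run issues only three generation calls (days 5 through 7) instead of seven, which is precisely the "no wasted LLM calls" property.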
Type Safety Across Languages
A dual-language service creates a type drift risk. I solved this with a one-directional pipeline: Zod schemas (TypeScript) export to JSON Schema, which generates Pydantic models (Python). A CI workflow runs on every PR to catch drift before it reaches production.
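The drift check at the end of that pipeline can be as simple as a structural comparison of the two JSON Schema exports. This is a hypothetical sketch of such a CI step; the variable names and inline schemas are illustrative.

```python
# Hypothetical CI drift check: compare the JSON Schema exported from the
# Zod side against the snapshot the Pydantic models were generated from.
# A mismatch fails the PR before type drift reaches production.
import json

def schemas_match(zod_export: str, pydantic_snapshot: str) -> bool:
    # json.loads + == gives a structural comparison that ignores key order
    # and whitespace differences between the two exports.
    return json.loads(zod_export) == json.loads(pydantic_snapshot)

zod_side = '{"type": "object", "properties": {"calories": {"type": "number"}}}'
py_side  = '{"type": "object", "properties": {"calories": {"type": "integer"}}}'

drifted = not schemas_match(zod_side, py_side)  # number vs integer is drift
```

Because the pipeline is one-directional, the Zod export is always the source of truth; the Python side only ever regenerates, never edits, its models.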
What This Unlocked
Splitting AI into its own service wasn't just a refactor — it enabled features that would have been painful to build in the monolithic setup:
- Langfuse observability — every LLM call traced with token counts, costs, and latency. Self-hosted, full control over data.
- Incremental persistence — each day saves immediately. A PARTIALLY_COMPLETED status lets users see progress and resume interrupted plans.
- Granular regeneration — separate endpoints for regenerating a single day (full graph) or a single meal (direct LLM call), with user feedback injected into prompts.
- RAG retrieval — full-text search against RecipeNLG and USDA datasets with pgvector fallback for semantic search.
- 329 Python tests — pytest-asyncio covering every node, validator, and edge case independently from the NestJS test suite.
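Testing nodes in isolation is what makes that suite cheap to run: the programmatic validation layer needs no LLM at all. The sketch below uses plain asyncio and assertions rather than pytest-asyncio, and check_restrictions is a hypothetical stand-in for the real validator inside validate_day.

```python
# Sketch of a node-level test: the programmatic restriction check can be
# exercised without any LLM call. check_restrictions is a hypothetical
# stand-in for the validator inside validate_day.
import asyncio

async def check_restrictions(day, restrictions):
    banned = {"vegetarian": {"chicken", "beef", "pork", "fish"}}
    forbidden = set().union(*(banned.get(r, set()) for r in restrictions))
    violations = [m for m in day["meals"]
                  if any(b in m.lower() for b in forbidden)]
    return {"valid": not violations, "violations": violations}

async def main():
    day = {"meals": ["Tofu stir-fry", "Chicken soup", "Bean chili"]}
    return await check_restrictions(day, ["vegetarian"])

result = asyncio.run(main())
```

With pytest-asyncio the same body becomes an `async def test_` function and the asyncio.run scaffolding disappears.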
Key Takeaways
- Separate AI from business logic early. The longer you wait, the harder the extraction. AI services have different scaling, testing, and deployment needs.
- Use LangGraph for multi-step AI workflows. A linear chain breaks down when you need validation loops, conditional retries, and state management across steps.
- Invest in type contracts across languages. Zod-to-JSON-Schema-to-Pydantic catches bugs at build time that would otherwise surface as silent data corruption.
- Stream incrementally, persist incrementally. Users shouldn't wait for a 30-second generation to complete before seeing anything. SSE + atomic JSONB appends make this straightforward.
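The atomic append half of that takeaway can be expressed as a single Postgres UPDATE, since the jsonb || operator concatenates arrays server-side with no read-modify-write race. The table, column, and status names below are hypothetical, not the actual schema.

```python
# Illustrative SQL for the atomic JSONB append: jsonb || jsonb concatenates
# arrays in one statement, so each completed day lands without a
# read-modify-write cycle. Table and column names are hypothetical.
import json

APPEND_DAY_SQL = """
UPDATE meal_plans
SET days = days || %s::jsonb,
    status = CASE WHEN jsonb_array_length(days || %s::jsonb) >= total_days
                  THEN 'COMPLETED' ELSE 'PARTIALLY_COMPLETED' END
WHERE id = %s
"""

def append_day_params(plan_id, day_payload):
    # Serialize once; the same payload feeds both jsonb placeholders.
    payload = json.dumps([day_payload])  # wrap in an array so || appends
    return (payload, payload, plan_id)

params = append_day_params("plan-1", {"day": 3, "meals": []})
```

Driving this from the DAY_COMPLETED SSE handler means each day is durable the moment the event arrives, independent of whether the remaining days ever generate.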