GovChime Analytics Platform

Government Contracts Intelligence

Overview

Architected AI pipeline for government contract data: automated data sanitization, contract opportunity matching, and description generation via LLM APIs. Structured output validation ensures consistent data quality across 70M+ rows. Full agentic development workflow with Claude Code, custom agent skills, and MCP integrations.

Key Features

  • AI pipeline for automated data sanitization across 70M+ government contract records
  • LLM-powered contract opportunity matching and description generation
  • Structured output validation ensuring consistent data quality at scale
  • Full agentic development workflow with Claude Code and custom agent skills
  • MCP integrations for AI-augmented database queries and code generation
  • TDD-driven AI development ensuring reliable outputs

Tech Stack

AI & Dev Tools

OpenAI API · Claude Code · MCP · Playwright

Database & OLAP

PostgreSQL · ClickHouse · Materialized Views · OLAP

Backend

Express.js · Node.js · TypeScript · REST API · Stripe

Infrastructure

Docker · Komodo · GitHub Actions · Cloudflare Pages · Self-Hosted Runner

Frontend

Next.js · React · TypeScript · Tailwind CSS · Cloudflare Workers

Challenges & Solutions

Data Quality at Scale

Problem

Raw SAM.gov API data contained inconsistencies, missing fields, and unstructured descriptions, making contracts difficult to search and match.

Solution

Built an AI pipeline using LLM APIs for automated data sanitization, opportunity matching, and description generation. Structured output validation ensures consistent data quality.
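The structured-output validation step can be sketched as a small TypeScript validator that gates LLM JSON before it reaches the database. This is a minimal illustration, not the actual GovChime schema; the field names (`noticeId`, `title`, `naicsCode`, `description`) are hypothetical.

```typescript
// Illustrative sanitized-contract shape; the real schema is larger.
interface SanitizedContract {
  noticeId: string;
  title: string;
  naicsCode: string | null;
  description: string;
}

// Validate a raw LLM response before persisting it. Returns the typed
// record, or null on failure so callers can retry the LLM call or
// route the payload to a dead-letter queue.
function validateContract(raw: string): SanitizedContract | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // LLM emitted non-JSON
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const obj = parsed as Record<string, unknown>;

  if (typeof obj.noticeId !== "string" || obj.noticeId.length === 0) return null;
  if (typeof obj.title !== "string" || obj.title.length === 0) return null;
  if (obj.naicsCode !== null && typeof obj.naicsCode !== "string") return null;
  if (typeof obj.description !== "string") return null;

  return {
    noticeId: obj.noticeId,
    title: obj.title.trim(),
    naicsCode: (obj.naicsCode as string | null),
    description: obj.description.trim(),
  };
}
```

Rejecting and retrying at this boundary is what keeps 70M+ rows consistent: nothing the model produces is trusted until it passes the schema check.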

Slow Analytics Queries on 70M+ Rows

Problem

Real-time aggregation queries across 70M+ rows with complex JOINs took seconds, making dashboards unusable for end users.

Solution

Designed a ClickHouse OLAP integration alongside PostgreSQL with 50+ materialized views for common aggregations. Query times improved 2-4×, and dashboards became effectively instant.
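The materialized-view pattern can be sketched in ClickHouse DDL. The table and column names below are hypothetical stand-ins, not the actual GovChime schema:

```sql
-- Hypothetical example: pre-aggregate contract awards per agency per day.
-- SummingMergeTree collapses rows with the same sort key on merge, so a
-- dashboard query reads a few thousand pre-computed rows instead of
-- scanning and joining tens of millions.
CREATE MATERIALIZED VIEW awards_by_agency_daily
ENGINE = SummingMergeTree
ORDER BY (agency, award_date)
AS
SELECT
    agency,
    toDate(awarded_at) AS award_date,
    count()            AS awards,
    sum(amount)        AS total_amount
FROM contract_awards
GROUP BY agency, award_date;
```

The dashboard then queries `awards_by_agency_daily` directly, which is what turns a multi-second JOIN-heavy aggregation into a near-instant lookup.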

Multi-Service CI/CD for Sole Engineer

Problem

Seven packages with interdependent builds and deploys needed reliable CI/CD without a dedicated DevOps team: the frontend's ISR depends on the backend being live, and services must deploy atomically.

Solution

Architected 24+ GitHub Actions workflows on a self-hosted runner with dynamic port allocation, Komodo HTTP API for Docker orchestration, and a Build → Verify → Deploy pipeline ensuring frontend validates against temp backend before any production deploy.
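The Build → Verify → Deploy shape can be sketched as a GitHub Actions workflow. This is an illustrative outline only; job names and the two helper scripts are hypothetical, and the real setup spans 24+ workflows:

```yaml
# Illustrative shape of one pipeline, not the actual workflow files.
name: deploy
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build --workspaces
  verify:
    needs: build
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      # Stand up a temporary backend on a dynamically allocated port,
      # then build the frontend against it so ISR pages are validated
      # before anything touches production. (Hypothetical helper script.)
      - run: ./scripts/start-temp-backend.sh
      - run: npm run build --workspace frontend
  deploy:
    needs: verify
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      # Promote all services together via the Komodo HTTP API so the
      # seven packages deploy atomically. (Hypothetical helper script.)
      - run: ./scripts/komodo-deploy.sh
```

The `needs:` chain is what enforces the ordering: deploy cannot run unless the frontend has already built successfully against a live backend.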

Key Achievements

  • AI Pipeline: Automated data sanitization, matching, and description generation
  • 70M+ Rows: AI processing at scale with structured output validation
  • Claude Code: Full agentic dev workflow with MCP integrations
  • TDD-Driven AI: Reliable AI outputs via test-driven development