Production-Ready LLM Optimization

Closed-loop system for building and automatically optimizing LangChain RAG applications. Prompt optimization with GEPA. Workflow optimization with MEGA.


Three-Layer Architecture

Build: LangChain (Retrieval + Generation)

Measure: MLflow (Tracing + Metrics)

Optimize: GEPA + MEGA (Prompts + Workflows)

Repeat → Continuous Improvement

Core Features

GEPA Optimization

Evolutionary prompt optimization using LLM reflection. Generates prompt variants, tests against your evaluation set, and keeps the Pareto frontier of best candidates.
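To make the "Pareto frontier" idea concrete, here is a minimal sketch of a generic dominance-based frontier over per-example scores. It is an illustration only and not necessarily the exact selection rule GEPA uses; the candidate names and scores are hypothetical.

```python
from typing import Dict, List


def dominates(a: List[float], b: List[float]) -> bool:
    """True if `a` is at least as good as `b` on every example and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(scores: Dict[str, List[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, own in scores.items()
        if not any(
            dominates(other, own)
            for other_name, other in scores.items()
            if other_name != name
        )
    ]


# Hypothetical per-example scores for four prompt variants (one score per eval example).
candidate_scores = {
    "baseline": [0.5, 0.5, 0.5],
    "concise": [0.9, 0.4, 0.5],
    "cite_sources": [0.5, 0.9, 0.6],
    "verbose": [0.4, 0.4, 0.5],
}

frontier = pareto_frontier(candidate_scores)
print(frontier)  # ['concise', 'cite_sources']
```

Keeping a frontier rather than a single best average preserves variants that win on different kinds of questions, so later rounds can build on either.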

MEGA Workflow

Optimize workflow structure and routing decisions. Tests different tool selection strategies, retrieval intensity, and agent decision logic.
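As a rough picture of what "testing workflow variants" can mean, the sketch below enumerates a small grid of workflow settings and picks the best one under a scoring function. Everything here is an assumption for illustration: the `WorkflowConfig` fields, the strategy names, and the toy scorer are hypothetical, and an exhaustive grid stands in for whatever search MEGA actually performs.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterator


@dataclass(frozen=True)
class WorkflowConfig:
    router: str   # hypothetical routing strategy name
    top_k: int    # retrieval intensity: documents fetched per query
    rerank: bool  # whether to rerank retrieved documents


def enumerate_variants() -> Iterator[WorkflowConfig]:
    """Yield every combination of the (assumed) workflow settings."""
    routers = ("always_retrieve", "classify_first")
    for router, top_k, rerank in product(routers, (2, 4, 8), (False, True)):
        yield WorkflowConfig(router, top_k, rerank)


def best_variant(score_fn: Callable[[WorkflowConfig], float]) -> WorkflowConfig:
    """Return the variant with the highest score under `score_fn`."""
    return max(enumerate_variants(), key=score_fn)


def toy_score(cfg: WorkflowConfig) -> float:
    """Stand-in scorer; a real one would run the workflow on the evaluation set."""
    score = -abs(cfg.top_k - 4)
    score += 0.5 if cfg.rerank else 0.0
    score += 0.25 if cfg.router == "classify_first" else 0.0
    return score


best = best_variant(toy_score)
print(best)
```

In practice `toy_score` would be replaced by a call into the shared evaluation harness, so that workflow variants and prompt variants are judged on the same examples.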

MLflow Integration

Automatic tracing of all executions. Track metrics, versions, and parameters. Full observability into what your system is doing.
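For the metrics and parameters side, a minimal sketch of recording one evaluation pass as an MLflow run might look like the following. The experiment name, parameter keys, and the `log_eval_run` helper are assumptions, not names taken from the repository. Automatic tracing is typically enabled separately (MLflow provides `mlflow.langchain.autolog()` for LangChain apps).

```python
from typing import Mapping, Optional


def log_eval_run(
    run_name: str,
    params: Mapping[str, str],
    metrics: Mapping[str, float],
    tracking_uri: Optional[str] = "http://localhost:5000",
) -> str:
    """Record one evaluation pass as an MLflow run and return its run ID."""
    # Imported lazily so this module still loads where MLflow is not installed.
    import mlflow

    if tracking_uri:
        mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment("gepa-langchain-lab")  # assumed experiment name
    with mlflow.start_run(run_name=run_name) as run:
        mlflow.log_params(dict(params))
        mlflow.log_metrics(dict(metrics))
        return run.info.run_id


# Example call (placeholder values, requires a running MLflow server):
# log_eval_run("baseline", {"prompt_version": "v0"}, {"accuracy": 0.0})
```

Logging the baseline, GEPA, and MEGA passes as separate runs in one experiment makes them directly comparable in the MLflow UI.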

Shared Eval Harness

Single evaluation set drives both GEPA and MEGA optimization. Ensure consistency across all optimization layers.
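One way to picture a shared harness: a single `evaluate` function that accepts any question-answering callable, so a prompt variant and a workflow variant are scored against exactly the same examples. The sketch below is hypothetical; the substring-match metric, the toy evaluation set, and the two stand-in systems are assumptions, not the repository's implementation.

```python
from typing import Callable, List, Tuple

EvalSet = List[Tuple[str, str]]  # (question, substring expected in the answer)
System = Callable[[str], str]    # anything that maps a question to an answer


def evaluate(system: System, eval_set: EvalSet) -> float:
    """Return the fraction of questions whose answer contains the expected substring."""
    if not eval_set:
        raise ValueError("eval_set must not be empty")
    hits = sum(expected.lower() in system(q).lower() for q, expected in eval_set)
    return hits / len(eval_set)


# Toy evaluation set shared by both optimization layers.
EVAL_SET: EvalSet = [
    ("What port does the MLflow UI use?", "5000"),
    ("Which key goes in .env?", "GROQ_API_KEY"),
]


def prompt_variant(question: str) -> str:
    """Stand-in for an app running a candidate prompt."""
    return "The MLflow UI runs on port 5000."


def workflow_variant(question: str) -> str:
    """Stand-in for an app running a candidate workflow with routing."""
    if "port" in question.lower():
        return "Port 5000."
    return "Set GROQ_API_KEY in .env."


prompt_score = evaluate(prompt_variant, EVAL_SET)
workflow_score = evaluate(workflow_variant, EVAL_SET)
print(prompt_score, workflow_score)  # 0.5 1.0
```

Because both layers call the same function on the same data, a gain reported by one layer cannot come from a quietly different metric in the other.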

Groq Integration

Fast, cost-effective inference with Groq. Perfect for iterative optimization loops. Easy to swap with OpenAI, Anthropic, or any LLM.

Production Ready

Clean architecture, comprehensive documentation, and battle-tested patterns. Deploy to production with confidence.

Two Optimization Layers

🎯 GEPA: What Your Agent Says

  • Optimize system prompts
  • Improve answer templates
  • Better tone & formatting
  • Reduce hallucinations
  • Up to ~35× fewer rollouts than RL (GRPO), as reported in the GEPA paper

🔄 MEGA: How Your Agent Works

  • Optimize routing decisions
  • Tool selection & ordering
  • Retrieval strategy tuning
  • Block-level scoring
  • Workflow structure evolution

See It In Action

Terminal: Baseline RAG App

Baseline Application

Run your LangChain RAG app. Ask questions, observe behavior. This is what we'll optimize.

Terminal: GEPA Optimization

GEPA Prompt Optimization

Watch GEPA generate prompt variants, test them against the evaluation set, and converge on the best-scoring prompt: a 14% improvement in this demo.

Terminal: MEGA Workflow Opt

MEGA Workflow Optimization

MEGA tests workflow variants and selects the best-scoring routing: a 25% improvement in this demo from better structure.

Metrics: Before → After

Results Comparison

Side-by-side metrics: Accuracy +17%, Groundedness +13%, Hallucinations -83%.

MLflow: Traces & Experiments

MLflow Dashboard

Full observability into traces, metrics, and prompt versions. Track everything.

Architecture: 3-Layer System

Production Architecture

Clean separation: Build, Measure, Optimize. Scales from prototypes to production.

Perfect For

Support Chatbots

Answer questions from knowledge bases. Improve accuracy and reduce hallucinations systematically.

Internal Assistants

HR, IT, compliance bots. Help employees find information quickly and accurately.

Developer Copilots

Code documentation assistants. Provide better suggestions through continuous optimization.

Tool-Using Agents

Complex workflows with retrieval. Optimize both what agents say and how they route.

Quick Start

Get Up and Running in 5 Minutes

# Clone the repo
git clone https://github.com/saurabh-oss/gepa-langchain-lab
cd gepa-langchain-lab

# Setup
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Add your GROQ_API_KEY to .env

# Start MLflow UI (this command keeps running, so use a separate terminal)
mlflow ui --host 0.0.0.0 --port 5000

# Run the optimization pipeline
python src/app.py            # Baseline
python src/eval.py           # Evaluate
python src/optimize.py       # GEPA prompt optimization
python src/optimize_mega.py  # MEGA workflow optimization

That's it! GEPA and MEGA have now searched for better prompts and workflows against your evaluation set. Open the MLflow UI on port 5000 to compare the optimized runs with the baseline.


Ready to Optimize Your LLM App?

Fork the repository, customize for your use case, and ship a better system.
