✦ Open Source LLM-Agnostic Apache 2.0

Weave Perfect Test Cases
from Your Requirements

TestLoom uses AI to generate comprehensive, traceable test suites from user stories, API specs, and acceptance criteria — with zero vendor lock-in.

100+
LLM Providers
0
Vendor Lock-in
Faster Coverage
70%
Effort Reduction
Scroll to explore
The Problem

Testing slows down every team

Manual test case authoring is the bottleneck nobody talks about — until the release slips.

01

Manual & Time-Consuming

QA engineers spend 35–50% of their sprint writing test cases by hand, re-reading requirements, and formatting specifications — work AI can do in seconds.

~40%
of SDLC effort is testing overhead
02

Coverage Gaps & Rework

Without systematic generation, edge cases are missed, requirements are misinterpreted, and defects slip to production — triggering expensive re-test cycles.

3–5×
iteration rounds due to missed edge cases
03

Vendor Lock-in & Cost Risk

Existing AI testing tools tie organisations to a single LLM provider with opaque pricing — forcing full pipeline rebuilds when providers change policies or prices.

60%+
of enterprise test suites remain fully manual
Architecture

Open. Composable. LLM-Agnostic.

Every layer is swappable. Every provider is plug-and-play. No proprietary dependencies.

INPUT LAYER 📄 Requirements 🔗 API Specs 🎫 Jira / Confluence 📊 Excel / PDF ⚙️ CLI TESTLOOM CORE ENGINE PROMPT ENGINE Jinja2 Templates YAML Prompt Library Few-Shot Examples Context Injection Version Controlled GENERATORS RequirementGenerator RAG Generator Multi-Agent Review (Phase 3) LangGraph Agents ChromaDB / pgvector LLM GATEWAY (LiteLLM) Provider-agnostic abstraction · 100+ models Model Routing Cost Tracking Fallback Audit Logging PII Masking Retry Governance Guardrails · Human-in-the-Loop Approval LLM PROVIDERS (PLUGGABLE — SWAP VIA CONFIG) 🤖 OpenAI 🔮 Anthropic 🦙 Ollama (Local) ☁️ Azure OpenAI 🔥 AWS Bedrock OUTPUT & INTEGRATIONS 📋 JSON / Markdown / CSV 🔌 REST API (Phase 4) ⚙️ Jenkins / GitHub Actions 📊 Jira / TestRail
Features

Everything you need. Nothing you don't.

Built for QA engineers, SDETs, and test architects who refuse to compromise on quality or flexibility.

🔀

LLM-Agnostic Gateway

Swap between OpenAI, Anthropic, Ollama, or Azure purely via config. Zero code changes required — ever. Powered by LiteLLM.

🧠

Smart Prompt Engineering

Jinja2 + YAML prompt templates are version-controlled, team-editable, and support few-shot examples and context injection.

🔍

RAG-Powered Context

ChromaDB-backed vector search injects relevant historical test cases and project context into every generation request. Progressively smarter with every run.

🤖

Multi-Agent Review Mesh

A review mesh of specialised LangGraph agents checks coverage, traceability, edge cases, and language quality before output. (Phase 3)

🛡️

Governance & Audit

Full audit trails, PII masking, human-in-the-loop approval gates, and per-request cost tracking built directly into the pipeline.

CI/CD First-Class

Native Jenkins and GitHub Actions integration. Generate test cases on every PR or sprint — part of your existing workflow.

How It Works

From requirement to test suite in seconds

Four steps. Any LLM. Any output format.

1

Ingest

Feed in requirements, user stories, API specs, or Jira tickets via CLI, Python SDK, or REST API.

2

Analyse

The Prompt Engine constructs a rich, context-aware prompt from your input using YAML templates and optional RAG context.

3

Generate

The LLM Gateway routes the request to your chosen provider and returns structured, traceable test cases.

4

Publish

Test suites are exported as JSON, Markdown, or CSV — ready for Jira, TestRail, or your CI pipeline.

Quick Start

Up and running in 2 minutes

Works with any LLM — even a free local Ollama model.

# Install TestLoom pip install testloom # With RAG support (ChromaDB + embeddings) pip install "testloom[rag]" # Verify testloom version
# testloom.yaml — place in your project root llm: provider: openai # openai | anthropic | ollama | azure model: gpt-4o # any model string LiteLLM supports api_key: ${OPENAI_API_KEY} # or set TESTLOOM_LLM__API_KEY env var generation: max_cases_per_request: 20 include_negative_cases: true include_edge_cases: true include_test_data: true
# Generate from a requirements file testloom generate -i requirements.md -f markdown # Inline text — specify test types and extra context testloom generate --text "Users can filter products by category and price range" \ --types functional,negative,boundary \ --context "Filters are client-side; max 500 products per page" \ --max-cases 12 # Batch-generate across many requirements concurrently testloom batch requirements.md --output-dir output/ -f junit --concurrency 3 # Use a free local Ollama model (zero cost) testloom generate -i story.md --model ollama/llama3
import asyncio from testloom import Settings, GenerationRequest, RequirementGenerator from testloom.gateway.registry import GatewayRegistry settings = Settings.load("testloom.yaml") gateway = GatewayRegistry.create(settings.llm) gen = RequirementGenerator(gateway, settings) req = GenerationRequest( requirement_text="Users can filter products by category and price range", max_cases=15, ) suite = asyncio.run(gen.generate(req)) for tc in suite.test_cases: print(f"[{tc.test_type.value}] {tc.title}")
Providers

Any LLM. Your choice.

Powered by LiteLLM — one interface for 100+ models. Switch providers in a single line of config.

🤖
OpenAI
GPT-4o, GPT-4o-mini
🔮
Anthropic
Claude Sonnet & Opus
🦙
Ollama
Local · Zero cost
☁️
Azure OpenAI
Enterprise VNet
🔥
AWS Bedrock
Claude & Llama on AWS
🌐
Any OpenAI-compat
Custom api_base URL
Roadmap

Where we're going

Six focused phases — from open-source scaffold to enterprise-grade AI testing platform.

Phase 0
Foundation

✅ Scaffold & Core Abstractions

Project structure, CI/CD pipeline, LLM gateway abstraction, Pydantic domain models, Docker Compose, GitHub Actions.

Phase 1
Generation

✅ Basic Test Generation — Complete

RequirementGenerator, Jinja2 Prompt Engine, CLI with Rich output, JSON/Markdown/CSV/JUnit formatters, retry logic, structured logging.

Phase 2
RAG

🔄 Context-Aware Generation — Complete

RAGGenerator, ChromaDB ContextStore (living test knowledge base), sentence-transformers embeddings, PDF/MD/TXT input processors, batch generation with concurrency control.

Phase 3
Review

📋 Multi-Agent Review Mesh

Quality scoring agents, coverage analysis, governance guardrails, human-in-the-loop approval workflow, audit trails.

Phase 4
API & UI

📋 REST API & Web Interface

FastAPI server, web dashboard, Jenkins plugin, GitHub Actions integration, Jira and TestRail direct push.

Phase 5
Enterprise

📋 Enterprise Scale

Observability layer, prompt lineage tracking, SSO, multi-tenant support, plugin marketplace.

Ready to weave better tests?

Open source. No lock-in. Up and running in 2 minutes.