
# Core Concepts

ArtemisKit is built around a few key concepts that apply across both the CLI and SDK. Understanding these will help you get the most out of the toolkit.

## Scenarios

Test suites containing prompts and expectations for evaluating LLM outputs.
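As an illustration, a YAML scenario pairs a prompt with an expectation. This is a hedged sketch: the `expected` field and `contains` matcher appear in the CLI/SDK comparison in this page, but the surrounding keys are assumptions, not documented ArtemisKit schema:

```yaml
# Sketch of a scenario file. Only `expected` and `contains` come from
# this page; `name` and `prompt` are illustrative key names.
name: greeting
prompt: "Greet the user politely."
expected:
  contains: "hello"
```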

## Expectations

Matchers that define how to evaluate LLM responses against expected behavior.
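To make the idea concrete, here is a minimal, self-contained sketch of what `contains()`- and `exact()`-style matchers do. This is illustrative TypeScript, not the actual ArtemisKit implementation:

```typescript
// A matcher takes an LLM response and returns pass/fail plus a reason.
type MatchResult = { pass: boolean; reason: string };
type Matcher = (response: string) => MatchResult;

// Sketch of a `contains` matcher: passes when the response
// includes the expected substring (case-insensitive here).
function contains(expected: string): Matcher {
  return (response) => {
    const pass = response.toLowerCase().includes(expected.toLowerCase());
    return {
      pass,
      reason: pass
        ? `response contains "${expected}"`
        : `response is missing "${expected}"`,
    };
  };
}

// Sketch of an `exact` matcher: passes only on an exact string match.
function exact(expected: string): Matcher {
  return (response) => ({
    pass: response === expected,
    reason: response === expected ? "exact match" : "no exact match",
  });
}

console.log(contains("hello")("Hello there!").pass); // true
```

Matchers compose well because they share one signature: the evaluator only needs to call the function and read `pass`.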

## Providers

LLM provider configurations for OpenAI, Anthropic, Azure, and more.
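A provider entry typically combines a provider name, a model, and credentials. A hypothetical config-file sketch follows; only the config-file mechanism and `--provider` flag are documented here, so every key name below is an assumption:

```yaml
# Hypothetical provider config sketch; key names are illustrative.
provider: openai
model: gpt-4o
apiKey: ${OPENAI_API_KEY}  # resolve from the environment, never commit keys
```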

## Evaluators

The evaluation engine that runs scenarios and produces results.
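Conceptually the engine is a loop: send each scenario's prompt to the provider, apply its expectation, and aggregate the results. A toy, self-contained sketch under those assumptions (not the real engine):

```typescript
// Toy sketch of an evaluation engine; illustrative, not ArtemisKit's code.
type Scenario = { name: string; prompt: string; expected: string };
type Result = { name: string; pass: boolean; latencyMs: number };

// Stand-in for a real provider call (OpenAI, Anthropic, Azure, ...).
async function callProvider(prompt: string): Promise<string> {
  return `echo: ${prompt}`;
}

// Run every scenario: send the prompt, apply a simple `contains`
// check against the expected text, and record latency.
async function runScenarios(scenarios: Scenario[]): Promise<Result[]> {
  const results: Result[] = [];
  for (const s of scenarios) {
    const start = Date.now();
    const response = await callProvider(s.prompt);
    results.push({
      name: s.name,
      pass: response.includes(s.expected),
      latencyMs: Date.now() - start,
    });
  }
  return results;
}

// Aggregate pass/fail into a single pass rate for the report.
function passRate(results: Result[]): number {
  return results.filter((r) => r.pass).length / results.length;
}
```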

```
┌─────────────────────────────────────────────────────────────┐
│                         ArtemisKit                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐      │
│  │  Scenarios  │───▶│ Evaluators  │───▶│   Results   │      │
│  │  (YAML/TS)  │    │ (Matchers)  │    │  (Reports)  │      │
│  └─────────────┘    └─────────────┘    └─────────────┘      │
│         │                  │                  │             │
│         ▼                  ▼                  ▼             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐      │
│  │  Provider   │    │  Guardian   │    │   Storage   │      │
│  │  (LLM API)  │    │(Protection) │    │  (History)  │      │
│  └─────────────┘    └─────────────┘    └─────────────┘      │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
A typical workflow looks like this:

1. **Define scenarios** — Write test cases in YAML files or TypeScript using builders
2. **Configure providers** — Set up API keys and model settings
3. **Run evaluations** — Execute scenarios via the CLI (`akit run`) or SDK (`kit.run()`)
4. **Review results** — Analyze pass/fail rates, latency, and detailed responses
5. **Iterate** — Refine prompts, adjust expectations, and improve your LLM application

These concepts work the same way whether you’re using the CLI or SDK:

| Concept | CLI | SDK |
| --- | --- | --- |
| Scenarios | YAML files | YAML files or the `scenario()` builder |
| Expectations | YAML `expected` field | YAML or `contains()`, `exact()`, etc. |
| Providers | Config file or `--provider` flag | `ArtemisKit({ provider: '...' })` |
| Results | Terminal output + reports | `RunResult` object |
| Storage | `artemis-output/` directory | Configurable via the `storage` option |