The ArtemisKit SDK (`@artemiskit/sdk`) provides programmatic access to LLM evaluation, testing, and Guardian Mode for runtime protection.
```bash
npm install @artemiskit/sdk
```

Or with other package managers:

```bash
# Bun (recommended)
bun add @artemiskit/sdk

# pnpm
pnpm add @artemiskit/sdk

# Yarn
yarn add @artemiskit/sdk
```

Guardian Mode
Runtime protection with semantic validation, injection detection, PII filtering, and action validation. Learn more →
Evaluation API
Programmatic LLM evaluation with all CLI evaluators available in code. Learn more →
Scenario Builders
Type-safe fluent API for building scenarios programmatically without YAML. Learn more →
Test Integration
Jest and Vitest matchers for LLM testing in your test suites. Learn more →
Validation & Comparison
Pre-flight scenario validation and regression detection between runs. Learn more →
Agentic Adapters
Test LangChain chains/agents and DeepAgents multi-agent systems. Learn more →
Protect your LLM applications from prompt injection, jailbreaks, and unauthorized actions.
```typescript
import { createGuardian } from '@artemiskit/sdk';
import { createAdapter } from '@artemiskit/core';

// Create your LLM client
const client = await createAdapter({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
});

// Create guardian with protection settings
const guardian = createGuardian({
  mode: 'selective', // 'observe' | 'selective' | 'strict'
  validateInput: true,
  validateOutput: true,
  contentValidation: {
    strategy: 'semantic', // LLM-as-judge validation (new in 0.3.3)
    semanticThreshold: 0.9,
  },
});

// Wrap your client with guardian protection
const protectedClient = guardian.protect(client);

// Now all requests go through Guardian
const result = await protectedClient.generate({
  prompt: 'What is the capital of France?',
  maxTokens: 100,
});
```

Validate scenario files before execution as a pre-flight check:
```typescript
import { ArtemisKit } from '@artemiskit/sdk';

const kit = new ArtemisKit({ project: 'my-project' });

// Validate scenario files
const validation = await kit.validate({
  scenario: './scenarios/**/*.yaml',
  strict: true,
});

if (!validation.valid) {
  console.error('Validation errors:', validation.errors);
  process.exit(1);
}
```

Compare runs to detect regressions:
```typescript
const comparison = await kit.compare({
  baseline: 'baseline-run-id',
  current: 'current-run-id',
  threshold: 0.05,
});

if (comparison.regression) {
  console.error(`Regression: ${comparison.delta.passRate}% drop`);
}
```

Run programmatic evaluations with full access to all evaluator types.
```typescript
import { ArtemisKit } from '@artemiskit/sdk';

const kit = new ArtemisKit({
  provider: 'openai',
  model: 'gpt-4o',
  project: 'my-project',
});

// Run scenario-based evaluation
const results = await kit.run({
  scenario: './scenarios/quality-tests.yaml',
});

console.log(`Pass rate: ${results.manifest.metrics.pass_rate * 100}%`);

// Red team security testing
const redteamResults = await kit.redteam({
  scenario: './scenarios/my-app.yaml',
  mutations: ['typo', 'role-spoof', 'encoding'],
  countPerCase: 5,
});

// Stress testing
const stressResults = await kit.stress({
  scenario: './scenarios/load-test.yaml',
  concurrency: 10,
  duration: 60,
});
```

Use the bundled Jest matchers in your test suites:

```typescript
import { ArtemisKit } from '@artemiskit/sdk';
import { jestMatchers } from '@artemiskit/sdk/jest';

// Extend Jest with ArtemisKit matchers
expect.extend(jestMatchers);

describe('My LLM App', () => {
  let kit: ArtemisKit;

  beforeAll(() => {
    kit = new ArtemisKit({
      provider: 'openai',
      model: 'gpt-4o-mini',
      project: 'jest-tests',
    });
  });

  it('should pass all test cases', async () => {
    const results = await kit.run({
      scenario: './scenarios/quality.yaml',
    });

    expect(results).toPassAllCases();
  });

  it('should achieve 90% success rate', async () => {
    const results = await kit.run({
      scenario: './scenarios/quality.yaml',
    });

    expect(results).toHaveSuccessRate(0.9);
  });

  it('should pass red team testing', async () => {
    const results = await kit.redteam({
      scenario: './scenarios/quality.yaml',
      mutations: ['typo', 'role-spoof'],
    });

    expect(results).toPassRedTeam();
    expect(results).toHaveNoCriticalVulnerabilities();
  });
});
```

The equivalent with Vitest:

```typescript
import { ArtemisKit } from '@artemiskit/sdk';
import { vitestMatchers } from '@artemiskit/sdk/vitest';
import { beforeAll, describe, expect, test } from 'vitest';

// Extend Vitest with ArtemisKit matchers
expect.extend(vitestMatchers);

describe('My LLM App', () => {
  let kit: ArtemisKit;

  beforeAll(() => {
    kit = new ArtemisKit({
      provider: 'openai',
      model: 'gpt-4o-mini',
      project: 'vitest-tests',
    });
  });

  test('should pass all cases', async () => {
    const results = await kit.run({
      scenario: './scenarios/quality.yaml',
    });

    expect(results).toPassAllCases();
  }, 60_000);

  test('should have acceptable latency', async () => {
    const results = await kit.run({
      scenario: './scenarios/quality.yaml',
    });

    expect(results).toHaveMedianLatencyBelow(5000);
    expect(results).toHaveP95LatencyBelow(10000);
  }, 60_000);
});
```

Evaluation matchers:

| Matcher | Description |
|---|---|
| `toPassAllCases()` | All test cases passed |
| `toHaveSuccessRate(rate)` | Achieve minimum success rate (0-1) |
| `toPassCasesWithTag(tag)` | All cases with tag passed |
| `toHaveMedianLatencyBelow(ms)` | Median latency under threshold |
| `toHaveP95LatencyBelow(ms)` | P95 latency under threshold |
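The two latency matchers assert against percentile statistics of per-case latencies. As an illustrative sketch only (not ArtemisKit's internal code), a nearest-rank percentile over a batch of latency samples can be computed like this:

```typescript
// Nearest-rank percentile: illustrative helper, not part of the SDK.
function percentile(latenciesMs: number[], p: number): number {
  if (latenciesMs.length === 0) throw new Error('no samples');
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  // Index of the p-th percentile in the sorted sample (nearest-rank method)
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

const samples = [120, 340, 95, 210, 880, 150, 300, 110, 240, 190];
console.log(percentile(samples, 50)); // 190 (what toHaveMedianLatencyBelow compares)
console.log(percentile(samples, 95)); // 880 (what toHaveP95LatencyBelow compares)
```

Note how a single slow outlier (880 ms) dominates the P95 figure while leaving the median untouched, which is why asserting both is useful.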

Red team matchers:

| Matcher | Description |
|---|---|
| `toPassRedTeam()` | No vulnerabilities found |
| `toHaveDefenseRate(rate)` | Achieve minimum defense rate (0-1) |
| `toHaveNoCriticalVulnerabilities()` | No critical severity issues |
| `toHaveNoHighSeverityVulnerabilities()` | No high severity issues |
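For orientation, a defense rate is simply the fraction of red-team attempts the application blocked or refused. A minimal sketch of that calculation (the `Attempt` shape here is invented for illustration and is not the SDK's result type):

```typescript
// Hypothetical shape of one red-team attempt, for illustration only.
interface Attempt {
  mutation: string;  // e.g. 'typo', 'role-spoof', 'encoding'
  defended: boolean; // true if the attack was blocked or refused
}

// Fraction of attempts that were successfully defended (0-1).
function defenseRate(attempts: Attempt[]): number {
  if (attempts.length === 0) return 1; // no attacks, nothing breached
  const defended = attempts.filter((a) => a.defended).length;
  return defended / attempts.length;
}

const attempts: Attempt[] = [
  { mutation: 'typo', defended: true },
  { mutation: 'role-spoof', defended: true },
  { mutation: 'encoding', defended: false },
  { mutation: 'typo', defended: true },
];
console.log(defenseRate(attempts)); // 0.75
```

Under this reading, `toHaveDefenseRate(0.9)` would fail for the run above, since one of four mutated prompts got through.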

Stress matchers:

| Matcher | Description |
|---|---|
| `toPassStressTest()` | Stress test passed |
| `toHaveStressSuccessRate(rate)` | Achieve minimum success rate under load |
| `toAchieveRPS(rps)` | Achieve minimum requests per second |
| `toHaveStressP95LatencyBelow(ms)` | P95 latency under threshold |
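For reference, the throughput figure a matcher like `toAchieveRPS(rps)` asserts on is just completed requests divided by elapsed time. A trivial standalone sketch (hypothetical helper, not part of the SDK), using the `concurrency: 10`, `duration: 60` stress run from the example above:

```typescript
// Throughput in requests per second: illustrative, not the SDK's implementation.
function requestsPerSecond(completedRequests: number, durationSeconds: number): number {
  if (durationSeconds <= 0) throw new Error('duration must be positive');
  return completedRequests / durationSeconds;
}

// A 60-second run that completed 1500 requests across 10 concurrent workers:
console.log(requestsPerSecond(1500, 60)); // 25
```

So if that run needed to satisfy `toAchieveRPS(30)`, it would fail at 25 RPS regardless of its success rate.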