# TypeScript SDK

AgentV provides two npm packages for programmatic use:

- `@agentv/eval` — custom assertions and code judges
- `@agentv/core` — programmatic evaluation API and typed configuration
## Installation

```sh
# Assertion SDK (defineAssertion, defineCodeJudge)
npm install @agentv/eval

# Programmatic API (evaluate, defineConfig)
npm install @agentv/core
```

## Custom Assertions

Use `defineAssertion` from `@agentv/eval` to create reusable assertion types. Place them in `.agentv/assertions/` — they're auto-discovered by filename.
### Pass/Fail Pattern

```ts
import { defineAssertion } from '@agentv/eval';

export default defineAssertion(({ answer }) => {
  const wordCount = answer.trim().split(/\s+/).length;
  return {
    pass: wordCount >= 3,
    reasoning: `Output has ${wordCount} words`,
  };
});
```

### Score Pattern
Return a `score` (0–1) instead of `pass` for graded evaluation:

```ts
import { defineAssertion } from '@agentv/eval';

export default defineAssertion(({ answer, trace }) => ({
  pass: answer.length > 0 && (trace?.eventCount ?? 0) <= 10,
  reasoning: 'Checks content exists and is efficient',
}));
```

If only `pass` is given, `score` is 1 (pass) or 0 (fail).
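Since the example above returns `pass` rather than `score`, here is a minimal standalone sketch of graded scoring and the pass→score fallback. The 50-word target and the helper names (`lengthScore`, `normalize`) are illustrative, not part of the SDK:

```typescript
// Minimal result shape mirroring the assertion return values shown above.
type AssertionResult = { score?: number; pass?: boolean; reasoning: string };

// Graded scoring: reward answers up to an (illustrative) 50-word target.
function lengthScore(answer: string): AssertionResult {
  const wordCount = answer.trim().split(/\s+/).length;
  return {
    score: Math.min(1, wordCount / 50),
    reasoning: `Output has ${wordCount} of a 50-word target`,
  };
}

// The fallback described in the docs: if only `pass` is given,
// score becomes 1 (pass) or 0 (fail).
function normalize(result: AssertionResult): number {
  if (result.score !== undefined) return result.score;
  return result.pass ? 1 : 0;
}

console.log(normalize(lengthScore('one two three four five'))); // 0.1
console.log(normalize({ pass: true, reasoning: 'ok' }));        // 1
```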
### Using in YAML

Convention-based discovery maps filename → assertion type:

- `.agentv/assertions/word-count.ts` → `type: word-count`
- `.agentv/assertions/sentiment.ts` → `type: sentiment`

Reference directly in your eval file — no `command:` needed:

```yaml
assert:
  - type: word-count
  - type: contains
    value: "Hello"
```

## Code Judges
Use `defineCodeJudge` from `@agentv/eval` for full control over scoring with explicit hits/misses:

```ts
import { defineCodeJudge } from '@agentv/eval';

export default defineCodeJudge(({ trace, answer }) => ({
  score: (trace?.eventCount ?? 0) <= 5 ? 1.0 : 0.5,
  hits: ['Efficient tool usage'],
  misses: [],
}));
```

`defineCodeJudge` judges are referenced in YAML with `type: code_judge` and `command: [bun, run, judge.ts]`. `defineAssertion` uses convention-based discovery instead — just place the file in `.agentv/assertions/` and reference it by name.
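The `hits`/`misses` arrays make a judgment auditable; how they relate to `score` is up to your judge. One common convention — an illustration only, not an aggregation AgentV requires — is the hit ratio:

```typescript
// Illustrative convention: score as the fraction of criteria hit.
function hitRatioScore(hits: string[], misses: string[]): number {
  const total = hits.length + misses.length;
  return total === 0 ? 1 : hits.length / total; // no criteria => vacuous pass
}

console.log(hitRatioScore(['Efficient tool usage'], [])); // 1
console.log(hitRatioScore(['a', 'b'], ['c', 'd']));       // 0.5
```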
For detailed patterns, input/output contracts, and language-agnostic examples, see Code Judges.
## Programmatic API

Use `evaluate()` from `@agentv/core` to run evaluations as a library — no YAML needed.

### Inline Test Definitions

```ts
import { evaluate } from '@agentv/core';

const { results, summary } = await evaluate({
  tests: [
    {
      id: 'greeting',
      input: 'Say hello',
      assert: [{ type: 'contains', value: 'Hello' }],
    },
  ],
});

console.log(`${summary.passed}/${summary.total} passed`);
```

`evaluate()` auto-discovers the default target from `.agentv/targets.yaml` and `.env` credentials.
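The `results`/`summary` split lets you post-process per-test outcomes yourself. A minimal sketch, assuming each result exposes a boolean `pass` field — the actual `Result` type from `@agentv/core` may carry more fields:

```typescript
// Assumed minimal result shape for illustration only.
type MiniResult = { id: string; pass: boolean };

// Derive a { passed, total } summary like the one evaluate() returns.
function summarize(results: MiniResult[]): { passed: number; total: number } {
  return {
    passed: results.filter((r) => r.pass).length,
    total: results.length,
  };
}

const demo = summarize([
  { id: 'greeting', pass: true },
  { id: 'farewell', pass: false },
]);
console.log(`${demo.passed}/${demo.total} passed`); // 1/2 passed
```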
### File-Based via specFile

Point to an existing YAML eval instead of inlining tests:

```ts
import { evaluate } from '@agentv/core';

const { results, summary } = await evaluate({
  specFile: './evals/my-eval.eval.yaml',
});
```

## Typed Configuration
Create `agentv.config.ts` at your project root for type-safe, validated configuration using `defineConfig()` from `@agentv/core`:

```ts
import { defineConfig } from '@agentv/core';

export default defineConfig({
  execution: { workers: 5, maxRetries: 2 },
  output: { format: 'jsonl', dir: './results' },
  limits: { maxCostUsd: 10.0 },
});
```

The config file is auto-discovered by the CLI from your project root and validated with Zod at startup.
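"Validated with Zod at startup" means malformed config fails fast before any evals run. A rough illustration of that behavior — a hand-rolled check standing in for the actual Zod schema, whose exact fields and bounds are assumptions here:

```typescript
// Stand-in for the real schema: reject obviously invalid execution settings.
interface ExecutionConfig {
  workers: number;
  maxRetries: number;
}

function validateExecution(raw: unknown): ExecutionConfig {
  const cfg = raw as Partial<ExecutionConfig>;
  if (typeof cfg.workers !== 'number' || cfg.workers < 1) {
    throw new Error('execution.workers must be a positive number');
  }
  if (typeof cfg.maxRetries !== 'number' || cfg.maxRetries < 0) {
    throw new Error('execution.maxRetries must be >= 0');
  }
  return { workers: cfg.workers, maxRetries: cfg.maxRetries };
}

console.log(validateExecution({ workers: 5, maxRetries: 2 })); // accepted
```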
## Scaffold Commands

Bootstrap new assertions and eval files from the CLI:

```sh
# Create a new assertion type
agentv create assertion <name>   # → .agentv/assertions/<name>.ts

# Create a new eval with test cases
agentv create eval <name>        # → evals/<name>.eval.yaml + .cases.jsonl
```