Validation Agent: verify results and provide proofs of correctness

The Validation Agent is a specialized agent whose job is to verify intermediate outputs produced by other agents (Knowledge Agent, Outlook Agent, etc.), produce structured validation reports, and optionally provide a cryptographic proof bundle that documents the evidence used for the verification. The orchestrator should invoke the Validation Agent between steps (for example: after a knowledge search + summary and before composing or sending an email) to reduce hallucinations, enforce policy, and increase trust.

Responsibilities

  • Validate semantic outputs for correctness, grounding, and policy compliance.
  • Verify citations actually contain the quoted text/snippets and match source metadata (ETag, modified_at).
  • Run deterministic checks (schema conformity, numeric calculations, cross-check facts against authoritative sources).
  • Detect PII leakage and enforce redaction policies.
  • Produce a validation record with: status (pass/warn/fail), evidence bundle (citation hashes, snippet offsets), score/confidence, and remediation suggestions.
  • Optionally sign the validation record (HMAC or public-key signature) so it can be used in audits.
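
One way to keep these responsibilities composable is a small pluggable check abstraction, so each responsibility (grounding, PII scan, schema validation, freshness) becomes one implementation behind a common interface. The shape below is a sketch with illustrative names, not a prescribed design:

using System.Threading.Tasks;

// Sketch of a pluggable check abstraction: each responsibility above
// (grounding, PII scan, schema validation, freshness) is one implementation.
public record CheckOutcome(string CheckId, string Result, double Score, string? Details);

public interface IValidationCheck
{
    string Id { get; }
    Task<CheckOutcome> RunAsync(object producerResult, object[] evidence);
}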

Where the Validation Agent fits (high-level flow)

  1. User request arrives at orchestrator.
  2. Orchestrator calls the Knowledge Agent for search and summarization.
  3. Orchestrator calls Validation Agent with the Knowledge Agent result + evidence (citations, snippets).
  4. Validation Agent returns validated result (pass/warn/fail) and an evidence bundle.
  5. If pass (or acceptable warn), orchestrator proceeds to the next agent (e.g., Outlook Agent to compose/send). If fail, orchestrator requests human review or re-runs components.

This mid-pipeline validation reduces downstream errors and prevents actions (like sending email) based on unverified content.

Minimal contract (inputs/outputs)

  • Inputs: envelope containing the producer result, evidence (list of citations with source identifiers and snippet text or offsets), context (trigger id, system prompt, user identity), and validation policy parameters (thresholds, checks to run).
  • Outputs: ValidationRecord shape:
type ValidationRecord = {
    status: 'pass' | 'warn' | 'fail'
    checks: Array<{ id: string; result: 'pass' | 'fail' | 'warn'; details?: string; score?: number }>
    evidence_bundle: Array<{ source_id: string; url: string; snippet_hash: string; snippet_text_sample?: string }>
    signature?: string // HMAC/PK signature over the record
    remediation?: string // suggestions (re-run, escalate, redact)
}
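
As a C# companion to the shape above, the input envelope might look like the following; every name here is illustrative rather than a fixed contract:

// Illustrative C# shape of the validation input envelope; adapt the names
// to the orchestrator's own contract.
public record EvidenceItem(string SourceId, string Url, string? Etag, string? SnippetText, int? SnippetOffset);
public record RequestContext(string TriggerId, string SystemPrompt, string UserId);
public record ValidationPolicyParams(double MinGroundingScore, string[] ChecksToRun);

public record ValidationRequest(
    string CorrelationId,
    object ProducerResult,            // the output under validation (summary, entities, ...)
    EvidenceItem[] Evidence,          // citations with source ids and snippet text/offsets
    RequestContext Context,           // trigger id, system prompt, user identity
    ValidationPolicyParams Policy);   // thresholds and which checks to run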

Typical validation checks

  • Citation existence: check that each citation references a real source and that the snippet exists at the claimed offset or matches the expected text.
  • Grounding check: ensure summary claims are supported by one or more cited passages (measured by token overlap, semantic similarity, or exact match where applicable).
  • Freshness: verify the source's modified_at or ETag to ensure citations are not stale (see the ETag sketch after this list).
  • ACL/permission check: confirm the requesting principal had access to queried sources at indexing/query time.
  • Numeric verification: re-run computations using an independent code path for claims containing numeric results (sums, averages, dates).
  • Schema validation: if the output is structured (JSON, entities), validate against a schema/JSON-Schema.
  • PII and policy checks: scan for high-sensitivity PII and enforce redaction rules.
  • Contradiction detection: detect pairs of statements that contradict known facts or each other (basic logical checks or LLM-based contradiction detector with conservative thresholds).
  • Confidence calibration: compute an aggregated confidence score using constituent check scores and thresholds configured by policy.
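
As a concrete example of the freshness check, an HTTP source can be probed with a conditional GET: send the recorded ETag as If-None-Match and treat 304 Not Modified as confirmation that the cited bytes are unchanged. This is a minimal sketch assuming a plain HTTP source; content behind Graph or drive APIs needs the corresponding metadata calls instead.

using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class FreshnessCheck
{
    private static readonly HttpClient _http = new();

    // Returns true if the source still matches the ETag recorded at citation time.
    public static async Task<bool> IsFreshAsync(string url, string recordedEtag)
    {
        using var req = new HttpRequestMessage(HttpMethod.Get, url);
        req.Headers.IfNoneMatch.Add(EntityTagHeaderValue.Parse(recordedEtag));

        using var resp = await _http.SendAsync(req);

        // 304 Not Modified: the server confirms the representation is unchanged.
        if (resp.StatusCode == HttpStatusCode.NotModified) return true;

        // Otherwise compare the current ETag (if any) against the recorded one.
        return resp.Headers.ETag?.ToString() == recordedEtag;
    }
}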

Evidence bundle & proof-of-correctness

The Validation Agent should produce an evidence bundle that contains the minimal data required for later auditors to re-run or verify the validation without re-querying the original sources. Typical items in the bundle:

  • Source references: {source_id, url, etag, modified_at}
  • Snippet samples: small excerpt (e.g., 200 chars) and snippet start/end offsets in the original document (if available)
  • Snippet hash: SHA256 of the snippet text
  • Validation metadata: which checks ran, their results, scores, and timestamps
  • Optional signature: HMAC-SHA256 or an RSA/ECDSA signature over a canonicalized JSON record

Example canonical validation record (JSON, minimized):

{
  "validation_id":"val-123",
  "correlation_id":"c-999",
  "status":"pass",
  "checks":[{"id":"citation-exists","result":"pass"},{"id":"grounding","result":"pass","score":0.87}],
  "evidence_bundle":[{"source_id":"drive:123","url":"https://...","etag":"W/\"abc\"","snippet_hash":"..."}],
  "signature":"..."
}

Store the evidence bundle in a secure immutable store (blob or object store with write-once policies) and keep an index/record in a DB for fast lookup (validation_id -> blob_url).
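
The write-once discipline can be sketched as follows, using the local filesystem purely for illustration (FileMode.CreateNew fails if the record already exists, approximating a write-once policy); a production system would use an object store with immutability policies instead:

using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class EvidenceStore
{
    // Illustrative write-once persistence: CreateNew throws if the record
    // already exists, mimicking a write-once blob policy.
    public static async Task<string> SaveAsync(string validationId, string canonicalJson, string rootDir)
    {
        Directory.CreateDirectory(rootDir);
        var path = Path.Combine(rootDir, validationId + ".json");

        await using var fs = new FileStream(path, FileMode.CreateNew, FileAccess.Write);
        await fs.WriteAsync(Encoding.UTF8.GetBytes(canonicalJson));

        // Return the location to index in the DB (validation_id -> blob_url).
        return path;
    }
}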

C# sketch: Validation Agent

This minimal sketch shows a validation call pattern and a simple grounding check (verify cited snippets equal the claimed sample and compute a simple overlap score). This is intentionally concise; real systems should use robust extraction and indexing to fetch the exact snippet.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public record Citation(string SourceId, string Url, string Etag, string SnippetSample);
public record ProducerResult(string Summary, Citation[] Citations);

public record CheckResult(string Id, string Result, double Score, string Details);
public record ValidationRecord(string ValidationId, string CorrelationId, string Status, CheckResult[] Checks, object[] EvidenceBundle, string? Signature = null);

public static class ValidationAgent
{
    private static readonly HttpClient _http = new();

    // Simplified: fetch snippet from source (in production use authenticated Graph or drive fetch)
    public static async Task<string> FetchSnippetAsync(Citation c)
    {
        // Placeholder: real fetch uses Graph/drive API and respects ACLs
        var resp = await _http.GetStringAsync(c.Url);
        // Naive: return the whole document; real code would locate the snippet
        // at the claimed offset and verify it matches the recorded sample.
        return resp;
    }

    public static string HashSnippet(string s)
    {
        using var sha = SHA256.Create();
        var hash = sha.ComputeHash(Encoding.UTF8.GetBytes(s));
        return Convert.ToHexString(hash);
    }

    public static CheckResult CitationExistsCheck(Citation c)
    {
        // Minimal local check; in practice this is async and fetches the actual content.
        var hasSample = !string.IsNullOrEmpty(c.SnippetSample);
        return new CheckResult("citation-exists", hasSample ? "pass" : "fail",
            hasSample ? 1.0 : 0.0, hasSample ? "sample provided" : "no snippet sample");
    }

    public static CheckResult GroundingCheck(string summary, Citation[] citations)
    {
        // naive overlap score: count words in summary that appear in snippet samples
        var words = summary.Split(' ', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
        int total = words.Length; int matched = 0;
        var text = string.Join(' ', citations.Select(c => c.SnippetSample ?? string.Empty));
        foreach (var w in words) if (text.Contains(w, StringComparison.OrdinalIgnoreCase)) matched++;
        var score = total == 0 ? 0.0 : (double)matched / total;
        return new CheckResult("grounding", score > 0.2 ? "pass" : "warn", score, $"matched {matched}/{total} words");
    }

    public static ValidationRecord ValidateProducerResult(ProducerResult r, string correlationId)
    {
        var checks = new List<CheckResult>();
        foreach (var c in r.Citations)
        {
            checks.Add(CitationExistsCheck(c));
        }
        checks.Add(GroundingCheck(r.Summary, r.Citations));

        var evidence = r.Citations.Select(c => new
        {
            c.SourceId,
            c.Url,
            c.Etag,
            snippet_hash = HashSnippet(c.SnippetSample ?? string.Empty)
        }).ToArray();

        var status = checks.Any(ch => ch.Result == "fail") ? "fail" : checks.Any(ch => ch.Result == "warn") ? "warn" : "pass";
        var record = new ValidationRecord(Guid.NewGuid().ToString(), correlationId, status, checks.ToArray(), evidence);

        // optional: sign the record (HMAC example)
        var key = Encoding.UTF8.GetBytes(Environment.GetEnvironmentVariable("VALIDATION_HMAC_KEY") ?? "dev-key");
        using var hmac = new HMACSHA256(key);
        var payload = JsonSerializer.SerializeToUtf8Bytes(record);
        var sig = Convert.ToHexString(hmac.ComputeHash(payload));
        record = record with { Signature = sig };

        // persist record to blob store + DB index (omitted)
        return record;
    }
}
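
Because the HMAC above is computed over the record before the Signature field is set, an auditor verifies by clearing the signature, re-serializing, and comparing digests in constant time. A minimal verifier, reusing the types and usings from the sketch above:

public static class ValidationAuditor
{
    // Recompute the HMAC over the record with Signature cleared (mirroring
    // how it was produced) and compare in constant time.
    public static bool VerifySignature(ValidationRecord record, byte[] key)
    {
        if (string.IsNullOrEmpty(record.Signature)) return false;

        var unsigned = record with { Signature = null };
        var payload = JsonSerializer.SerializeToUtf8Bytes(unsigned);

        using var hmac = new HMACSHA256(key);
        var expected = hmac.ComputeHash(payload);
        var actual = Convert.FromHexString(record.Signature);

        return CryptographicOperations.FixedTimeEquals(expected, actual);
    }
}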

Orchestrator integration pattern

  • Orchestrator calls producer agent (Knowledge Agent) -> gets result + citations.
  • Orchestrator calls Validation Agent with the result and policy parameters (e.g., minimum grounding score 0.6).
  • If ValidationRecord.Status == "pass", proceed; if "warn", present to a human-in-the-loop or apply auto-mitigation (redact or attach disclaimers); if "fail", block downstream side-effects and raise an incident (see the gating sketch after this list).
  • Persist validation_id in the orchestration audit trail for future inspection and replay.
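
A minimal sketch of this gate; the downstream helpers (ComposeAndSendAsync, RequestHumanReviewAsync, RaiseIncidentAsync, PersistAuditAsync) are hypothetical stand-ins for real orchestrator calls:

public static class Orchestrator
{
    public static async Task HandleAsync(ProducerResult result, string correlationId)
    {
        var record = ValidationAgent.ValidateProducerResult(result, correlationId);
        await PersistAuditAsync(record.ValidationId, correlationId); // audit trail for replay

        switch (record.Status)
        {
            case "pass":
                await ComposeAndSendAsync(result);             // e.g., Outlook Agent
                break;
            case "warn":
                await RequestHumanReviewAsync(result, record); // human-in-loop or auto-mitigation
                break;
            default: // "fail"
                await RaiseIncidentAsync(record);              // block downstream side-effects
                break;
        }
    }

    // Hypothetical stubs; replace with real agent/queue calls.
    private static Task PersistAuditAsync(string validationId, string correlationId) => Task.CompletedTask;
    private static Task ComposeAndSendAsync(ProducerResult r) => Task.CompletedTask;
    private static Task RequestHumanReviewAsync(ProducerResult r, ValidationRecord rec) => Task.CompletedTask;
    private static Task RaiseIncidentAsync(ValidationRecord rec) => Task.CompletedTask;
}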

Operational considerations

  • Performance: validation increases latency; run fast checks synchronously and defer heavier checks (deep re-checks, cross-source verification) to an asynchronous path (see the sketch after this list).
  • Cost: validation that refetches content or re-runs models has cost; apply sampling and canary strategies (only validate full output for high-risk triggers).
  • Security: validation must enforce ACLs and never fetch sources using a broader scope than the original request.
  • Reproducibility: include source ETags and snippet hashes in the evidence bundle so auditors can verify the same bytes were used.
  • Human-in-loop: provide an interface where a reviewer can see the validation record, evidence and approve/reject.

Metrics & SLAs

  • Validation pass rate, warn rate, fail rate by trigger.
  • Mean validation time and p50/p95 latencies.
  • Cost per validated request (tokens, fetches).
  • Number of unsafe downstream actions blocked by validation.