Human-in-the-Loop — design patterns and implementation guidance

Human-in-the-Loop — when and how to involve people

Bringing a human reviewer into AI orchestrations improves safety, correctness, and trust. Humans can be introduced as an explicit approval gate, a remediation step for warnings/fails, or a quality-sampling reviewer. This page covers two main integration patterns and practical implementation guidance, followed by suggestions and trade-offs.

Two integration patterns

  1. Agent-mediated human-in-the-loop (Human-Agent)
  • A dedicated Human Agent acts as a mediation layer. The orchestrator delegates a review task to the Human Agent using the A2A protocol. The Human Agent presents a UI (web/Teams/email) to the reviewer, collects an action (approve/reject/edit), and returns structured results to the orchestrator.
  2. Orchestrator-mediated human-in-the-loop (Orchestrator-Direct)
  • The orchestrator itself creates a human-review task, notifies the reviewer (push notification, Teams, or email), and waits for a response via callback or polling. No separate agent is required; the orchestrator handles the lifecycle.

Both patterns are valid — choose based on architecture, scaling, and separation-of-concerns.

When to call a human

  • Validation record status = fail (block downstream side-effects).
  • Validation status = warn with high-impact trigger (regulatory, finance, legal).
  • High-risk triggers (e.g., send-to-all, release notes, policy changes).
  • Manual sampling for QA and ongoing model calibration.
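
The conditions above can be encoded as a small policy function in the orchestrator. The sketch below is illustrative only; ValidationStatus, ValidationRecord, TriggerInfo, and the IsHighImpact/IsHighRisk flags are hypothetical stand-ins for whatever models the orchestration already uses.

// Illustrative policy check for the "when to call a human" rules above.
public enum ValidationStatus { Pass, Warn, Fail }
public record ValidationRecord(ValidationStatus Status, string CorrelationId);
public record TriggerInfo(string TriggerId, bool IsHighImpact, bool IsHighRisk);

public static class ReviewPolicy
{
    public static bool RequiresHumanReview(ValidationRecord vr, TriggerInfo trigger,
                                           double samplingRate, Random rng)
    {
        if (vr.Status == ValidationStatus.Fail) return true;                         // block downstream side-effects
        if (vr.Status == ValidationStatus.Warn && trigger.IsHighImpact) return true; // regulatory, finance, legal
        if (trigger.IsHighRisk) return true;                                         // send-to-all, release notes, policy changes
        return rng.NextDouble() < samplingRate;                                      // e.g. 0.05 to route 5% to QA sampling
    }
}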

Implementation guide — common building blocks

  • Task envelope: extend your A2A envelope/control metadata with trigger_id, review_type, review_deadline, reviewer_role, callback_url, and escalate_after_seconds (see the sketch after this list).
  • Idempotency: every operation with side effects must include an idempotency_key so retries or double-submits are safe.
  • Audit trail: persist the entire envelope, producer result, validation record, reviewer decisions, and timestamps in an immutable audit store.
  • Escalation: define automatic escalation rules (no response -> escalate to manager or queue -> block after N hours).
  • UI: reviewers need the summary, cited evidence (snippets + links), confidence scores, and suggested remediation actions (approve, request-edit, redact, escalate).
  • Minimal UI elements: accept/approve, request-changes (text), annotate snippet (redact/highlight), re-run (invoke producer again), and final decision.
  • Credentials & access: reviewers must be authenticated (SSO) and authorized; the orchestrator must validate that the reviewer has at least the same access as the original requester.
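
As referenced in the task-envelope bullet above, a minimal sketch of the review envelope as a C# record. The field names mirror that bullet; the record shape and types are assumptions rather than a fixed A2A schema.

// Sketch of the extended A2A review envelope/control metadata.
public record ReviewEnvelope(
    string TaskId,
    string CorrelationId,
    string TriggerId,
    string ReviewType,              // e.g. "approval", "remediation", "qa-sample"
    DateTimeOffset ReviewDeadline,
    string ReviewerRole,
    string CallbackUrl,
    int EscalateAfterSeconds,       // drives the escalation rules described above
    string IdempotencyKey);         // makes retries and double-submits safe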

Agent-mediated (Human Agent) — steps & sample flow

Pros

  • Clear separation of concerns: agents remain small and focused.
  • Reuse: the Human Agent can provide consistent UI/UX across orchestrations.
  • Easier to scale the review channel separately.
  • Can offer richer collaboration features (comments, threaded discussion).

Cons

  • Additional service to deploy and secure.
  • Extra network hop and complexity in tracing.

Flow

  1. Orchestrator composes the review envelope with producer result + validation record + trigger meta.
  2. Orchestrator delegates to the Human Agent via A2A (invoke endpoint) with response_mode: async and callback_url set (see the sketch after this flow).
  3. Human Agent displays UI, waits for reviewer action.
  4. Reviewer chooses action; Human Agent posts decision to callback_url or calls the orchestrator’s resume endpoint.
  5. Orchestrator resumes workflow and records the decision.
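
A sketch of step 2, assuming an HTTP-based A2A transport. The /api/invoke endpoint and payload shape are placeholders rather than the actual A2A contract, and ReviewEnvelope is the hypothetical record sketched earlier.

using System.Net.Http.Json; // PostAsJsonAsync

public async Task DelegateToHumanAgentAsync(HttpClient http, ReviewEnvelope envelope)
{
    var request = new
    {
        task = envelope,
        response_mode = "async",            // Human Agent responds later, not inline
        callback_url = envelope.CallbackUrl // orchestrator resume endpoint
    };

    // The Human Agent only acknowledges receipt here; the reviewer decision
    // arrives asynchronously via callback_url (or the orchestrator's resume endpoint).
    using var response = await http.PostAsJsonAsync("https://human-agent/api/invoke", request);
    response.EnsureSuccessStatusCode();
}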

Orchestrator-mediated (Direct) — steps & sample flow

Pros

  • Fewer moving parts and easier to reason about.
  • Lower deployment surface (no separate Human Agent service).
  • Potentially lower latency for simple reviews.

Cons

  • The orchestrator grows larger and more complex.
  • Harder to reuse review UI across domains.
  • Scaling review workflows may be harder inside a monolith.

Flow

  1. Orchestrator stores producer result + evidence and creates a review_task record (DB/queue).
  2. Orchestrator notifies reviewers (Teams message, email, in-app) containing a secure review link.
  3. Review UI (a small service or server-side page) fetches the review task and posts decision to orchestrator’s resume endpoint.
  4. Orchestrator resumes workflow based on decision.

Example review envelope (JSON)

{
  "task_id": "t-789",
  "correlation_id": "c-999",
  "producer": {"agent_id":"knowledge-agent","result_id":"r-34"},
  "validation_id": "val-123",
  "trigger_id": "summarize-project",
  "review_type": "approval",
  "review_deadline": "2025-11-05T12:00:00Z",
  "reviewer_role": "manager",
  "callback_url": "https://orchestrator/api/review/resume",
  "ui_hint": {"show_snippets": true, "highlight_terms": ["decision","deadline"]}
}

C# orchestrator sketch — create review task and wait for callback

// Simplified: create a review task and wait for a callback via a TaskCompletionSource keyed by taskId.
// Requires using System.Collections.Concurrent, System.Threading, System.Threading.Tasks, and Microsoft.AspNetCore.Mvc.
public class ReviewTaskRecord { public string TaskId; public string CorrelationId; public string CallbackUrl; /*...*/ }

private static readonly ConcurrentDictionary<string, TaskCompletionSource<ReviewDecision>> _pending = new();

public async Task<ReviewDecision> RequestHumanReviewAsync(ProducerResult result, ValidationRecord vr, string[] reviewers, TimeSpan timeout)
{
    var taskId = Guid.NewGuid().ToString();
    var reviewRecord = new ReviewTaskRecord { TaskId = taskId, CorrelationId = vr.CorrelationId, CallbackUrl = "https://orchestrator/api/review/resume" };
    // persist reviewRecord to DB

    // notify reviewers (Teams/email) with secure link that contains taskId and short-lived token
    NotifyReviewers(reviewers, taskId);

    var tcs = new TaskCompletionSource<ReviewDecision>(TaskCreationOptions.RunContinuationsAsynchronously);
    _pending[taskId] = tcs;

    using (var cts = new CancellationTokenSource(timeout))
    {
        using (cts.Token.Register(() => tcs.TrySetCanceled()))
        {
            try { return await tcs.Task.ConfigureAwait(false); }
            finally { _pending.TryRemove(taskId, out _); }
        }
    }
}

// Resume endpoint called by UI/HumanAgent
[HttpPost("/api/review/resume")]
public IActionResult ResumeReview([FromBody] ReviewCallback payload)
{
    // validate token, reviewer identity, permissions
    if (_pending.TryGetValue(payload.TaskId, out var tcs))
    {
        tcs.TrySetResult(payload.Decision);
        return Ok();
    }
    // if not pending, persist decision and handle async/late responses (audit)
    PersistDecision(payload);
    return Accepted();
}

Notes

  • Use a durable queue or DB for review tasks so late responses can still be processed after the in-memory TaskCompletionSource has expired.
  • Use short-lived tokens in review links to ensure only authorized reviewers can access a task (a minimal token sketch follows these notes).
  • Consider implementing an audit-only mode where decisions are logged but not blocking.
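
A minimal sketch of a short-lived, HMAC-signed review-link token carrying the task id and an expiry. The token format and secret handling are assumptions; a real deployment might instead rely on SSO-issued tokens.

using System;
using System.Security.Cryptography;
using System.Text;

public static class ReviewTokens
{
    // Token = base64(taskId|expiresUnixSeconds) + "." + base64(HMAC-SHA256 over the payload).
    public static string Create(string taskId, TimeSpan lifetime, byte[] secret)
    {
        long expires = DateTimeOffset.UtcNow.Add(lifetime).ToUnixTimeSeconds();
        string payload = $"{taskId}|{expires}";
        using var hmac = new HMACSHA256(secret);
        string sig = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(payload)));
        return $"{Convert.ToBase64String(Encoding.UTF8.GetBytes(payload))}.{sig}";
    }

    // Returns false for malformed, tampered, or expired tokens.
    public static bool TryValidate(string token, byte[] secret, out string taskId)
    {
        taskId = string.Empty;
        var parts = token.Split('.');
        if (parts.Length != 2) return false;
        string payload = Encoding.UTF8.GetString(Convert.FromBase64String(parts[0]));
        using var hmac = new HMACSHA256(secret);
        byte[] expected = hmac.ComputeHash(Encoding.UTF8.GetBytes(payload));
        if (!CryptographicOperations.FixedTimeEquals(Convert.FromBase64String(parts[1]), expected)) return false;
        var fields = payload.Split('|');
        if (fields.Length != 2 || !long.TryParse(fields[1], out var exp)) return false;
        if (DateTimeOffset.UtcNow.ToUnixTimeSeconds() > exp) return false;
        taskId = fields[0];
        return true;
    }
}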

Implementation areas & pros/cons (summary)

  • Latency
    • Agent-mediated: higher latency (extra hop) but supports richer collaboration.
    • Orchestrator-direct: lower latency for basic flows.
  • Complexity & maintenance
    • Agent-mediated: an additional service to operate and secure.
    • Orchestrator-direct: added complexity in the orchestrator codebase.
  • Reuse & UX
    • Agent-mediated: reusable review UI for multiple orchestrations.
    • Orchestrator-direct: UI tends to be bespoke per orchestration unless centralized.
  • Security
    • Both require secure tokens and authorization; agent-mediated can centralize reviewer auth flows.
  • Scalability
    • Agent-mediated: can scale independently (worker pools, autoscale).
    • Orchestrator-direct: can become a bottleneck if the orchestrator handles large review volumes.
  • Observability & auditing
    • Agent-mediated: the Human Agent can provide a rich audit UI and collaboration artifacts.
    • Orchestrator-direct: these features must be built into the orchestrator or a separate review UI service.

UX suggestions

  • Provide contextual evidence: snippets, original links, validation scores, and exact claims highlighted.
  • Allow inline edits or suggestions (not just binary approve/reject).
  • Record reviewer rationale (free text) for audits.
  • Show history and previous validations for the same document/claim.
  • Support bulk-approve for low-risk tasks to reduce reviewer fatigue.

Security & compliance

  • Enforce least-privilege tokens for the review UI. The UI should not fetch sources with broader scopes than the original request.
  • Redact high-sensitivity fields by default; provide privileged reviewers with an explicit “show redacted” flow that is audited.
  • Keep the evidence bundle tamper-evident: store SHA-256 hashes and signatures.
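
A minimal sketch of the tamper-evidence step: hash the serialized evidence bundle and store the digest alongside the audit record. Signing is omitted, and how the digest is persisted is an assumption left to the deployment.

using System;
using System.Security.Cryptography;

public static class EvidenceIntegrity
{
    // Returns a hex-encoded SHA-256 digest of the serialized evidence bundle.
    public static string ComputeEvidenceHash(byte[] evidenceBundle)
    {
        using var sha = SHA256.Create();
        return Convert.ToHexString(sha.ComputeHash(evidenceBundle));
    }
}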

Metrics to track

  • Time-to-review (p50/p95)
  • Review pass/warn/fail rates
  • Number of re-runs requested by reviewers
  • Reviewer workload and queue length
  • Actions blocked by human decision (prevented side-effects)
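
One possible way to emit these metrics from the orchestrator, using System.Diagnostics.Metrics; the meter and instrument names are placeholders, not a required convention.

using System.Diagnostics.Metrics;

public static class ReviewMetrics
{
    private static readonly Meter ReviewMeter = new("Orchestrator.HumanReview");

    public static readonly Histogram<double> TimeToReviewSeconds =
        ReviewMeter.CreateHistogram<double>("review.time_to_review", unit: "s");
    public static readonly Counter<long> Decisions =
        ReviewMeter.CreateCounter<long>("review.decisions");       // tag with pass/warn/fail and approve/reject/edit
    public static readonly Counter<long> ReRuns =
        ReviewMeter.CreateCounter<long>("review.reruns");          // re-runs requested by reviewers
    public static readonly Counter<long> BlockedActions =
        ReviewMeter.CreateCounter<long>("review.blocked_actions"); // side-effects prevented by a human decision
}

// Example: ReviewMetrics.TimeToReviewSeconds.Record(elapsed.TotalSeconds,
//          new KeyValuePair<string, object?>("decision", "approve"));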

Quick checklist before production

  • Design review UI and secure it with SSO + RBAC
  • Define task envelope and callback contract
  • Add idempotency keys and audit trail for every side-effect
  • Decide agent-mediated vs orchestrator-direct based on expected scale and reuse
  • Implement escalation & SLA rules
  • Add metrics and alerts for stalled reviews

Human-in-the-loop flow (diagram)

The diagram below shows where a human reviewer is inserted, how the review task is created and resumed, and how approval or changes influence the next step of the orchestration.

[Flowchart: swimlanes for Producers (Knowledge Agent), Validation (Validation Agent), Human Review Channel (Human Reviewer), and Action Agents (Outlook Agent). User Request → Orchestrator → Knowledge Agent (query + retrieve, answer + citations) → Validation Agent (validate evidence, ValidationRecord: pass/warn/fail) → Orchestrator checks "Validation status?". On pass: Outlook Agent → call Graph/API → External System. On warn/fail: create review task → Human Reviewer → approve / request changes / reject, with separate continuation paths on approve, on changes, and on reject.]

If you prefer a sequence view of the same flow:

[Sequence diagram: User → Orchestrator: Request (intent). Orchestrator → Knowledge Agent: Retrieve + Summarize; Knowledge Agent → Orchestrator: Answer + Citations. Orchestrator → Validation Agent: Validate(answer, evidence); Validation Agent → Orchestrator: ValidationRecord {status}. alt status == pass: Orchestrator → Outlook Agent: Compose/Send/Invite; Outlook Agent → External System: API Calls; External System → Outlook Agent: Result; Outlook Agent → Orchestrator: Confirmation. alt status == warn/fail: Orchestrator → Human Reviewer: Review Task (approve/reject/edit); Human Reviewer → Orchestrator: Decision; opt edit: Re-run or request changes. Orchestrator → User: Final response + audit id.]