Human-in-the-Loop — design patterns and implementation guidance
Human-in-the-Loop — when and how to involve people
Bringing a human reviewer into AI orchestrations improves safety, correctness, and trust. Humans can be introduced as an explicit approval gate, a remediation step for warnings/fails, or a quality-sampling reviewer. This page covers two main integration patterns and practical implementation guidance, followed by suggestions and trade-offs.
Two integration patterns
- Agent-mediated human-in-the-loop (Human-Agent)
- A dedicated Human Agent acts as a mediation layer. The orchestrator delegates a review task to the Human Agent using the A2A protocol. The Human Agent presents a UI (web/Teams/email) to the reviewer, collects an action (approve/reject/edit), and returns structured results to the orchestrator.
- Orchestrator-mediated human-in-the-loop (Orchestrator-Direct)
- The orchestrator itself creates a human-review task, notifies the reviewer (push notification, Teams, or email), and waits for a response via callback or polling. No separate agent is required; the orchestrator handles the lifecycle.
Both patterns are valid; choose based on your architecture, scaling needs, and separation of concerns.
When to call a human
- Validation record status = fail (block downstream side-effects).
- Validation status = warn with high-impact trigger (regulatory, finance, legal).
- High-risk triggers (e.g., send-to-all, release notes, policy changes).
- Manual sampling for QA and ongoing model calibration.
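These rules can be expressed as a small gating predicate in the orchestrator. The sketch below is illustrative only: the type names, the high-risk trigger list, and the sampling mechanism are assumptions, not part of any specific framework.
using System;
using System.Collections.Generic;

// Illustrative types: ValidationStatus mirrors the validation record's pass/warn/fail outcome,
// TriggerInfo carries the trigger metadata the orchestrator already holds.
public enum ValidationStatus { Pass, Warn, Fail }
public record TriggerInfo(string TriggerId, bool HighImpact, bool HighRisk);

public static class ReviewGate
{
    // Assumed list of triggers that always require a human before side-effects run.
    private static readonly HashSet<string> HighRiskTriggers =
        new(StringComparer.OrdinalIgnoreCase) { "send-to-all", "release-notes", "policy-change" };

    public static bool RequiresHumanReview(ValidationStatus status, TriggerInfo trigger, double samplingRate, Random rng)
    {
        if (status == ValidationStatus.Fail) return true;                        // fail: block downstream side-effects
        if (status == ValidationStatus.Warn && trigger.HighImpact) return true;  // warn + regulatory/finance/legal impact
        if (trigger.HighRisk || HighRiskTriggers.Contains(trigger.TriggerId)) return true;
        return rng.NextDouble() < samplingRate;                                  // manual QA sampling
    }
}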
Implementation guide — common building blocks
- Task envelope: extend your A2A envelope/control metadata with: trigger_id, review_type, review_deadline, reviewer_role, callback_url, escalate_after_seconds (see the sketch after this list).
- Idempotency: all side-effectful operations must include idempotency_key so retries or double-submits are safe.
- Audit trail: persist the entire envelope, producer result, validation record, reviewer decisions, and timestamps in an immutable audit store.
- Escalation: define automatic escalation rules (no response -> escalate to manager or queue -> block after N hours).
- UI: reviewers need the summary, cited evidence (snippets + links), confidence scores, and suggested remediation actions (approve, request-edit, redact, escalate).
- Minimal UI elements: accept/approve, request-changes (text), annotate snippet (redact/highlight), re-run (invoke producer again), and final decision.
- Credentials & access: reviewers must be authenticated (SSO) and authorized; the orchestrator must validate that the reviewer had the same or broader access as the original requester.
Agent-mediated (Human Agent) — steps & sample flow
Pros
- Clear separation of concerns: agents remain small and focused.
- Reuse: the Human Agent can provide consistent UI/UX across orchestrations.
- Easier to scale the review channel separately.
- Can offer richer collaboration features (comments, threaded discussion).
Cons
- Additional service to deploy and secure.
- Extra network hop and complexity in tracing.
Flow
- Orchestrator composes the review envelope with producer result + validation record + trigger meta.
- Orchestrator delegates to human-agent via A2A (invoke endpoint) with response_mode: async and callback_url set (see the sketch below).
- Human Agent displays the UI and waits for the reviewer's action.
- Reviewer chooses an action; the Human Agent posts the decision to callback_url or calls the orchestrator's resume endpoint.
- Orchestrator resumes the workflow and records the decision.
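A sketch of the delegation step, assuming the Human Agent exposes a JSON-over-HTTP invoke endpoint; the endpoint path, payload shape, and the HumanAgentClient type are illustrative rather than a normative A2A contract.
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

public class HumanAgentClient
{
    private readonly HttpClient _http;
    public HumanAgentClient(HttpClient http) => _http = http;

    // Delegate a review task to the Human Agent without blocking on the reviewer:
    // the Human Agent calls back on callback_url once the reviewer has acted.
    public async Task DelegateReviewAsync(object reviewEnvelope, string callbackUrl)
    {
        var request = new
        {
            task = reviewEnvelope,            // producer result + validation record + trigger meta
            response_mode = "async",          // reviewer latency is unbounded, so never wait synchronously
            callback_url = callbackUrl        // orchestrator resume endpoint
        };
        var response = await _http.PostAsJsonAsync("https://human-agent/api/invoke", request);
        response.EnsureSuccessStatusCode();   // delegation accepted; the decision arrives later via the callback
    }
}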
Orchestrator-mediated (Direct) — steps & sample flow
Pros
- Fewer moving parts and easier to reason about.
- Lower deployment surface (no separate Human Agent service).
- Potentially lower latency for simple reviews.
Cons
- The orchestrator grows larger and more complex.
- Harder to reuse review UI across domains.
- Scaling review workflows may be harder inside a monolith.
Flow
- Orchestrator stores producer result + evidence and creates a review_task record (DB/queue).
- Orchestrator sends reviewers a notification (Teams message, email, in-app) containing a secure review link.
- Review UI (a small service or server-side page) fetches the review task and posts decision to orchestrator’s resume endpoint.
- Orchestrator resumes workflow based on decision.
Recommended message/task shape for human review (JSON)
{
"task_id": "t-789",
"correlation_id": "c-999",
"producer": {"agent_id":"knowledge-agent","result_id":"r-34"},
"validation_id": "val-123",
"trigger_id": "summarize-project",
"review_type": "approval",
"review_deadline": "2025-11-05T12:00:00Z",
"reviewer_role": "manager",
"callback_url": "https://orchestrator/api/review/resume",
"ui_hint": {"show_snippets": true, "highlight_terms": ["decision","deadline"]}
}
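If both the orchestrator and the review UI are .NET services, the same shape can be deserialized into a small record; a possible mapping (type names are assumptions) is shown below.
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public record ProducerRef(
    [property: JsonPropertyName("agent_id")] string AgentId,
    [property: JsonPropertyName("result_id")] string ResultId);

public record UiHint(
    [property: JsonPropertyName("show_snippets")] bool ShowSnippets,
    [property: JsonPropertyName("highlight_terms")] List<string> HighlightTerms);

public record HumanReviewTask(
    [property: JsonPropertyName("task_id")] string TaskId,
    [property: JsonPropertyName("correlation_id")] string CorrelationId,
    [property: JsonPropertyName("producer")] ProducerRef Producer,
    [property: JsonPropertyName("validation_id")] string ValidationId,
    [property: JsonPropertyName("trigger_id")] string TriggerId,
    [property: JsonPropertyName("review_type")] string ReviewType,
    [property: JsonPropertyName("review_deadline")] DateTimeOffset ReviewDeadline,
    [property: JsonPropertyName("reviewer_role")] string ReviewerRole,
    [property: JsonPropertyName("callback_url")] string CallbackUrl,
    [property: JsonPropertyName("ui_hint")] UiHint UiHint);

// Usage: var reviewTask = JsonSerializer.Deserialize<HumanReviewTask>(json);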
C# orchestrator sketch — create review task and wait for callback
// Simplified: create a review task and wait for a callback via a TaskCompletionSource keyed by task_id.
// Requires: System, System.Collections.Concurrent, System.Threading, System.Threading.Tasks, Microsoft.AspNetCore.Mvc.
// ReviewDecision and ReviewCallback are illustrative DTOs for the reviewer's decision and the callback payload.
public class ReviewTaskRecord { public string TaskId; public string CorrelationId; public string CallbackUrl; /*...*/ }
// Pending reviews keyed by task_id; completed when the resume endpoint receives the reviewer's decision.
private static readonly ConcurrentDictionary<string, TaskCompletionSource<ReviewDecision>> _pending = new();
public async Task<ReviewDecision> RequestHumanReviewAsync(ProducerResult result, ValidationRecord vr, string[] reviewers, TimeSpan timeout)
{
var taskId = Guid.NewGuid().ToString();
var reviewRecord = new ReviewTaskRecord { TaskId = taskId, CorrelationId = vr.CorrelationId, CallbackUrl = "https://orchestrator/api/review/resume" };
// persist reviewRecord to a durable store (DB) so late responses can still be correlated
// register the pending TaskCompletionSource before notifying reviewers so an immediate response cannot be missed
var tcs = new TaskCompletionSource<ReviewDecision>(TaskCreationOptions.RunContinuationsAsynchronously);
_pending[taskId] = tcs;
// notify reviewers (Teams/email) with a secure link that contains taskId and a short-lived token
NotifyReviewers(reviewers, taskId);
using (var cts = new CancellationTokenSource(timeout))
{
using (cts.Token.Register(() => tcs.TrySetCanceled()))
{
try { return await tcs.Task.ConfigureAwait(false); }
finally { _pending.TryRemove(taskId, out _); }
}
}
}
// Resume endpoint called by UI/HumanAgent
[HttpPost("/api/review/resume")]
public IActionResult ResumeReview([FromBody] ReviewCallback payload)
{
// validate token, reviewer identity, permissions
if (_pending.TryGetValue(payload.TaskId, out var tcs))
{
tcs.TrySetResult(payload.Decision);
return Ok();
}
// if not pending, persist decision and handle async/late responses (audit)
PersistDecision(payload);
return Accepted();
}
Notes
- Use a durable queue or DB for review tasks so late responses can still be processed after the in-memory TaskCompletionSource has expired.
- Use short-lived tokens in review links so that only authorized reviewers can access a task (a signing sketch follows these notes).
- Consider implementing an audit-only mode where decisions are logged but not blocking.
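One possible implementation of the short-lived review-link token is an HMAC-signed value with an embedded expiry, sketched below; key management, the token format, and URL encoding are assumptions and would normally be handled by your identity platform.
using System;
using System.Security.Cryptography;
using System.Text;

public static class ReviewLinkToken
{
    // Bind the task id to an expiry and sign with a server-side secret.
    // URL-encode the result before embedding it in the review link.
    public static string Create(string taskId, TimeSpan lifetime, byte[] secret)
    {
        long expiresAt = DateTimeOffset.UtcNow.Add(lifetime).ToUnixTimeSeconds();
        string payload = $"{taskId}|{expiresAt}";
        using var hmac = new HMACSHA256(secret);
        string signature = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(payload)));
        return $"{payload}|{signature}";
    }

    // Verify the signature and expiry before letting the reviewer open the task.
    public static bool TryValidate(string token, byte[] secret, out string taskId)
    {
        taskId = string.Empty;
        var parts = token.Split('|');
        if (parts.Length != 3 || !long.TryParse(parts[1], out long expiresAt)) return false;
        if (DateTimeOffset.UtcNow.ToUnixTimeSeconds() > expiresAt) return false;

        using var hmac = new HMACSHA256(secret);
        string expected = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes($"{parts[0]}|{parts[1]}")));
        if (!CryptographicOperations.FixedTimeEquals(Encoding.UTF8.GetBytes(expected), Encoding.UTF8.GetBytes(parts[2])))
            return false;

        taskId = parts[0];
        return true;
    }
}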
Implementation areas & pros/cons (summary table)
-
Latency
- Agent-mediated: higher latency (extra hop) but supports richer collaboration.
- Orchestrator-direct: lower latency for basic flows.
-
Complexity & maintenance
- Agent-mediated: +service to operate and secure.
- Orchestrator-direct: +complexity in orchestrator codebase.
-
Reuse & UX
- Agent-mediated: reusable review UI for multiple orchestrations.
- Orchestrator-direct: UI tends to be bespoke per orchestration unless centralized.
-
Security
- Both require secure tokens and authorization; agent-mediated can centralize reviewer auth flows.
-
Scalability
- Agent-mediated: can scale independently (worker pools, autoscale).
- Orchestrator-direct: can become a bottleneck if orchestrator handles large review volumes.
-
Observability & auditing
- Agent-mediated: human agent can provide rich audit UI and collaboration artifacts.
- Orchestrator-direct: must build these features into orchestrator or a separate review UI service.
UX suggestions
- Provide contextual evidence: snippets, original links, validation scores, and exact claims highlighted.
- Allow inline edits or suggestions (not just binary approve/reject).
- Record reviewer rationale (free text) for audits.
- Show history and previous validations for the same document/claim.
- Support bulk-approve for low-risk tasks to reduce reviewer fatigue.
Security & compliance
- Enforce least-privilege tokens for the review UI. The UI should not fetch sources with broader scopes than the original request.
- Redact high-sensitivity fields by default; provide privileged reviewers with an explicit “show redacted” flow that is audited.
- Keep evidence bundle tamper-evident: store SHA256 hashes and signatures.
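A sketch of the hashing half of that tamper-evidence requirement (uses .NET 5+ hashing helpers); signing the digest with an asymmetric key would follow the same pattern and is omitted here.
using System;
using System.Security.Cryptography;
using System.Text;

public static class EvidenceIntegrity
{
    // Compute the SHA-256 digest of the serialized evidence bundle at write time
    // and persist it alongside the bundle in the audit store.
    public static string ComputeHash(string serializedEvidence) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(serializedEvidence)));

    // Re-verify before showing the evidence to a reviewer or auditor.
    public static bool Verify(string serializedEvidence, string storedHash) =>
        string.Equals(ComputeHash(serializedEvidence), storedHash, StringComparison.OrdinalIgnoreCase);
}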
Metrics to track
- Time-to-review (p50/p95)
- Review pass/warn/fail rates
- Number of re-runs requested by reviewers
- Reviewer workload and queue length
- Actions blocked by human decision (prevented side-effects)
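These can be emitted with any metrics pipeline; the sketch below uses System.Diagnostics.Metrics (.NET 6+), and the instrument names are assumptions.
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class ReviewMetrics
{
    private static readonly Meter ReviewMeter = new("orchestrator.human-review");

    // Time-to-review: record once per completed review; p50/p95 are computed by the metrics backend.
    private static readonly Histogram<double> TimeToReviewSeconds =
        ReviewMeter.CreateHistogram<double>("review.time_to_review.seconds");

    // Decisions by outcome (approve/request-edit/reject, pass/warn/fail); tagged with the decision value.
    private static readonly Counter<long> Decisions = ReviewMeter.CreateCounter<long>("review.decisions");

    // Re-runs requested by reviewers.
    public static readonly Counter<long> RerunsRequested = ReviewMeter.CreateCounter<long>("review.reruns_requested");

    // Reviewer workload and queue length can be exposed the same way via CreateObservableGauge.
    public static void RecordDecision(TimeSpan timeToReview, string decision)
    {
        TimeToReviewSeconds.Record(timeToReview.TotalSeconds);
        Decisions.Add(1, new KeyValuePair<string, object?>("decision", decision));
    }
}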
Quick checklist before production
- Design review UI and secure it with SSO + RBAC
- Define task envelope and callback contract
- Add idempotency keys and audit trail for every side-effect
- Decide agent-mediated vs orchestrator-direct based on expected scale and reuse
- Implement escalation & SLA rules
- Add metrics and alerts for stalled reviews
Human-in-the-loop flow (diagram)
The diagram below shows where a human reviewer is inserted, how the review task is created and resumed, and how approval or changes influence the next step of the orchestration.
If you prefer a sequence view of the same flow: