Human-in-the-Loop — design patterns and implementation guidance
Human-in-the-Loop — when and how to involve people
Bringing a human reviewer into AI orchestrations improves safety, correctness, and trust. Humans can be introduced as an explicit approval gate, a remediation step for warnings/fails, or a quality-sampling reviewer. This page covers two main integration patterns and practical implementation guidance, followed by suggestions and trade-offs.
Two integration patterns
- Agent-mediated human-in-the-loop (Human-Agent)
- A dedicated Human Agent acts as a mediation layer. The orchestrator delegates a review task to the Human Agent using the A2A protocol. The Human Agent presents a UI (web/Teams/email) to the reviewer, collects an action (approve/reject/edit), and returns structured results to the orchestrator.
- Orchestrator-mediated human-in-the-loop (Orchestrator-Direct)
- The orchestrator itself creates a human-review task, notifies the reviewer (push notification, Teams, or email), and waits for a response via callback or polling. No separate agent is required; the orchestrator handles the lifecycle.
Both patterns are valid; choose based on your architecture, scaling needs, and separation of concerns.
When to call a human
- Validation record status = fail (block downstream side-effects).
- Validation status = warn with high-impact trigger (regulatory, finance, legal).
- High-risk triggers (e.g., send-to-all, release notes, policy changes).
- Manual sampling for QA and ongoing model calibration.
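These rules can be expressed as a small gating predicate in the orchestrator. The sketch below is illustrative only: the type names, the high-risk trigger list, and the sampling mechanism are assumptions, not part of any specific framework.
using System;
using System.Collections.Generic;

// Illustrative types: ValidationStatus mirrors the validation record's pass/warn/fail outcome,
// TriggerInfo carries the trigger metadata the orchestrator already holds.
public enum ValidationStatus { Pass, Warn, Fail }
public record TriggerInfo(string TriggerId, bool HighImpact, bool HighRisk);

public static class ReviewGate
{
    // Assumed list of triggers that always require a human before side-effects run.
    private static readonly HashSet<string> HighRiskTriggers =
        new(StringComparer.OrdinalIgnoreCase) { "send-to-all", "release-notes", "policy-change" };

    public static bool RequiresHumanReview(ValidationStatus status, TriggerInfo trigger, double samplingRate, Random rng)
    {
        if (status == ValidationStatus.Fail) return true;                        // fail: block downstream side-effects
        if (status == ValidationStatus.Warn && trigger.HighImpact) return true;  // warn + regulatory/finance/legal impact
        if (trigger.HighRisk || HighRiskTriggers.Contains(trigger.TriggerId)) return true;
        return rng.NextDouble() < samplingRate;                                  // manual QA sampling
    }
}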
Implementation guide — common building blocks
- Task envelope: extend your A2A envelope/control metadata with: trigger_id, review_type, review_deadline, reviewer_role, callback_url, escalate_after_seconds (see the sketch after this list).
- Idempotency: all side-effectful operations must include idempotency_key so retries or double-submits are safe.
- Audit trail: persist the entire envelope, producer result, validation record, reviewer decisions, and timestamps in an immutable audit store.
- Escalation: define automatic escalation rules (no response -> escalate to manager or queue -> block after N hours).
- UI: reviewers need the summary, cited evidence (snippets + links), confidence scores, and suggested remediation actions (approve, request-edit, redact, escalate).
- Minimal UI elements: accept/approve, request-changes (text), annotate snippet (redact/highlight), re-run (invoke producer again), and final decision.
- Credentials & access: reviewers must be authenticated (SSO) and authorized; the orchestrator must validate that the reviewer had the same or broader access as the original requester.
Agent-mediated (Human Agent) — steps & sample flow
Pros
- Clear separation of concerns: agents remain small and focused.
- Reuse: the Human Agent can provide consistent UI/UX across orchestrations.
- Easier to scale the review channel separately.
- Can offer richer collaboration features (comments, threaded discussion).
Cons
- Additional service to deploy and secure.
- Extra network hop and complexity in tracing.
Flow
- Orchestrator composes the review envelope with producer result + validation record + trigger meta.
- Orchestrator delegates to human-agent via A2A (invoke endpoint) with response_mode: async and callback_url set (see the sketch below).
- Human Agent displays the UI and waits for the reviewer's action.
- Reviewer chooses an action; the Human Agent posts the decision to callback_url or calls the orchestrator's resume endpoint.
- Orchestrator resumes the workflow and records the decision.
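A sketch of the delegation step, assuming the Human Agent exposes a JSON-over-HTTP invoke endpoint; the endpoint path, payload shape, and the HumanAgentClient type are illustrative rather than a normative A2A contract.
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

public class HumanAgentClient
{
    private readonly HttpClient _http;
    public HumanAgentClient(HttpClient http) => _http = http;

    // Delegate a review task to the Human Agent without blocking on the reviewer:
    // the Human Agent calls back on callback_url once the reviewer has acted.
    public async Task DelegateReviewAsync(object reviewEnvelope, string callbackUrl)
    {
        var request = new
        {
            task = reviewEnvelope,            // producer result + validation record + trigger meta
            response_mode = "async",          // reviewer latency is unbounded, so never wait synchronously
            callback_url = callbackUrl        // orchestrator resume endpoint
        };
        var response = await _http.PostAsJsonAsync("https://human-agent/api/invoke", request);
        response.EnsureSuccessStatusCode();   // delegation accepted; the decision arrives later via the callback
    }
}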
Orchestrator-mediated (Direct) — steps & sample flow
Pros
- Fewer moving parts and easier to reason about.
- Lower deployment surface (no separate Human Agent service).
- Potentially lower latency for simple reviews.
Cons
- The orchestrator grows larger and more complex.
- Harder to reuse review UI across domains.
- Scaling review workflows may be harder inside a monolith.
Flow
- Orchestrator stores producer result + evidence and creates a review_task record (DB/queue).
- Orchestrator sends reviewers a notification (Teams message, email, in-app) containing a secure review link.
- Review UI (a small service or server-side page) fetches the review task and posts decision to orchestrator’s resume endpoint.
- Orchestrator resumes workflow based on decision.
Recommended message/task shape for human review (JSON)
{
"task_id": "t-789",
"correlation_id": "c-999",
"producer": {"agent_id":"knowledge-agent","result_id":"r-34"},
"validation_id": "val-123",
"trigger_id": "summarize-project",
"review_type": "approval",
"review_deadline": "2025-11-05T12:00:00Z",
"reviewer_role": "manager",
"callback_url": "https://orchestrator/api/review/resume",
"ui_hint": {"show_snippets": true, "highlight_terms": ["decision","deadline"]}
}
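If both the orchestrator and the review UI are .NET services, the same shape can be deserialized into a small record; a possible mapping (type names are assumptions) is shown below.
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public record ProducerRef(
    [property: JsonPropertyName("agent_id")] string AgentId,
    [property: JsonPropertyName("result_id")] string ResultId);

public record UiHint(
    [property: JsonPropertyName("show_snippets")] bool ShowSnippets,
    [property: JsonPropertyName("highlight_terms")] List<string> HighlightTerms);

public record HumanReviewTask(
    [property: JsonPropertyName("task_id")] string TaskId,
    [property: JsonPropertyName("correlation_id")] string CorrelationId,
    [property: JsonPropertyName("producer")] ProducerRef Producer,
    [property: JsonPropertyName("validation_id")] string ValidationId,
    [property: JsonPropertyName("trigger_id")] string TriggerId,
    [property: JsonPropertyName("review_type")] string ReviewType,
    [property: JsonPropertyName("review_deadline")] DateTimeOffset ReviewDeadline,
    [property: JsonPropertyName("reviewer_role")] string ReviewerRole,
    [property: JsonPropertyName("callback_url")] string CallbackUrl,
    [property: JsonPropertyName("ui_hint")] UiHint UiHint);

// Usage: var reviewTask = JsonSerializer.Deserialize<HumanReviewTask>(json);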
C# orchestrator sketch — create review task and wait for callback
// Simplified: create a review task and wait for a callback via a TaskCompletionSource keyed by task_id.
// Requires: System, System.Collections.Concurrent, System.Threading, System.Threading.Tasks, Microsoft.AspNetCore.Mvc.
// ReviewDecision and ReviewCallback are illustrative DTOs for the reviewer's decision and the callback payload.
public class ReviewTaskRecord { public string TaskId; public string CorrelationId; public string CallbackUrl; /*...*/ }
// Pending reviews keyed by task_id; completed when the resume endpoint receives the reviewer's decision.
private static readonly ConcurrentDictionary<string, TaskCompletionSource<ReviewDecision>> _pending = new();
public async Task<ReviewDecision> RequestHumanReviewAsync(ProducerResult result, ValidationRecord vr, string[] reviewers, TimeSpan timeout)
{
var taskId = Guid.NewGuid().ToString();
var reviewRecord = new ReviewTaskRecord { TaskId = taskId, CorrelationId = vr.CorrelationId, CallbackUrl = "https://orchestrator/api/review/resume" };
// persist reviewRecord to a durable store (DB) so late responses can still be correlated
// register the pending TaskCompletionSource before notifying reviewers so an immediate response cannot be missed
var tcs = new TaskCompletionSource<ReviewDecision>(TaskCreationOptions.RunContinuationsAsynchronously);
_pending[taskId] = tcs;
// notify reviewers (Teams/email) with a secure link that contains taskId and a short-lived token
NotifyReviewers(reviewers, taskId);
using (var cts = new CancellationTokenSource(timeout))
{
using (cts.Token.Register(() => tcs.TrySetCanceled()))
{
try { return await tcs.Task.ConfigureAwait(false); }
finally { _pending.TryRemove(taskId, out _); }
}
}
}
// Resume endpoint called by UI/HumanAgent
[HttpPost("/api/review/resume")]
public IActionResult ResumeReview([FromBody] ReviewCallback payload)
{
// validate token, reviewer identity, permissions
if (_pending.TryGetValue(payload.TaskId, out var tcs))
{
tcs.TrySetResult(payload.Decision);
return Ok();
}
// if not pending, persist decision and handle async/late responses (audit)
PersistDecision(payload);
return Accepted();
}
Notes
- Use a durable queue or DB for review tasks so late responses can still be processed after the in-memory TaskCompletionSource has expired.
- Use short-lived tokens in review links so that only authorized reviewers can access a task (a signing sketch follows these notes).
- Consider implementing an audit-only mode where decisions are logged but not blocking.
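One possible implementation of the short-lived review-link token is an HMAC-signed value with an embedded expiry, sketched below; key management, the token format, and URL encoding are assumptions and would normally be handled by your identity platform.
using System;
using System.Security.Cryptography;
using System.Text;

public static class ReviewLinkToken
{
    // Bind the task id to an expiry and sign with a server-side secret.
    // URL-encode the result before embedding it in the review link.
    public static string Create(string taskId, TimeSpan lifetime, byte[] secret)
    {
        long expiresAt = DateTimeOffset.UtcNow.Add(lifetime).ToUnixTimeSeconds();
        string payload = $"{taskId}|{expiresAt}";
        using var hmac = new HMACSHA256(secret);
        string signature = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(payload)));
        return $"{payload}|{signature}";
    }

    // Verify the signature and expiry before letting the reviewer open the task.
    public static bool TryValidate(string token, byte[] secret, out string taskId)
    {
        taskId = string.Empty;
        var parts = token.Split('|');
        if (parts.Length != 3 || !long.TryParse(parts[1], out long expiresAt)) return false;
        if (DateTimeOffset.UtcNow.ToUnixTimeSeconds() > expiresAt) return false;

        using var hmac = new HMACSHA256(secret);
        string expected = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes($"{parts[0]}|{parts[1]}")));
        if (!CryptographicOperations.FixedTimeEquals(Encoding.UTF8.GetBytes(expected), Encoding.UTF8.GetBytes(parts[2])))
            return false;

        taskId = parts[0];
        return true;
    }
}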
Implementation areas & pros/cons (summary table)
-
Latency
- Agent-mediated: higher latency (extra hop) but supports richer collaboration.
- Orchestrator-direct: lower latency for basic flows.
-
Complexity & maintenance
- Agent-mediated: +service to operate and secure.
- Orchestrator-direct: +complexity in orchestrator codebase.
-
Reuse & UX
- Agent-mediated: reusable review UI for multiple orchestrations.
- Orchestrator-direct: UI tends to be bespoke per orchestration unless centralized.
-
Security
- Both require secure tokens and authorization; agent-mediated can centralize reviewer auth flows.
-
Scalability
- Agent-mediated: can scale independently (worker pools, autoscale).
- Orchestrator-direct: can become a bottleneck if orchestrator handles large review volumes.
-
Observability & auditing
- Agent-mediated: human agent can provide rich audit UI and collaboration artifacts.
- Orchestrator-direct: must build these features into orchestrator or a separate review UI service.
UX suggestions
- Provide contextual evidence: snippets, original links, validation scores, and exact claims highlighted.
- Allow inline edits or suggestions (not just binary approve/reject).
- Record reviewer rationale (free text) for audits.
- Show history and previous validations for the same document/claim.
- Support bulk-approve for low-risk tasks to reduce reviewer fatigue.
Security & compliance
- Enforce least-privilege tokens for the review UI. The UI should not fetch sources with broader scopes than the original request.
- Redact high-sensitivity fields by default; provide privileged reviewers with an explicit “show redacted” flow that is audited.
- Keep evidence bundle tamper-evident: store SHA256 hashes and signatures.
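A sketch of the hashing half of that tamper-evidence requirement (uses .NET 5+ hashing helpers); signing the digest with an asymmetric key would follow the same pattern and is omitted here.
using System;
using System.Security.Cryptography;
using System.Text;

public static class EvidenceIntegrity
{
    // Compute the SHA-256 digest of the serialized evidence bundle at write time
    // and persist it alongside the bundle in the audit store.
    public static string ComputeHash(string serializedEvidence) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(serializedEvidence)));

    // Re-verify before showing the evidence to a reviewer or auditor.
    public static bool Verify(string serializedEvidence, string storedHash) =>
        string.Equals(ComputeHash(serializedEvidence), storedHash, StringComparison.OrdinalIgnoreCase);
}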
Metrics to track
- Time-to-review (p50/p95)
- Review pass/warn/fail rates
- Number of re-runs requested by reviewers
- Reviewer workload and queue length
- Actions blocked by human decision (prevented side-effects)
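These can be emitted with any metrics pipeline; the sketch below uses System.Diagnostics.Metrics (.NET 6+), and the instrument names are assumptions.
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class ReviewMetrics
{
    private static readonly Meter ReviewMeter = new("orchestrator.human-review");

    // Time-to-review: record once per completed review; p50/p95 are computed by the metrics backend.
    private static readonly Histogram<double> TimeToReviewSeconds =
        ReviewMeter.CreateHistogram<double>("review.time_to_review.seconds");

    // Decisions by outcome (approve/request-edit/reject, pass/warn/fail); tagged with the decision value.
    private static readonly Counter<long> Decisions = ReviewMeter.CreateCounter<long>("review.decisions");

    // Re-runs requested by reviewers.
    public static readonly Counter<long> RerunsRequested = ReviewMeter.CreateCounter<long>("review.reruns_requested");

    // Reviewer workload and queue length can be exposed the same way via CreateObservableGauge.
    public static void RecordDecision(TimeSpan timeToReview, string decision)
    {
        TimeToReviewSeconds.Record(timeToReview.TotalSeconds);
        Decisions.Add(1, new KeyValuePair<string, object?>("decision", decision));
    }
}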
Quick checklist before production
- Design review UI and secure it with SSO + RBAC
- Define task envelope and callback contract
- Add idempotency keys and audit trail for every side-effect
- Decide agent-mediated vs orchestrator-direct based on expected scale and reuse
- Implement escalation & SLA rules
- Add metrics and alerts for stalled reviews
Human-in-the-loop flow (diagram)
The diagram below shows where a human reviewer is inserted, how the review task is created and resumed, and how approval or changes influence the next step of the orchestration.
If you prefer a sequence view of the same flow: