Issue #16: Production-Grade AI Agents with Microsoft Agent Framework and Deterministic Guardrails

9 min read | February 21, 2026

Most AI agent examples focus on capability: an agent can reason, call tools, or coordinate multiple steps. That is not the hard part. The hard part begins when the agent runs inside a real system with real users and real incidents, where the question is no longer "can the agent respond," but "can the system trust the agent's behavior over time."

In this issue we build a local-first incident triage agent using Microsoft Agent Framework. The triage scenario is practical, but the real focus is the control model that makes agents usable inside deterministic software systems. The goal is to keep the LLM useful while removing its authority over correctness, safety, and execution.

What You Are Building

A single console app with a clear, production-shaped pipeline:

Validate configuration and start the runtime
Build a bounded incident prompt from user input
Run a sequential agent workflow that produces a typed triage report
Parse the structured output defensively and fail fast on malformed JSON
Apply deterministic policy enforcement for severity and domain
Print a consistent triage report plus an SLA hint
Fall back to text mode when structured mode fails
Optional streaming mode for interactive runs

This is not agentic autonomy. It is controlled flow.
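To make the "bounded incident prompt" step concrete, here is a minimal sketch of one way to cap and delimit user input before it reaches the agent. The `IncidentPromptBuilder` name, the 4,000 character limit, and the `<incident>` markers are assumptions for illustration, not taken from the project.

```csharp
using System;

// Demo: oversized input is cut down before it is embedded in the prompt.
var prompt = IncidentPromptBuilder.Build(new string('x', 10_000));
Console.WriteLine(prompt.Length);

// Hypothetical helper; the name and the limit are assumptions.
public static class IncidentPromptBuilder
{
    public const int MaxInputChars = 4_000; // assumed bound

    public static string Build(string userInput)
    {
        if (string.IsNullOrWhiteSpace(userInput))
            throw new ArgumentException("Incident description is required.", nameof(userInput));

        var trimmed = userInput.Trim();
        if (trimmed.Length > MaxInputChars)
            trimmed = trimmed[..MaxInputChars];

        // Markers make it explicit where untrusted input starts and ends.
        return "Triage the incident described between the markers.\n"
             + $"<incident>\n{trimmed}\n</incident>";
    }
}
```

The bound is enforced in code, so prompt size does not depend on what the user pastes in.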

System Structure

The system is a deterministic shell around a probabilistic core. Incident input flows through a controlled agent workflow that produces structured output, which is validated and constrained by policy before any decision is made. The agent proposes context. The system owns control, boundaries, and outcomes.

The diagram below shows the controlled flow from incident input to structured output, deterministic enforcement, and bounded operational decisions.

Microsoft Agent Framework as the Orchestration Layer

This project uses Microsoft Agent Framework as a production-friendly shell around an LLM:

ChatClientAgent for defining bounded agent behavior
AgentWorkflowBuilder for explicit sequential orchestration
AgentRunResponse for output inspection
Structured and streaming execution paths
Provider-agnostic model integration through IChatClient

The API surface is secondary. The control model is the point.

Agents in Production Are Not Chatbots

A production agent is defined by responsibility, not conversation.

A chatbot reacts to messages. A production agent operates within contracts, boundaries, and policies defined outside the model.

In practice:

The agent does not own control flow
The agent does not decide severity, escalation, or execution
The agent proposes structured output
Deterministic code validates, corrects, and applies policy

This pattern applies beyond triage: security review, deployment validation, support routing, and operational decision support.

Local-First by Design

The runtime is local-first. It uses an OpenAI-compatible chat client pointing at Ollama, so development and testing work without cloud dependencies.

Configuration is explicit and validated before any agent runs:

public sealed class AgentAppConfig
{
  public string Provider { get; init; } = "ollama";
  public string BaseUrl { get; init; } = "http://localhost:11434/v1";
  public string ApiKey { get; init; } = "ollama";
  public string ModelId { get; init; } = "mistral:7b";
  public float Temperature { get; init; } = 0.1f;
  public int MaxOutputTokens { get; init; } = 800;
}

The app also supports environment variable overrides using the ITA_ prefix, which is useful for CI and ops.
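As a rough sketch of how such overrides and up-front validation could fit together, the example below reads `ITA_`-prefixed variables and rejects bad values before anything else runs. Only the prefix comes from the article. The variable names (`ITA_BASE_URL`, `ITA_MODEL_ID`, and so on), the `AppConfig` record, the `ConfigLoader` class, and the validation ranges are assumptions.

```csharp
#nullable enable
using System;
using System.Globalization;

// Demo: ITA_MODEL_ID is a hypothetical variable name built from the ITA_ prefix.
Environment.SetEnvironmentVariable("ITA_MODEL_ID", "llama3.1:8b");
var config = ConfigLoader.Load();
Console.WriteLine($"{config.ModelId} @ {config.BaseUrl}");

// Simplified stand-in for AgentAppConfig.
public sealed record AppConfig(string BaseUrl, string ModelId, float Temperature, int MaxOutputTokens);

public static class ConfigLoader
{
    // Reads ITA_* variables and falls back to the defaults shown in AgentAppConfig.
    public static AppConfig Load()
    {
        var config = new AppConfig(
            BaseUrl: Env("ITA_BASE_URL") ?? "http://localhost:11434/v1",
            ModelId: Env("ITA_MODEL_ID") ?? "mistral:7b",
            Temperature: ParseFloat(Env("ITA_TEMPERATURE"), 0.1f),
            MaxOutputTokens: ParseInt(Env("ITA_MAX_OUTPUT_TOKENS"), 800));
        Validate(config);
        return config;
    }

    // Throws on the first invalid value so the app stops before any agent is created.
    public static void Validate(AppConfig c)
    {
        if (!Uri.TryCreate(c.BaseUrl, UriKind.Absolute, out _))
            throw new InvalidOperationException($"BaseUrl is not an absolute URI: '{c.BaseUrl}'.");
        if (string.IsNullOrWhiteSpace(c.ModelId))
            throw new InvalidOperationException("ModelId must not be empty.");
        if (c.Temperature is < 0f or > 2f) // assumed range
            throw new InvalidOperationException("Temperature must be between 0 and 2.");
        if (c.MaxOutputTokens <= 0)
            throw new InvalidOperationException("MaxOutputTokens must be positive.");
    }

    private static string? Env(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        return string.IsNullOrWhiteSpace(value) ? null : value;
    }

    private static float ParseFloat(string? raw, float fallback) =>
        raw is null ? fallback : float.Parse(raw, CultureInfo.InvariantCulture);

    private static int ParseInt(string? raw, int fallback) =>
        raw is null ? fallback : int.Parse(raw, CultureInfo.InvariantCulture);
}
```

Parsing with the invariant culture keeps a value like `0.1` from being misread on machines with a different decimal separator.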

Local execution keeps failures visible and reproducible. It also forces designs that remain reliable under weaker, less consistent models.

Explicit Contracts, Not Prose

If output drives downstream decisions, it must be typed and validated.

The triage agent returns a strongly typed report that becomes the boundary between probabilistic generation and deterministic enforcement:

public sealed class IncidentTriageReport
{
  // Defaults keep every member non-null even when the model omits a field.
  public string IncidentSummary { get; set; } = string.Empty;
  public string Severity { get; set; } = string.Empty;
  public string PrimaryDomain { get; set; } = string.Empty;
  public string CustomerImpact { get; set; } = string.Empty;
  public List<string> TopLikelyCauses { get; set; } = new();
  public List<string> ImmediateActions15Minutes { get; set; } = new();
  public List<string> StabilizationActions60Minutes { get; set; } = new();
  public List<string> EscalationTargets { get; set; } = new();
  public string StakeholderUpdateDraft { get; set; } = string.Empty;
  public List<string> MissingCriticalData { get; set; } = new();
}

Free-form text is not a contract. Structured output is.
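A minimal sketch of what "typed and validated" can look like with System.Text.Json follows. It uses a three-field stand-in for `IncidentTriageReport` to stay short. The `TriageReportSlice` and `ReportContract` names and the specific checks are assumptions, not the project's code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

var json = """{"incidentSummary":"Checkout API errors","severity":"P2","primaryDomain":"API"}""";
var report = ReportContract.Parse(json);
Console.WriteLine($"{report.Severity} / {report.PrimaryDomain}");

// Trimmed-down stand-in for IncidentTriageReport (three of its fields).
public sealed class TriageReportSlice
{
    public string IncidentSummary { get; set; } = "";
    public string Severity { get; set; } = "";
    public string PrimaryDomain { get; set; } = "";
}

public static class ReportContract
{
    private static readonly HashSet<string> Severities = new() { "P1", "P2", "P3", "P4" };

    private static readonly JsonSerializerOptions Options = new()
    {
        // The JSON shape is camelCase while the C# properties are PascalCase.
        PropertyNameCaseInsensitive = true
    };

    // Deserializes and then checks the fields that downstream code depends on.
    public static TriageReportSlice Parse(string json)
    {
        var report = JsonSerializer.Deserialize<TriageReportSlice>(json, Options)
            ?? throw new InvalidOperationException("Report JSON deserialized to null.");

        if (string.IsNullOrWhiteSpace(report.IncidentSummary))
            throw new InvalidOperationException("incidentSummary is required.");
        if (!Severities.Contains(report.Severity))
            throw new InvalidOperationException($"Unexpected severity '{report.Severity}'.");
        return report;
    }
}
```

Deserialization alone only proves the text is JSON. The explicit checks afterwards are what turn the type into a contract.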

Sequential Workflows Without Autonomy

Multi-agent design is useful only when responsibilities are narrow and explicit.

This system separates creativity from validation:

An authoring agent produces the first triage report
A reviewer agent validates and corrects output shape and operational quality
A text mode agent is available as a fallback path

The workflow is sequential and controlled:

AIAgent structuredWorkflowAgent = await AgentWorkflowBuilder
  .BuildSequential(triageAuthoringAgent, triageReviewAgent)
  .AsAgentAsync();

The reviewer is designed to be strict and boring. That is what reduces production failures.

Instructions as Behavioral Infrastructure

Instructions are treated as infrastructure, not prompts.

The structured mode instructions enforce:

JSON only output with a fixed schema
No invented metrics, logs, or observability data
An explicit list of missing data when confidence is low
Severity limited to P1 to P4
Domain limited to an allowlisted set

public static string BuildStructuredInstructions() =>
"""
You are an Incident Triage Agent for real production operations.

Non-negotiable rules:
1. Never invent observability data, logs, or metrics that were not provided.
2. If key information is missing, list it explicitly under missing data.
3. Prioritize immediate risk reduction and customer impact containment.
4. Keep actions executable by an on-call engineer.
5. Use severity levels P1, P2, P3, or P4 only.

Return ONLY valid JSON with this exact shape:
{
  "incidentSummary": "string",
  "severity": "P1|P2|P3|P4",
  "primaryDomain": "API|Database|Queue|Networking|Compute|Storage|Identity|ThirdPartyDependency|Unknown",
  "customerImpact": "string",
  ...
}
""";

The reviewer instructions are even tighter and explicitly forbid invention.

Defensive Parsing and Explicit Failure Modes

Structured output parsing is defensive. If the model returns malformed JSON, the system fails fast and switches to a visible fallback path.

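// Slices out the outermost {...} span so that prose around the JSON is ignored.
private static string ExtractJsonObject(string input)
{
  var start = input.IndexOf('{');
  var end = input.LastIndexOf('}');
  // Fail fast: no braces at all, or a closing brace before the opening one.
  if (start < 0 || end <= start)
      throw new InvalidOperationException("Model output does not contain a JSON object.");
  return input[start..(end + 1)];
}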

Explicit failure is safer than silent degradation.
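One way to make that failure explicit to the caller is a Try-style wrapper that reports why parsing failed instead of swallowing the error. This is a sketch under assumptions: the `StructuredOutput` class and its `TryParse` signature are hypothetical, and only the brace-slicing logic mirrors `ExtractJsonObject`.

```csharp
#nullable enable
using System;
using System.Text.Json;

var raw = "Here is the report:\n{\"severity\":\"P2\"}\nLet me know if you need more.";
if (StructuredOutput.TryParse(raw, out var root, out var error))
    Console.WriteLine(root.GetProperty("severity").GetString());
else
    Console.WriteLine($"Structured parse failed: {error}");

public static class StructuredOutput
{
    // Returns false with a reason instead of throwing, so the caller picks the fallback path.
    public static bool TryParse(string modelOutput, out JsonElement root, out string? error)
    {
        root = default;
        error = null;
        try
        {
            using var document = JsonDocument.Parse(ExtractJsonObject(modelOutput));
            root = document.RootElement.Clone(); // Clone so the element outlives the document.
            return true;
        }
        catch (Exception ex) when (ex is JsonException or InvalidOperationException)
        {
            error = ex.Message;
            return false;
        }
    }

    // Same slicing approach as the article's ExtractJsonObject.
    private static string ExtractJsonObject(string input)
    {
        var start = input.IndexOf('{');
        var end = input.LastIndexOf('}');
        if (start < 0 || end <= start)
            throw new InvalidOperationException("No JSON object found in model output.");
        return input[start..(end + 1)];
    }
}
```

The error string is meant to be logged or printed, which keeps the fallback visible rather than silent.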

The runtime behavior is intentional:

Structured mode preferred
If structured parse fails, fall back to text mode
Streaming mode is an explicit user command, not a hidden switch

Deterministic Policy Outside the Agent

This is the highest-leverage reliability pattern in the project.

The agent proposes a triage report. Deterministic code owns the final decisions.

After the structured report is parsed, the system enforces severity and domain using a policy engine:

Severity normalization based on signals found in the incident input and summary
Extraction of percentages near keywords like error rate and saturation
Extraction of latency values in milliseconds
Domain normalization using an allowlist plus keyword-based mapping

var typedResponse = await triageService.RunStructuredAsync(prompt);
var policyAdjusted = TriagePolicyEnforcer.Apply(incidentInput, typedResponse);
PrintReport(policyAdjusted);

If the agent claims P3 but the incident text reports an 18% error rate and a 4.8 s p95 latency, the deterministic policy overrides the severity to P1 or P2 based on explicit thresholds.
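A simplified sketch of that kind of override follows. The regular expressions, the `SeverityPolicy` name, the thresholds (5% / 2,000 ms for P2, 25% / 10,000 ms for P1), and the escalate-only behavior are all assumptions chosen for illustration. The project's actual `TriagePolicyEnforcer` rules are not shown in this article.

```csharp
#nullable enable
using System;
using System.Globalization;
using System.Text.RegularExpressions;

var text = "Checkout API at 18% error rate, p95 latency 4800 ms since 09:12 UTC";
Console.WriteLine(SeverityPolicy.Enforce(text, proposed: "P3")); // prints "P2"

public static class SeverityPolicy
{
    // A percentage followed closely by the phrase "error rate".
    private static readonly Regex ErrorRate = new(
        @"(\d+(?:\.\d+)?)\s*%\s*(?:\w+\s+){0,2}error rate", RegexOptions.IgnoreCase);

    // A number followed by "ms".
    private static readonly Regex LatencyMs = new(
        @"(\d+(?:\.\d+)?)\s*ms\b", RegexOptions.IgnoreCase);

    public static string Enforce(string incidentText, string proposed)
    {
        double errorRate = MaxValue(ErrorRate, incidentText);
        double latencyMs = MaxValue(LatencyMs, incidentText);

        // Assumed thresholds, for illustration only.
        string? floor =
            errorRate >= 25 || latencyMs >= 10_000 ? "P1" :
            errorRate >= 5 || latencyMs >= 2_000 ? "P2" :
            null;

        // Escalate only: keep the proposal unless the signals demand a higher severity.
        // "P1" sorts before "P2" ordinally, so a smaller string is more severe.
        return floor is not null && string.CompareOrdinal(floor, proposed) < 0 ? floor : proposed;
    }

    // Largest captured number across all matches, or 0 when nothing matches.
    private static double MaxValue(Regex pattern, string text)
    {
        double max = 0;
        foreach (Match m in pattern.Matches(text))
            max = Math.Max(max, double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture));
        return max;
    }
}
```

Because the thresholds are plain constants in code, an override can be explained by pointing at a line rather than at a model response.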

The system remains explainable and stable across model changes because the authority lives in code.
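Domain normalization can follow the same pattern. The sketch below accepts allowlisted values, canonicalizes their casing, and otherwise falls back to keyword matching. The allowlist is copied from the structured instructions. The `DomainPolicy` name and the keyword map are assumptions.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(DomainPolicy.Normalize("database", "Postgres replica lag on orders-db")); // Database
Console.WriteLine(DomainPolicy.Normalize("Backend", "Messages piling up in the payment queue")); // Queue

public static class DomainPolicy
{
    // Allowlist taken from the structured instructions.
    private static readonly HashSet<string> Allowed = new(StringComparer.OrdinalIgnoreCase)
    {
        "API", "Database", "Queue", "Networking", "Compute",
        "Storage", "Identity", "ThirdPartyDependency", "Unknown"
    };

    // Assumed keyword map; the first match wins.
    private static readonly (string Keyword, string Domain)[] Keywords =
    {
        ("queue", "Queue"), ("postgres", "Database"), ("sql", "Database"),
        ("dns", "Networking"), ("disk", "Storage"), ("login", "Identity")
    };

    public static string Normalize(string proposed, string incidentText)
    {
        // Accept allowlisted values, but return the canonical casing.
        if (Allowed.TryGetValue(proposed.Trim(), out var canonical))
            return canonical;

        // Otherwise infer the domain from the incident text.
        foreach (var (keyword, domain) in Keywords)
            if (incidentText.Contains(keyword, StringComparison.OrdinalIgnoreCase))
                return domain;

        return "Unknown";
    }
}
```

Whatever the model returns, the value that leaves this method is always one of the nine allowlisted domains.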

Post Processing That Operations Teams Actually Want

After printing the report, the app prints an SLA hint based on the normalized severity:

P1 pages primary and secondary on-call immediately
P2 engages the on-call owner and incident commander
P3 queues remediation in the active sprint
P4 tracks as non-urgent reliability work

This is a deterministic output that makes the tool usable during real incidents.
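The mapping above fits naturally into a switch expression. This is a sketch with a hypothetical `SlaHints` class, and the hint wording is paraphrased from the list rather than copied from the app's output.

```csharp
using System;

Console.WriteLine(SlaHints.For("P1"));

public static class SlaHints
{
    // Maps a normalized severity to an operational hint; anything else is a bug upstream.
    public static string For(string severity) => severity switch
    {
        "P1" => "Page primary and secondary on-call immediately.",
        "P2" => "Engage the on-call owner and incident commander.",
        "P3" => "Queue remediation in the active sprint.",
        "P4" => "Track as non-urgent reliability work.",
        _ => throw new ArgumentOutOfRangeException(nameof(severity), severity, "Expected P1 to P4.")
    };
}
```

Throwing on an unknown severity is deliberate here: by this point the policy step should already have normalized the value.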

Runtime Modes

This agent supports two explicit modes.

Structured mode (preferred):

Author agent produces JSON
Reviewer agent validates and corrects
Deterministic policy enforcer normalizes severity and domain
Output is printed in a consistent format

Text mode (fallback):

If structured mode fails on a weak model, the system prints a visible warning
The text agent returns concise Markdown
Optional streaming mode is available via /stream

No silent mode switching. No hidden behavior changes.
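The mode switch itself can be a few lines of ordinary control flow. The sketch below uses plain delegates in place of the real agents, so it makes no claims about the Agent Framework API. The `TriageRunner` name and the warning text are assumptions.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;

// Demo with stub delegates standing in for the structured and text agents.
var result = await TriageRunner.RunAsync(
    "incident text",
    structured: _ => throw new InvalidOperationException("malformed JSON"),
    text: input => Task.FromResult($"## Triage (text mode)\n{input}"));
Console.WriteLine(result);

public static class TriageRunner
{
    // Tries structured mode first; on a parse failure, warns and falls back to text mode.
    public static async Task<string> RunAsync(
        string prompt,
        Func<string, Task<string>> structured,
        Func<string, Task<string>> text)
    {
        try
        {
            return await structured(prompt);
        }
        catch (Exception ex) when (ex is InvalidOperationException or JsonException)
        {
            // The warning goes to stderr so the switch is visible to the operator.
            Console.Error.WriteLine(
                $"WARNING: structured mode failed ({ex.Message}); falling back to text mode.");
            return await text(prompt);
        }
    }
}
```

Only parse-related exceptions trigger the fallback. Anything else, such as a network failure, still surfaces as an error instead of being masked by text mode.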

Why This Architecture Works

This project is small, but the structure maps directly to production constraints:

Deterministic code owns control flow
Agents produce proposals, not actions
Structured contracts replace free-form output
Defensive parsing plus explicit fallbacks prevent silent degradation
Policy enforcement lives outside the model
Local-first runtime keeps behavior reproducible
Microsoft Agent Framework provides orchestration without forcing autonomy

The model remains valuable, but it is never the authority.

Potential Enhancements

You can extend this foundation incrementally:

Add an evaluation harness to gate regressions on triage quality and policy adherence
Add telemetry for agent latency, structured parse failures, and fallback rate
Add incident categories and adversarial cases (prompt injection, exfiltration attempts)
Persist triage outputs for regression tracking across model upgrades
Add a tool allowlist for safe, bounded lookups (runbooks, service ownership, known incidents)

None of these change the core structure. They strengthen it.

Final Notes

Production AI agents are not built by making models smarter. They are built by constraining uncertainty and enforcing discipline around probabilistic components.

If your agent can decide severity, routing, or execution on its own, you are gambling with system behavior. If your agent operates inside deterministic constraints, it becomes a reliable component.

Explore the source code at the GitHub repository.

See you in the next issue.

Stay curious.
