
Most AI agent examples focus on capability: an agent can reason, call tools, or coordinate multiple steps. That is not the hard part. The hard part begins when the agent runs inside a real system with real users and real incidents, where the question is no longer "can the agent respond," but "can the system trust the agent's behavior over time."
In this issue we build a local first incident triage agent using Microsoft Agent Framework. The triage scenario is practical, but the real focus is the control model that makes agents usable inside deterministic software systems. The goal is to keep the LLM useful while removing its authority over correctness, safety, and execution.
What You Are Building
A single console app with a clear, production shaped pipeline:
- Validate configuration and start the runtime
- Build a bounded incident prompt from user input
- Run a sequential agent workflow that produces a typed triage report
- Parse the structured output defensively and fail fast on malformed JSON
- Apply deterministic policy enforcement for severity and domain
- Print a consistent triage report plus an SLA hint
- Fall back to text mode when structured mode fails
- Optional streaming mode for interactive runs
This is not agentic autonomy. It is controlled flow.
System Structure
The system is a deterministic shell around a probabilistic core. Incident input flows through a controlled agent workflow that produces structured output, which is validated and constrained by policy before any decision is made. The agent proposes context. The system owns control, boundaries, and outcomes.
The diagram below shows the controlled flow from incident input to structured output, deterministic enforcement, and bounded operational decisions.
Microsoft Agent Framework as the Orchestration Layer
This project uses Microsoft Agent Framework as a production friendly shell around an LLM:
- ChatClientAgent for defining bounded agent behavior
- AgentWorkflowBuilder for explicit sequential orchestration
- AgentRunResponse for output inspection
- Structured and streaming execution paths
- Provider agnostic model integration through IChatClient
The API surface is secondary. The control model is the point.
Agents in Production Are Not Chatbots
A production agent is defined by responsibility, not conversation.
A chatbot reacts to messages. A production agent operates within contracts, boundaries, and policies defined outside the model.
In practice:
- The agent does not own control flow
- The agent does not decide severity, escalation, or execution
- The agent proposes structured output
- Deterministic code validates, corrects, and applies policy
This pattern applies beyond triage: security review, deployment validation, support routing, and operational decision support.
Local First by Design
The runtime is local first. It uses an OpenAI compatible chat client pointing at Ollama, so development and testing work without cloud dependencies.
Configuration is explicit and validated before any agent runs:
public sealed class AgentAppConfig
{
    public string Provider { get; init; } = "ollama";
    public string BaseUrl { get; init; } = "http://localhost:11434/v1";
    public string ApiKey { get; init; } = "ollama";
    public string ModelId { get; init; } = "mistral:7b";
    public float Temperature { get; init; } = 0.1f;
    public int MaxOutputTokens { get; init; } = 800;
}
The app also supports environment variable overrides using the ITA_ prefix, which is useful for CI and ops.
Local execution keeps failures visible and reproducible. It also forces designs that remain reliable under weaker, less consistent models.
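As a minimal sketch of what "explicit and validated" could look like, the loader below applies ITA_ overrides and then collects validation errors before any agent is created. The class name AgentAppConfigLoader, the variable names (ITA_BASE_URL, ITA_MODEL_ID, ITA_TEMPERATURE), and the validation rules are assumptions for illustration. The source only states the prefix.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;

// Trimmed copy of the config class above so this sketch compiles on its own.
public sealed class AgentAppConfig
{
    public string BaseUrl { get; init; } = "http://localhost:11434/v1";
    public string ModelId { get; init; } = "mistral:7b";
    public float Temperature { get; init; } = 0.1f;
    public int MaxOutputTokens { get; init; } = 800;
}

public static class AgentAppConfigLoader
{
    // Hypothetical variable names; the source only states the ITA_ prefix.
    // getEnv is injected so tests do not depend on the process environment.
    public static AgentAppConfig ApplyOverrides(
        AgentAppConfig defaults, Func<string, string?> getEnv)
    {
        var temperature = defaults.Temperature;
        if (float.TryParse(getEnv("ITA_TEMPERATURE"), NumberStyles.Float,
                CultureInfo.InvariantCulture, out var parsed))
            temperature = parsed;

        return new AgentAppConfig
        {
            BaseUrl = getEnv("ITA_BASE_URL") ?? defaults.BaseUrl,
            ModelId = getEnv("ITA_MODEL_ID") ?? defaults.ModelId,
            Temperature = temperature,
            MaxOutputTokens = defaults.MaxOutputTokens,
        };
    }

    // Returns every problem found so startup can report them all at once.
    public static IReadOnlyList<string> Validate(AgentAppConfig config)
    {
        var errors = new List<string>();
        if (!Uri.TryCreate(config.BaseUrl, UriKind.Absolute, out _))
            errors.Add("BaseUrl must be an absolute URI.");
        if (string.IsNullOrWhiteSpace(config.ModelId))
            errors.Add("ModelId is required.");
        if (config.Temperature is < 0f or > 2f)
            errors.Add("Temperature must be between 0 and 2.");
        if (config.MaxOutputTokens <= 0)
            errors.Add("MaxOutputTokens must be positive.");
        return errors;
    }
}
```

At startup this would be called as AgentAppConfigLoader.ApplyOverrides(new AgentAppConfig(), Environment.GetEnvironmentVariable), and the app would exit if Validate returns any errors.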
Explicit Contracts, Not Prose
If output drives downstream decisions, it must be typed and validated.
The triage agent returns a strongly typed report that becomes the boundary between probabilistic generation and deterministic enforcement:
public sealed class IncidentTriageReport
{
    public string IncidentSummary { get; set; }
    public string Severity { get; set; }
    public string PrimaryDomain { get; set; }
    public string CustomerImpact { get; set; }
    public List<string> TopLikelyCauses { get; set; }
    public List<string> ImmediateActions15Minutes { get; set; }
    public List<string> StabilizationActions60Minutes { get; set; }
    public List<string> EscalationTargets { get; set; }
    public string StakeholderUpdateDraft { get; set; }
    public List<string> MissingCriticalData { get; set; }
}
Free form text is not a contract. Structured output is.
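A typed class alone does not guarantee valid values. As a hedged sketch, a contract check like the one below could run right after deserialization. The class name TriageReportContract and the specific checks are assumptions. The allowed severity and domain values are the ones this article lists for the structured instructions.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Trimmed copy of the report above, limited to the fields checked here.
public sealed class IncidentTriageReport
{
    public string? IncidentSummary { get; set; }
    public string? Severity { get; set; }
    public string? PrimaryDomain { get; set; }
    public List<string>? MissingCriticalData { get; set; }
}

public static class TriageReportContract
{
    private static readonly HashSet<string> Severities =
        new(StringComparer.OrdinalIgnoreCase) { "P1", "P2", "P3", "P4" };

    private static readonly HashSet<string> Domains =
        new(StringComparer.OrdinalIgnoreCase)
        {
            "API", "Database", "Queue", "Networking", "Compute",
            "Storage", "Identity", "ThirdPartyDependency", "Unknown",
        };

    // Returns contract violations; an empty list means the report is usable.
    public static IReadOnlyList<string> Validate(IncidentTriageReport report)
    {
        var errors = new List<string>();
        if (string.IsNullOrWhiteSpace(report.IncidentSummary))
            errors.Add("incidentSummary is empty.");
        if (report.Severity is null || !Severities.Contains(report.Severity))
            errors.Add($"severity '{report.Severity}' is not P1 to P4.");
        if (report.PrimaryDomain is null || !Domains.Contains(report.PrimaryDomain))
            errors.Add($"primaryDomain '{report.PrimaryDomain}' is not allowlisted.");
        if (report.MissingCriticalData is null)
            errors.Add("missingCriticalData must be present, even if empty.");
        return errors;
    }
}
```

Returning a list instead of throwing on the first problem keeps the failure report complete, which helps when comparing models.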
Sequential Workflows Without Autonomy
Multi agent design is useful only when responsibilities are narrow and explicit.
This system separates creativity from validation:
- An authoring agent produces the first triage report
- A reviewer agent validates and corrects output shape and operational quality
- A text mode agent is available as a fallback path
The workflow is sequential and controlled:
AIAgent structuredWorkflowAgent = await AgentWorkflowBuilder
    .BuildSequential(triageAuthoringAgent, triageReviewAgent)
    .AsAgentAsync();
The reviewer is designed to be strict and boring. That is what reduces production failures.
Instructions as Behavioral Infrastructure
Instructions are treated as infrastructure, not prompts.
The structured mode instructions enforce:
- JSON only output with a fixed schema
- No invented metrics, logs, or observability data
- An explicit list of missing data when confidence is low
- Severity limited to P1 to P4
- Domain limited to an allowlisted set
public static string BuildStructuredInstructions() =>
    """
    You are an Incident Triage Agent for real production operations.
    Non-negotiable rules:
    1. Never invent observability data, logs, or metrics that were not provided.
    2. If key information is missing, list it explicitly under missing data.
    3. Prioritize immediate risk reduction and customer impact containment.
    4. Keep actions executable by an on-call engineer.
    5. Use severity levels P1, P2, P3, or P4 only.
    Return ONLY valid JSON with this exact shape:
    {
      "incidentSummary": "string",
      "severity": "P1|P2|P3|P4",
      "primaryDomain": "API|Database|Queue|Networking|Compute|Storage|Identity|ThirdPartyDependency|Unknown",
      "customerImpact": "string",
      ...
    }
    """;
The reviewer instructions are even tighter and explicitly forbid invention.
Defensive Parsing and Explicit Failure Modes
Structured output parsing is defensive. If the model returns malformed JSON, the system fails fast and switches to a visible fallback path.
private static string ExtractJsonObject(string input)
{
    // Take the outermost braces so any prose around the JSON is dropped.
    var start = input.IndexOf('{');
    var end = input.LastIndexOf('}');
    if (start < 0 || end <= start)
        throw new InvalidOperationException("No JSON object found in model output.");
    return input[start..(end + 1)];
}
Explicit failure is safer than silent degradation.
The runtime behavior is intentional:
- Structured mode is preferred
- If structured parse fails, fall back to text mode
- Streaming mode is an explicit user command, not a hidden switch
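Extraction is only half of defensive parsing. The text between the braces can still be invalid JSON. The sketch below combines brace extraction with System.Text.Json deserialization and reports a reason instead of throwing, so the caller can decide what to do next. StructuredOutputParser and its TryParse shape are assumptions, not the project's actual parser.

```csharp
#nullable enable
using System;
using System.Text.Json;

public static class StructuredOutputParser
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // The prompt asks for camelCase keys; the C# properties are PascalCase.
        PropertyNameCaseInsensitive = true,
    };

    // Returns false with a reason instead of throwing, so the caller can
    // print a warning and choose the next step explicitly.
    public static bool TryParse<T>(string modelOutput, out T? value, out string? error)
        where T : class
    {
        value = null;
        error = null;

        // Same outermost-brace extraction as ExtractJsonObject above.
        var start = modelOutput.IndexOf('{');
        var end = modelOutput.LastIndexOf('}');
        if (start < 0 || end <= start)
        {
            error = "No JSON object found in model output.";
            return false;
        }

        try
        {
            value = JsonSerializer.Deserialize<T>(modelOutput[start..(end + 1)], Options);
        }
        catch (JsonException ex)
        {
            error = $"Malformed JSON: {ex.Message}";
            return false;
        }

        if (value is null)
        {
            error = "Deserialization produced no object.";
            return false;
        }
        return true;
    }
}
```

A call such as StructuredOutputParser.TryParse<IncidentTriageReport>(text, out var report, out var error) keeps the failure reason available for the warning the user sees.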
Deterministic Policy Outside the Agent
This is the highest leverage reliability pattern in the project.
The agent proposes a triage report. Deterministic code owns the final decisions.
After the structured report is parsed, the system enforces severity and domain using a policy engine:
- Severity normalization based on signals found in the incident input and summary
- Extraction of percentages near keywords like error rate and saturation
- Extraction of latency values in milliseconds
- Domain normalization using an allowlist plus keyword based mapping
var typedResponse = await triageService.RunStructuredAsync(prompt);
var policyAdjusted = TriagePolicyEnforcer.Apply(incidentInput, typedResponse);
PrintReport(policyAdjusted);
If the agent claims P3 but the incident text contains an 18% error rate and 4.8s p95 latency, the deterministic policy overrides the severity to P1 or P2 based on explicit thresholds.
The system remains explainable and stable across model changes because the authority lives in code.
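To make the override concrete, here is a simplified sketch of severity enforcement driven by signals in the incident text. It is not the project's TriagePolicyEnforcer. The class name, the regular expressions, and every threshold are assumptions, and the sketch only matches the phrasing "N% error rate" plus latency values in ms or s.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class SeverityPolicySketch
{
    // Hypothetical thresholds; the source does not state the real ones.
    private const double P1ErrorRatePercent = 10.0;
    private const double P2ErrorRatePercent = 2.0;
    private const double P1LatencyMs = 3000.0;
    private const double P2LatencyMs = 1000.0;

    private static readonly Regex ErrorRate = new(
        @"(\d+(?:\.\d+)?)\s*%\s*error rate", RegexOptions.IgnoreCase);

    private static readonly Regex Latency = new(
        @"(\d+(?:\.\d+)?)\s*(ms|s)\b", RegexOptions.IgnoreCase);

    // Returns the more severe of the agent's proposal and the signal floor.
    public static string Enforce(string incidentText, string proposedSeverity)
    {
        var floor = 4; // P4 unless the text contains stronger signals.

        var rate = ErrorRate.Match(incidentText);
        if (rate.Success)
        {
            var percent = double.Parse(rate.Groups[1].Value, CultureInfo.InvariantCulture);
            if (percent >= P1ErrorRatePercent) floor = Math.Min(floor, 1);
            else if (percent >= P2ErrorRatePercent) floor = Math.Min(floor, 2);
        }

        foreach (Match m in Latency.Matches(incidentText))
        {
            var value = double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
            var ms = m.Groups[2].Value.Equals("s", StringComparison.OrdinalIgnoreCase)
                ? value * 1000
                : value;
            if (ms >= P1LatencyMs) floor = Math.Min(floor, 1);
            else if (ms >= P2LatencyMs) floor = Math.Min(floor, 2);
        }

        return "P" + Math.Min(ParseLevel(proposedSeverity), floor);
    }

    // Unknown or malformed severities are treated as P4 and left to the floor.
    private static int ParseLevel(string severity) =>
        severity.Length == 2 && (severity[0] is 'P' or 'p') && severity[1] is >= '1' and <= '4'
            ? severity[1] - '0'
            : 4;
}
```

This sketch only raises severity. An agent proposal of P1 is kept even when the text contains no numeric signals, on the assumption that downgrades are a human decision.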
Post Processing That Operations Teams Actually Want
After printing the report, the app prints an SLA hint based on the normalized severity:
- P1 pages primary and secondary on call immediately
- P2 engages the on call owner and incident commander
- P3 queues remediation in the active sprint
- P4 tracks as non urgent reliability work
This is a deterministic output that makes the tool usable during real incidents.
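The mapping above fits in a single switch expression. The sketch below assumes the severity has already been normalized. The class name SlaHints and the exact wording of each hint are illustrative.

```csharp
using System;

public static class SlaHints
{
    // Maps a normalized severity to the hint text listed above.
    public static string For(string severity) => severity.ToUpperInvariant() switch
    {
        "P1" => "Page primary and secondary on call immediately.",
        "P2" => "Engage the on call owner and incident commander.",
        "P3" => "Queue remediation in the active sprint.",
        "P4" => "Track as non urgent reliability work.",
        _ => throw new ArgumentOutOfRangeException(
            nameof(severity), severity, "Severity must be normalized to P1 to P4 first."),
    };
}
```

Throwing on an unknown value is deliberate here. If a severity outside P1 to P4 reaches this point, the policy step was skipped and the run should stop.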
Runtime Modes
This agent supports two explicit modes.
Structured mode (preferred):
- Author agent produces JSON
- Reviewer agent validates and corrects
- Deterministic policy enforcer normalizes severity and domain
- Output is printed in a consistent format
Text mode (fallback):
- If structured mode fails on a weak model, the system prints a visible warning
- The text agent returns concise Markdown
- Optional streaming mode is available via /stream
No silent mode switching. No hidden behavior changes.
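The mode switch itself can stay small. In this sketch the two modes are passed in as delegates, so the control flow is visible without any framework types. TriageModeRunner, runStructured, runText, and warn are hypothetical names, and the set of exceptions treated as a structured failure is an assumption.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;

public static class TriageModeRunner
{
    // runStructured and runText are hypothetical delegates standing in for
    // the structured workflow and the text mode agent.
    public static async Task<string> RunAsync(
        string prompt,
        Func<string, Task<string>> runStructured,
        Func<string, Task<string>> runText,
        Action<string> warn)
    {
        try
        {
            return await runStructured(prompt);
        }
        catch (Exception ex) when (ex is InvalidOperationException or JsonException)
        {
            // The switch is announced before the fallback runs, never hidden.
            warn($"Structured mode failed ({ex.Message}); falling back to text mode.");
            return await runText(prompt);
        }
    }
}
```

Any other exception type propagates to the caller, so unrelated failures such as a network error are not mistaken for a weak model.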
Why This Architecture Works
This project is small, but the structure maps directly to production constraints:
- Deterministic code owns control flow
- Agents produce proposals, not actions
- Structured contracts replace free form output
- Defensive parsing plus explicit fallbacks prevent silent degradation
- Policy enforcement lives outside the model
- Local first runtime keeps behavior reproducible
- Microsoft Agent Framework provides orchestration without forcing autonomy
The model remains valuable, but it is never the authority.
Potential Enhancements
You can extend this foundation incrementally:
- Add an evaluation harness to gate regressions on triage quality and policy adherence
- Add telemetry for agent latency, structured parse failures, and fallback rate
- Add incident categories and adversarial cases (prompt injection, exfiltration attempts)
- Persist triage outputs for regression tracking across model upgrades
- Add a tool allowlist for safe, bounded lookups (runbooks, service ownership, known incidents)
None of these change the core structure. They strengthen it.
Final Notes
Production AI agents are not built by making models smarter. They are built by constraining uncertainty and enforcing discipline around probabilistic components.
If your agent can decide severity, routing, or execution on its own, you are gambling with system behavior. If your agent operates inside deterministic constraints, it becomes a reliable component.
Explore the source code at the GitHub repository.
See you in the next issue.
Stay curious.