Auto-Fix
Automatic error recovery for failed workflow steps. When a step fails, AgentLed analyzes the error, applies a fix, and retries — reducing manual intervention.
How Auto-Fix Works
When a step fails, AgentLed runs a two-phase recovery before marking the execution as failed.
Phase 1Auto-Retry — Deterministic re-execution of the same step with the same config. Up to 3 attempts with exponential backoff. Handles transient errors: network timeouts, temporary API outages, rate-limit pauses.
Phase 2Auto-Fix — AI diagnosis. The system reads the error, inspects the step config and upstream data, and proposes a patch. Depending on your autonomyLevel, it either applies the patch automatically or waits for your approval.
Phase 3Failure — Both phases exhausted or disabled. The step is marked failed. Use retry_execution to manually re-run after fixing the root cause.
Configuration
{
// Phase 1: deterministic retry (default: disabled)
autoRetry: {
enabled: true,
maxAttempts: 3 // 1–3
},
// Phase 2: AI diagnosis and patching
autoFix: {
enabled: true,
maxAttempts: 2, // 1–5
autonomyLevel: "suggest" // "suggest" | "auto-fix"
}
}autonomyLevel: "suggest" — AI proposes a fix. The execution pauses with a “pending” status. You review the diagnosis and approve or reject.
autonomyLevel: "auto-fix" — AI applies the patch immediately and re-executes. The retry counter resets so the full retry budget applies to the corrected config.
What Auto-Fix Can and Cannot Patch
| Auto-fixable | Requires human review |
|---|---|
| Malformed API responses | Exhausted API quota |
| Input validation errors | Authentication failures (bad API key) |
| Missing optional fields | Business logic mismatches |
| Prompt / output format mismatches | Upstream data quality issues |
Auto-Fix cannot modify structural fields: step id, type, routing (next), loop config, or the Auto-Fix config itself. It only patches inputs, prompts, and output mappings.
Audit Trail
Every Auto-Fix attempt is logged in the execution timeline. You can see:
- •Which attempt number triggered the AI diagnosis
- •The AI's root-cause analysis and confidence score (0–1)
- •The exact config diff that was applied or proposed
- •Whether the patched step succeeded or failed again
Auto-Fix vs. Manual Retry
Use retry_execution when you want to re-run a failed execution after fixing the root cause yourself (updated an API key, corrected input data, adjusted a prompt). Manual retry always runs with the current saved config and gives you full control. Auto-Fix is for hands-off recovery of transient or AI-diagnosable errors.
Next Steps
- Human-in-the-Loop — Approval gates for high-stakes steps
- Validation — Catch data quality issues before they cause failures
- Batching — Per-item retry policies for large parallel jobs
