Auto-Fix

Automatic error recovery for failed workflow steps. When a step fails, AgentLed analyzes the error, applies a fix, and retries — reducing manual intervention.


How Auto-Fix Works

When a step fails, AgentLed runs a two-phase recovery before marking the execution as failed.

Phase 1Auto-Retry — Deterministic re-execution of the same step with the same config. Up to 3 attempts with exponential backoff. Handles transient errors: network timeouts, temporary API outages, rate-limit pauses.

Phase 2Auto-Fix — AI diagnosis. The system reads the error, inspects the step config and upstream data, and proposes a patch. Depending on your autonomyLevel, it either applies the patch automatically or waits for your approval.

Phase 3Failure — Both phases exhausted or disabled. The step is marked failed. Use retry_execution to manually re-run after fixing the root cause.


Configuration

{
  // Phase 1: deterministic retry (default: disabled)
  autoRetry: {
    enabled: true,
    maxAttempts: 3    // 1–3
  },

  // Phase 2: AI diagnosis and patching
  autoFix: {
    enabled: true,
    maxAttempts: 2,   // 1–5
    autonomyLevel: "suggest"   // "suggest" | "auto-fix"
  }
}

autonomyLevel: "suggest" — AI proposes a fix. The execution pauses with a “pending” status. You review the diagnosis and approve or reject.

autonomyLevel: "auto-fix" — AI applies the patch immediately and re-executes. The retry counter resets so the full retry budget applies to the corrected config.


What Auto-Fix Can and Cannot Patch

Auto-fixableRequires human review
Malformed API responsesExhausted API quota
Input validation errorsAuthentication failures (bad API key)
Missing optional fieldsBusiness logic mismatches
Prompt / output format mismatchesUpstream data quality issues

Auto-Fix cannot modify structural fields: step id, type, routing (next), loop config, or the Auto-Fix config itself. It only patches inputs, prompts, and output mappings.


Audit Trail

Every Auto-Fix attempt is logged in the execution timeline. You can see:

  • Which attempt number triggered the AI diagnosis
  • The AI's root-cause analysis and confidence score (0–1)
  • The exact config diff that was applied or proposed
  • Whether the patched step succeeded or failed again

Auto-Fix vs. Manual Retry

Use retry_execution when you want to re-run a failed execution after fixing the root cause yourself (updated an API key, corrected input data, adjusted a prompt). Manual retry always runs with the current saved config and gives you full control. Auto-Fix is for hands-off recovery of transient or AI-diagnosable errors.


Next Steps

  • Human-in-the-Loop — Approval gates for high-stakes steps
  • Validation — Catch data quality issues before they cause failures
  • Batching — Per-item retry policies for large parallel jobs