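🛡️ OpenClaw Error Recovery Patterns

It's 4 a.m. The server is down, the agent has thrown an error, and you are still asleep. A good error recovery pattern lets the system pick itself up and carry on working. That is the appeal of "self-healing".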

Core Patterns

1. Retry

If a call fails, try it again. This is the simplest pattern, and it works well for transient failures.

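// Retry with exponential backoff
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      // Out of attempts: surface the last error to the caller
      if (i === maxRetries - 1) throw error;
      // The delay doubles after each failed attempt: 1 s, 2 s, 4 s, ...
      const delay = Math.pow(2, i) * 1000;
      await sleep(delay);
    }
  }
}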
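
The loop above retries every error on a fixed schedule. As a minimal sketch, assuming hypothetical options named `baseDelayMs`, `maxDelayMs`, and `shouldRetry` (none of them come from OpenClaw), a variant could add random jitter so that many clients do not retry in lockstep, and could stop early on errors that retrying cannot fix:

```javascript
// Sketch only: the option names are illustrative, not an OpenClaw API.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function retryWithJitter(fn, options = {}) {
  const {
    maxRetries = 3,           // total number of attempts, as in retryWithBackoff
    baseDelayMs = 1000,
    maxDelayMs = 30000,
    shouldRetry = () => true, // hypothetical predicate: return false for fatal errors
  } = options;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      const isLastAttempt = attempt === maxRetries - 1;
      if (isLastAttempt || !shouldRetry(error)) throw error;
      // Full jitter: wait a random time in [0, min(cap, base * 2^attempt)).
      const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
      await sleep(Math.random() * ceiling);
    }
  }
}
```

A caller might pass `shouldRetry: (error) => error.status !== 400` to skip retries on bad requests, assuming its errors carry a `status` field.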

2. Fallback

If the primary option fails, switch to a backup option.

// Fallback strategy
async function callWithFallback(primary, fallback) {
  try {
    return await primary();
  } catch (error) {
    // Log why the primary failed before switching over
    console.warn('Primary failed, switching to fallback:', error);
    return await fallback();
  }
}
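
A single backup is often not enough when several alternatives exist, such as an ordered list of models. The following is a rough sketch of a fallback chain; `providers` is a hypothetical array of async functions, tried in order until one succeeds:

```javascript
// Sketch only: `providers` is a hypothetical ordered list of async functions.
async function callWithFallbackChain(providers) {
  const errors = [];
  for (const [index, provider] of providers.entries()) {
    try {
      return await provider();
    } catch (error) {
      errors.push(error);
      console.warn(`Provider ${index} failed: ${error.message}`);
    }
  }
  // Every option failed: surface all causes instead of only the last one.
  throw new AggregateError(errors, 'All fallback providers failed');
}
```

Collecting the errors in an `AggregateError` keeps the reason for each failure available to whoever handles the final rejection.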

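3. Circuit Breaker

After too many failures, cut the connection outright. Wait a while, then let a probe call through to test whether the dependency has recovered.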

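// Circuit breaker
class CircuitBreaker {
  constructor(fn, threshold = 5, timeout = 60000) {
    this.fn = fn;
    this.threshold = threshold;
    this.timeout = timeout;
    this.failures = 0;
    this.state = 'CLOSED';
  }

  async call(...args) {
    // While OPEN, fail fast without touching the dependency
    if (this.state === 'OPEN') {
      throw new Error('Circuit breaker is open');
    }
    try {
      const result = await this.fn(...args);
      // Success (including a HALF_OPEN probe) closes the circuit again
      this.failures = 0;
      this.state = 'CLOSED';
      return result;
    } catch (error) {
      this.failures++;
      // A failed HALF_OPEN probe, or too many failures, opens the circuit
      if (this.state === 'HALF_OPEN' || this.failures >= this.threshold) {
        this.state = 'OPEN';
        // After the cool-down, allow probe calls through
        setTimeout(() => { this.state = 'HALF_OPEN'; }, this.timeout);
      }
      throw error;
    }
  }
}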
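
The class above relies on `setTimeout` to leave the open state. One possible alternative, sketched here under the assumption that an injectable clock (`now`, a hypothetical option) is acceptable, derives the state from a timestamp instead. That avoids pending timers and makes the state transitions easy to test:

```javascript
// Sketch only: a timer-free variant; `now` is injectable so tests can control time.
class LazyCircuitBreaker {
  constructor(fn, { threshold = 5, timeout = 60000, now = Date.now } = {}) {
    this.fn = fn;
    this.threshold = threshold;
    this.timeout = timeout;
    this.now = now;
    this.failures = 0;
    this.openedAt = null; // null means the circuit is CLOSED
  }

  // The state is computed on demand from the time the circuit opened.
  get state() {
    if (this.openedAt === null) return 'CLOSED';
    return this.now() - this.openedAt >= this.timeout ? 'HALF_OPEN' : 'OPEN';
  }

  async call(...args) {
    const state = this.state;
    if (state === 'OPEN') throw new Error('Circuit breaker is open');
    try {
      const result = await this.fn(...args);
      this.failures = 0;
      this.openedAt = null; // success closes the circuit
      return result;
    } catch (error) {
      this.failures++;
      // A failed probe in HALF_OPEN, or too many failures, (re)opens the circuit.
      if (state === 'HALF_OPEN' || this.failures >= this.threshold) {
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

Like the original, this sketch lets any number of concurrent calls through while half-open; limiting that to a single probe would need extra bookkeeping.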

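4. Bulkhead

A failure in one module must not take down the others. Like the watertight compartments of a ship, flooding in one does not spread to the rest.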

// Bulkhead isolation
class Bulkhead {
  constructor(maxConcurrent = 5) {
    this.maxConcurrent = maxConcurrent;
    this.running = 0;
  }

  async execute(fn) {
    // Reject immediately once every slot is taken
    if (this.running >= this.maxConcurrent) {
      throw new Error('Bulkhead is full');
    }
    this.running++;
    try {
      return await fn();
    } finally {
      // Always release the slot, even when fn() throws
      this.running--;
    }
  }
}
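
Rejecting as soon as the compartment is full can be too strict for short bursts. A possible variation, sketched with a hypothetical `maxQueue` parameter, lets a bounded number of callers wait for a free slot and rejects only when the queue is also full:

```javascript
// Sketch only: a bulkhead that queues callers instead of rejecting immediately.
class QueueingBulkhead {
  constructor(maxConcurrent = 5, maxQueue = 10) {
    this.maxConcurrent = maxConcurrent;
    this.maxQueue = maxQueue;
    this.running = 0;
    this.waiting = []; // resolvers for callers waiting on a free slot
  }

  async execute(fn) {
    if (this.running >= this.maxConcurrent) {
      if (this.waiting.length >= this.maxQueue) {
        throw new Error('Bulkhead queue is full');
      }
      // Wait until a finishing task hands its slot over to us.
      await new Promise((resolve) => this.waiting.push(resolve));
    } else {
      this.running++;
    }
    try {
      return await fn();
    } finally {
      const next = this.waiting.shift();
      if (next) next();    // hand the slot over; `running` stays unchanged
      else this.running--; // nobody is waiting, so free the slot
    }
  }
}
```

Giving each downstream dependency its own instance keeps a slow dependency from using up the slots that the others need.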

5. Timeout

Set a maximum wait time and give up once it is exceeded.

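// Timeout control
// Note: this only stops waiting; the underlying operation is not cancelled.
async function withTimeout(promise, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('Timed out')), timeoutMs);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer); // avoid leaving a stray timer behind
  }
}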
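
Racing a promise against a timer stops the caller from waiting, but the work itself keeps running. As a sketch of how the work could be aborted as well, the variant below assumes a hypothetical `task` function that accepts an `AbortSignal` and honors it:

```javascript
// Sketch only: `task` is a hypothetical function that accepts an AbortSignal.
async function withAbortableTimeout(task, timeoutMs) {
  const controller = new AbortController();
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // tell the task to stop its work
      reject(new Error(`Timed out after ${timeoutMs} ms`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([task(controller.signal), timeout]);
  } finally {
    clearTimeout(timer); // no stray timer once the race is settled
  }
}
```

With an API that supports signals, a call might look like `withAbortableTimeout((signal) => fetch(url, { signal }), 5000)`.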

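Self-Healing

// Self-healing agent design
// execute(), canRecover(), and reinitialize() are expected to be supplied
// by a concrete subclass; they are not defined here.
class SelfHealingAgent {
  constructor(config) {
    this.config = config;
    this.health = 'healthy';
    this.errorCount = 0;
  }

  async run(task) {
    // Keeps retrying until the task succeeds. canRecover() must eventually
    // return false for persistent failures, or this loop never ends.
    while (true) {
      try {
        const result = await this.execute(task);
        this.errorCount = 0;
        this.health = 'healthy';
        return result;
      } catch (error) {
        this.errorCount++;

        // Too many consecutive failures: mark unhealthy and try a repair
        if (this.errorCount > 5) {
          this.health = 'unhealthy';
          await this.selfRepair();
        }

        // Unrecoverable errors go straight back to the caller
        if (!this.canRecover(error)) {
          throw error;
        }
      }
    }
  }

  async selfRepair() {
    console.log('Self-healing: reinitializing...');
    await this.reinitialize();
    console.log('Self-healing complete');
  }
}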
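
An unbounded loop can keep retrying forever if every error looks recoverable. The sketch below shows one way to bound it. The names `execute`, `repair`, `maxAttempts`, `repairAfter`, and `canRecover` are hypothetical and only illustrate the idea:

```javascript
// Sketch only: `execute`, `repair`, and the option names are hypothetical.
async function runWithSelfRepair(execute, repair, options = {}) {
  const {
    maxAttempts = 10,        // hard cap so the loop always terminates
    repairAfter = 5,         // consecutive failures before a repair is attempted
    canRecover = () => true, // return false for errors that retrying cannot fix
  } = options;

  let consecutiveFailures = 0;
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await execute();
    } catch (error) {
      lastError = error;
      if (!canRecover(error)) throw error;
      consecutiveFailures++;
      if (consecutiveFailures >= repairAfter) {
        await repair();
        consecutiveFailures = 0; // give the repaired agent a fresh budget
      }
    }
  }
  throw new Error(`Gave up after ${maxAttempts} attempts`, { cause: lastError });
}
```

In practice a delay between attempts, for example exponential backoff, would usually be added so that a failing dependency is not hammered.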

OpenClaw Configuration

# Error recovery settings
error_handling:
  retry:
    enabled: true
    max_retries: 3
    backoff: exponential
    initial_delay: 1000
    
  fallback:
    enabled: true
    chain:
      - model: gpt-4
      - model: claude-2
      - model: gpt-3.5-turbo
      
  circuit_breaker:
    threshold: 5
    timeout: 60000
    reset_interval: 30000
    
  bulkhead:
    max_concurrent: 10
    
  timeout:
    default: 30000
    long_task: 120000
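
To make the `retry` values concrete, the sketch below computes the wait before each retry from an object that mirrors that block. How OpenClaw itself interprets these keys is not stated here, so this assumes that `max_retries` counts the retries after the first attempt and that `initial_delay` is in milliseconds:

```javascript
// Sketch only: mirrors the `retry` block above as a plain object.
// How OpenClaw interprets these keys is an assumption, not documented behavior.
const retryConfig = {
  enabled: true,
  max_retries: 3,
  backoff: 'exponential',
  initial_delay: 1000,
};

// Returns the wait in milliseconds before each retry.
function backoffDelays({ enabled, max_retries, backoff, initial_delay }) {
  if (!enabled) return [];
  return Array.from({ length: max_retries }, (_, i) =>
    backoff === 'exponential' ? initial_delay * 2 ** i : initial_delay
  );
}

console.log(backoffDelays(retryConfig)); // [ 1000, 2000, 4000 ]
```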

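Best Practices

Idempotent design - retries must not cause duplicate side effects
Classify errors - distinguish recoverable errors from unrecoverable ones
Log failures - record the context of every failure
Set limits - don't let retries turn into an infinite loop
Monitor and alert - notify proactively when failures become frequent
Degrade gracefully - return partial results on failure instead of an error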
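
Two of these practices are easy to show in code. The sketch below classifies errors and returns partial results. The `status` and `code` fields are assumed error shapes, not an OpenClaw contract, and the list of retryable codes is only a starting point:

```javascript
// Sketch only: `error.status` and `error.code` are assumed shapes.
const RETRYABLE_CODES = new Set(['ECONNRESET', 'ETIMEDOUT', 'EAI_AGAIN']);

// Classify errors: true when a retry has a realistic chance of succeeding.
function isRecoverable(error) {
  if (RETRYABLE_CODES.has(error.code)) return true;  // transient network failures
  if (error.status === 429) return true;             // rate limited: retry later
  return error.status >= 500 && error.status <= 599; // server-side errors
}

// Degrade gracefully: run every task and keep whatever succeeded.
async function collectPartial(tasks) {
  const settled = await Promise.allSettled(tasks.map(async (task) => task()));
  return {
    results: settled.filter((s) => s.status === 'fulfilled').map((s) => s.value),
    errors: settled.filter((s) => s.status === 'rejected').map((s) => s.reason),
  };
}
```

The caller can then decide whether a non-empty `results` array is good enough to return, while still logging everything in `errors`.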

Last updated: 2026-04-29 | Author: 妙趣AI
