OpenClaw State Machine Workflows - Giving Agents Rules to Follow

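At 2:47 a.m. I was stuck on one question: why do some agents lose their way partway through a run? Because they have no state. An agent without state is like a taxi without navigation: it knows the destination, but not where it is now or which turn to take next.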

1. What Is a State Machine Workflow

A state machine defines every state an agent can be in, together with the rules for moving from one state to another:

[Idle] --user request--> [Analyzing] --needs tool--> [Executing tool]
  ^                                                         |
  |                                                         v
[Done] <--------------output result--------------- [Generating reply]

2. Defining States and Transitions

state_machine:
  name: task_agent
  
  states:
    idle:
      description: "Waiting for a task"
      entry_action: reset_context

    analyzing:
      description: "Analyzing user intent"
      timeout: 30s
      on_timeout: "fallback"

    executing:
      description: "Running a tool or operation"
      max_retries: 3

    responding:
      description: "Generating the reply"

    error:
      description: "Error state"
      
  transitions:
    idle -> analyzing:
      trigger: user_message
      guard: "message_length > 0"
      
    analyzing -> executing:
      condition: "needs_tool_call"
      
    analyzing -> responding:
      condition: "can_answer_directly"
      
    executing -> responding:
      condition: "tool_call_success"
      
    executing -> error:
      condition: "all_retries_failed"
      
    responding -> idle:
      trigger: response_sent
      
    error -> idle:
      trigger: error_resolved
      action: log_error
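
To make the transition table concrete, here is a minimal Python sketch of the same states and transitions. It is not OpenClaw's implementation: the `TRANSITIONS` dict, the `step` and `run` helpers, and the choice to treat both triggers and conditions as plain event names are assumptions made for illustration.

```python
# Hypothetical sketch: the task_agent transition table as a plain dict.
# Triggers and conditions from the YAML are both modelled as event names.
TRANSITIONS = {
    ("idle", "user_message"): "analyzing",
    ("analyzing", "needs_tool_call"): "executing",
    ("analyzing", "can_answer_directly"): "responding",
    ("executing", "tool_call_success"): "responding",
    ("executing", "all_retries_failed"): "error",
    ("responding", "response_sent"): "idle",
    ("error", "error_resolved"): "idle",
}


def step(state: str, event: str) -> str:
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events: list[str], state: str = "idle") -> list[str]:
    """Feed events in order and return the sequence of visited states."""
    path = [state]
    for event in events:
        state = step(state, event)
        path.append(state)
    return path


if __name__ == "__main__":
    print(run(["user_message", "needs_tool_call", "tool_call_success", "response_sent"]))
```

Rejecting unknown (state, event) pairs loudly is one design choice; an engine could just as well ignore them, so treat the `ValueError` as part of the sketch rather than documented behavior.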

3. Guard Conditions and Triggers

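# Guard: a condition checked before the transition is taken
transitions:
  idle -> payment:
    guard: |
      AND(
        user_authenticated,
        cart_not_empty,
        total > 0
      )

# Trigger: the event that fires the transition
transitions:
  waiting -> processing:
    trigger: schedule  # fires on a schedule
    cron: "0 */2 * * *"  # minute 0 of every second hour

  processing -> done:
    trigger: condition  # fires when a condition becomes true
    condition: "all_items_processed"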
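
A guard can be read as a predicate over the agent's context that must hold before a transition is taken. The sketch below mirrors the `AND(...)` guard in Python, assuming the context is a simple mapping; `all_of`, `payment_guard` and `try_transition` are hypothetical names, not part of OpenClaw.

```python
# Hypothetical sketch of guard evaluation; not OpenClaw's actual engine.
from typing import Any, Callable, Mapping

Context = Mapping[str, Any]
Guard = Callable[[Context], bool]


def all_of(*guards: Guard) -> Guard:
    """Combine guards so the result passes only if every guard passes (AND)."""
    return lambda ctx: all(g(ctx) for g in guards)


# Mirrors AND(user_authenticated, cart_not_empty, total > 0) from the YAML.
payment_guard = all_of(
    lambda ctx: bool(ctx.get("user_authenticated")),
    lambda ctx: bool(ctx.get("cart_not_empty")),
    lambda ctx: ctx.get("total", 0) > 0,
)


def try_transition(state: str, target: str, guard: Guard, ctx: Context) -> str:
    """Move to target only when the guard holds; otherwise stay put."""
    return target if guard(ctx) else state


if __name__ == "__main__":
    ctx = {"user_authenticated": True, "cart_not_empty": True, "total": 19.9}
    print(try_transition("idle", "payment", payment_guard, ctx))
```

Missing context keys are treated as falsy here, so an incomplete context blocks the transition instead of raising.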

4. State Persistence

state_machine:
  persistence:
    enabled: true
    
    # When to save
    save_on:
      - state_enter
      - state_exit
      - periodic  # every 60 seconds, see `period`
    period: 60s

    # Where to store snapshots
    storage: redis
    key: "sm:${agent_id}:${session_id}"
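
The idea behind this configuration can be sketched in a few lines of Python: serialize the current state and context, and store it under a key built from the agent and session IDs. This is an illustration only. A dict stands in for Redis so the example runs without a server, and `SnapshotStore` and its methods are hypothetical names.

```python
# Hypothetical sketch: snapshot state under a key shaped like the YAML above.
# A dict stands in for Redis so the example runs without a server.
import json
from string import Template
from typing import Optional

KEY_TEMPLATE = Template("sm:${agent_id}:${session_id}")


class SnapshotStore:
    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def key(self, agent_id: str, session_id: str) -> str:
        """Fill in the key template for one agent session."""
        return KEY_TEMPLATE.substitute(agent_id=agent_id, session_id=session_id)

    def save(self, agent_id: str, session_id: str, state: str, context: dict) -> None:
        """Serialize the state and context as JSON and store them."""
        payload = json.dumps({"state": state, "context": context})
        self._data[self.key(agent_id, session_id)] = payload

    def load(self, agent_id: str, session_id: str) -> Optional[dict]:
        """Return the last snapshot, or None if the session was never saved."""
        raw = self._data.get(self.key(agent_id, session_id))
        return json.loads(raw) if raw is not None else None


if __name__ == "__main__":
    store = SnapshotStore()
    store.save("my_agent", "sess_abc123", "executing", {"retries": 1})
    print(store.load("my_agent", "sess_abc123"))
```

Calling `save` from the state-enter and state-exit hooks, plus a timer, would correspond to the three `save_on` entries.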

5. Complex State Patterns

5.1 Parallel States

parallel_states:
  name: multi_step_agent
  
  states:
    main: [idle, processing, done]
    
    # Sub-machines that run in parallel
    sub_machines:
      validation:
        states: [pending, valid, invalid]
      enrichment:
        states: [pending, enriching, complete]
        
    # The main state and the sub-machines run independently
    sync: false
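
One way to picture parallel states is as several small machines that each keep their own current state and receive the same events. In the sketch below the state names come from the YAML above, while the event names (`start`, `finish`, `check_ok`, `check_failed`, `enriched`) and the `dispatch` function are assumptions, since the configuration lists only states.

```python
# Hypothetical sketch of parallel regions: each region keeps its own state
# and reacts to an event only if it has a matching transition.
REGIONS = {
    "main": {
        ("idle", "start"): "processing",
        ("processing", "finish"): "done",
    },
    "validation": {
        ("pending", "check_ok"): "valid",
        ("pending", "check_failed"): "invalid",
    },
    "enrichment": {
        ("pending", "start"): "enriching",
        ("enriching", "enriched"): "complete",
    },
}


def dispatch(current: dict[str, str], event: str) -> dict[str, str]:
    """Apply one event to every region independently (as with sync: false)."""
    return {
        region: REGIONS[region].get((state, event), state)
        for region, state in current.items()
    }


if __name__ == "__main__":
    states = {"main": "idle", "validation": "pending", "enrichment": "pending"}
    for event in ["start", "check_ok", "enriched", "finish"]:
        states = dispatch(states, event)
        print(event, states)
```

A region with no matching transition simply keeps its state, which is what lets the regions advance at different speeds.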

5.2 Hierarchical States

hierarchical:
  active:  # parent state
    substates:
      - reading_input
      - processing
      - generating_output
      
    # Any substate can reach error through the parent's transition
    parent_transitions:
      active -> error:  # fires from whichever substate is current
        trigger: critical_error
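
Hierarchical states are commonly resolved by checking the current substate first and then walking up to its parent. The sketch below shows that lookup under stated assumptions: the substates and the `critical_error` transition come from the YAML above, while the `input_ready` and `result_ready` events and the `next_state` function are hypothetical.

```python
# Hypothetical sketch of hierarchical lookup: try the current state first,
# then walk up through its parents until some level handles the event.
from typing import Optional

PARENT = {
    "reading_input": "active",
    "processing": "active",
    "generating_output": "active",
}

TRANSITIONS = {
    ("reading_input", "input_ready"): "processing",        # assumed event name
    ("processing", "result_ready"): "generating_output",   # assumed event name
    ("active", "critical_error"): "error",                 # from the YAML above
}


def next_state(state: str, event: str) -> str:
    """Resolve an event against the state, then against its ancestors."""
    node: Optional[str] = state
    while node is not None:
        target = TRANSITIONS.get((node, event))
        if target is not None:
            return target
        node = PARENT.get(node)
    return state  # unhandled events leave the state unchanged


if __name__ == "__main__":
    print(next_state("processing", "critical_error"))
```

Because the error transition is declared once on the parent, adding a fourth substate needs only a new `PARENT` entry to inherit it.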

6. Visualization and Debugging

# Render the state machine as a diagram
openclaw state-machine visualize --agent my_agent --output diagram.svg

# Query the current state
openclaw state-machine status --session-id sess_abc123

# Show the state change history
openclaw state-machine history --session-id sess_abc123
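
If you want a diagram without going through the CLI, a transition table can also be turned into Graphviz DOT text by hand. The sketch below is independent of the `openclaw` commands above and says nothing about how they work internally; `to_dot` is a hypothetical helper.

```python
# Hypothetical sketch: emit Graphviz DOT text for a transition table.
# This is independent of the openclaw CLI shown above.
def to_dot(name: str, transitions: dict[tuple[str, str], str]) -> str:
    """Render {(source, event): target} as a DOT digraph."""
    lines = [f"digraph {name} {{", "  rankdir=LR;"]
    for (source, event), target in transitions.items():
        lines.append(f'  "{source}" -> "{target}" [label="{event}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot("task_agent", {
        ("idle", "user_message"): "analyzing",
        ("analyzing", "needs_tool_call"): "executing",
        ("executing", "tool_call_success"): "responding",
    }))
```

With Graphviz installed, the printed text can be piped into `dot -Tsvg -o diagram.svg` to get an image.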

7. In Practice: An Order-Processing State Machine

state_machine:
  name: order_agent
  
  states:
    - receiving
    - validating
    - payment_pending
    - payment_processing
    - fulfillment
    - completed
    - cancelled
    
  transitions:
    receiving -> validating: 
      trigger: order_submitted
    validating -> payment_pending:
      condition: validation_passed
    validating -> cancelled:
      condition: validation_failed
    payment_pending -> payment_processing:
      trigger: payment_initiated
    payment_processing -> fulfillment:
      condition: payment_success
    payment_processing -> cancelled:
      condition: payment_failed
      after: 3_retries
    fulfillment -> completed:
      trigger: items_shipped
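
A table like this is easy to sanity-check before it goes live. The sketch below copies the order transitions into a plain adjacency map and runs two static checks: which states are reachable from `receiving`, and which states have no way out. `EDGES`, `reachable` and `terminal_states` are hypothetical names used only for this illustration.

```python
# Hypothetical sketch: static checks on the order_agent transitions.
from collections import deque

STATES = [
    "receiving", "validating", "payment_pending", "payment_processing",
    "fulfillment", "completed", "cancelled",
]

EDGES = {
    "receiving": ["validating"],
    "validating": ["payment_pending", "cancelled"],
    "payment_pending": ["payment_processing"],
    "payment_processing": ["fulfillment", "cancelled"],
    "fulfillment": ["completed"],
}


def reachable(start: str) -> set[str]:
    """Breadth-first search over the transition graph."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in EDGES.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def terminal_states() -> list[str]:
    """States with no outgoing transition."""
    return [s for s in STATES if not EDGES.get(s)]


if __name__ == "__main__":
    print("unreachable:", sorted(set(STATES) - reachable("receiving")))
    print("terminal:", terminal_states())
```

Every state is reachable here, and the only dead ends are `completed` and `cancelled`. The same check would flag a state that was declared but never wired into a transition.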

8. Best Practices

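✅ Keep the number of states between 5 and 10
✅ Give every transition an explicit trigger or condition
✅ Always define an error state and a recovery path
✅ Enable persistence for critical states
✅ Use the visualization tool to verify the logic
✅ Set a reasonable timeout for every state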

💡 State machines suit flows that are highly deterministic. For free-form conversation, consider a more flexible Workflow.