一个Agent干活太慢?那就一百个一起干。但怎么让它们不互相打架,是个技术活。
分布式Agent协调是OpenClaw面向大规模任务的集群编排能力。当单个Agent处理不过来时(比如需要同时监控1000个网站、处理10000个文档),你可以部署多个Agent节点,通过OpenClaw的协调机制实现任务分发、负载均衡和故障转移。就像一支交响乐队——每个乐手独立演奏,但所有人在指挥下形成完美的和声。
# cluster-config.yaml
cluster:
name: "miaoquai-prod"
region: "asia-east-1"
nodes:
- id: "node-01"
host: "10.0.1.1"
role: "coordinator" # 主节点
capacity: 20
- id: "node-02"
host: "10.0.1.2"
role: "worker"
capacity: 10
- id: "node-03"
host: "10.0.1.3"
role: "worker"
capacity: 10
consensus:
protocol: "raft" # raft | paxos | custom
electionTimeout: "5s"
heartbeatInterval: "1s"
discovery:
method: "static" # static | dns | k8s
nodes: ["node-01:8080", "node-02:8080", "node-03:8080"]
# task-distribution-config.yaml
distribution:
strategy: "adaptive" # round-robin | hash | least-loaded | adaptive
adaptive:
rebalanceInterval: "30s"
threshold: 0.8 # 负载超过80%时重新分配
taskQueue:
type: "priority"
maxPending: 10000
retryPolicy:
maxAttempts: 3
backoff: "exponential"
maxBackoff: "5m"
routing:
affinity: true # 同类型任务倾向分配到同一节点
antiAffinity: false
locality: "zone" # same-zone | cross-zone | any
# fault-tolerance-config.yaml
faultTolerance:
healthCheck:
interval: "10s"
timeout: "5s"
unhealthyThreshold: 3
healthyThreshold: 2
failover:
strategy: "active-passive" # active-active | active-passive
failoverTime: "30s"
taskMigration:
enabled: true
migrateInProgress: true # 迁移进行中的任务
checkpointInterval: "1m"
circuitBreaker:
enabled: true
failureThreshold: 5
recoveryTimeout: "60s"
halfOpenRequests: 3
| 任务规模 | 节点数 | 推荐架构 |
|---|---|---|
| 轻量(<100任务/天) | 1-2节点 | 单节点 + 冷备 |
| 中等(100-1000/天) | 3-5节点 | Raft集群 |
| 大规模(>1000/天) | 5-20节点 | 多区域 + 分片 |
const { OpenClawCluster, TaskRouter } = require('@openclaw/cluster');
async function distributeTask(task) {
const cluster = new OpenClawCluster({
configPath: './cluster-config.yaml',
autoDiscovery: true
});
// 1. 获取集群状态
const status = await cluster.getStatus();
console.log(`集群节点: ${status.activeNodes}/${status.totalNodes}`);
console.log(`活跃任务: ${status.activeTasks}`);
// 2. 智能路由
const router = new TaskRouter(cluster);
const targetNode = await router.route(task, {
strategy: 'least-loaded',
constraints: {
requireGPU: task.needsGPU,
preferSameZone: true,
maxLatency: '100ms'
}
});
// 3. 分发任务
const job = await cluster.submitTask(targetNode, {
type: task.type,
payload: task.payload,
priority: task.priority || 'normal',
timeout: task.timeout || '10m',
checkpoint: true,
onProgress: (progress) => {
console.log(`进度: ${progress.percent}% (${progress.message})`);
}
});
// 4. 等待结果(支持故障转移)
const result = await job.wait({
failover: true,
timeout: task.timeout || '10m'
});
return result;
}
async function mapReduce(cluster, documents, mapFn, reduceFn) {
// Map阶段:分发到各节点并行处理
const mapTasks = documents.map((doc, i) => ({
id: `map-${i}`,
type: 'map',
payload: { document: doc, mapFunction: mapFn },
priority: 'normal'
}));
// 并行提交所有map任务
const mapResults = await Promise.allSettled(
mapTasks.map(task => cluster.submitTask(task))
);
// 收集成功的map结果
const successfulMaps = mapResults
.filter(r => r.status === 'fulfilled')
.map(r => r.value.data);
console.log(`Map完成: ${successfulMaps.length}/${mapTasks.length}`);
// Reduce阶段:合并结果
const reduceTask = await cluster.submitTask({
id: 'reduce-0',
type: 'reduce',
payload: {
inputs: successfulMaps,
reduceFunction: reduceFn
},
priority: 'high',
requireNode: 'coordinator' // 在主节点执行reduce
});
const finalResult = await reduceTask.wait();
return finalResult.data;
}
📅 更新时间:2026-05-11 | 📖 更多OpenClaw教程请访问 工具教程索引