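🏭 The Complete Guide to Enterprise OpenClaw Deployment
It's 4:02 a.m. and the phone wakes you up: "Production is down again"...
This article shows how to deploy OpenClaw properly, so that your AI agent stays rock-solid in production.
Deployment Architecture Overview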
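| Option | Complexity | Suitable scale | Characteristics |
|---|---|---|---|
| Docker | ⭐ | Small | Simple to use, fast to deploy |
| Docker Compose | ⭐⭐ | Small to medium | Multi-service orchestration |
| Kubernetes | ⭐⭐⭐⭐⭐ | Large / production | High availability, autoscaling |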
Docker Deployment
Getting the Image
# Pull from Docker Hub
docker pull openclaw/openclaw:latest
# Or pull a specific version
docker pull openclaw/openclaw:2.4.0
Basic Run
docker run -d \
  --name openclaw \
  -p 18789:18789 \
  -v ~/.openclaw:/root/.openclaw \
  -e ANTHROPIC_API_KEY=your-api-key \
  openclaw/openclaw:latest
Production-Grade Configuration
docker run -d \
  --name openclaw \
  --restart unless-stopped \
  -p 18789:18789 \
  -p 18788:18788 \
  -v ~/.openclaw:/config:ro \
  -v openclaw-logs:/var/log/openclaw \
  -v openclaw-data:/data \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY}" \
  -e OPENCLAW_CONFIG=/config/openclaw.json \
  -e LOG_LEVEL=info \
  -e NODE_ENV=production \
  --health-cmd="curl -f http://localhost:18789/health || exit 1" \
  --health-interval=30s \
  --health-timeout=10s \
  --health-retries=3 \
  openclaw/openclaw:latest
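Because the command above configures a health check, Docker tracks a health status for the container. The sketch below polls that status until it reads "healthy". The function name, retry count, and delay are illustrative choices, not part of OpenClaw.

```bash
# Poll Docker's health status for a container until it reports "healthy".
# Arguments: container name, max attempts (default 10), delay in seconds (default 5).
wait_for_healthy() {
  local name="$1" attempts="${2:-10}" delay="${3:-5}"
  local i=1 status=""
  while [ "$i" -le "$attempts" ]; do
    # .State.Health.Status exists only when a health check is configured.
    status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      echo "$name is healthy after $i check(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$name did not become healthy (last status: ${status:-unknown})" >&2
  return 1
}

# Usage: wait_for_healthy openclaw 12 5
```

A deploy script can call this right after `docker run` and abort the rollout if it returns non-zero.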
Docker Compose Deployment
docker-compose.yml
version: '3.8'
services:
  openclaw:
    image: openclaw/openclaw:latest
    container_name: openclaw
    restart: unless-stopped
    ports:
      - "18789:18789"
      - "18788:18788"
    volumes:
      - ./config:/config:ro
      - ./data:/data
      - ./logs:/var/log/openclaw
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENCLAW_CONFIG=/config/openclaw.json
      - LOG_LEVEL=info
      - NODE_ENV=production
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:18789/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    networks:
      - openclaw-network
  # Optional: log aggregation
  loki:
    image: grafana/loki:latest
    container_name: openclaw-loki
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yml:/etc/loki/local-config.yaml:ro
      - loki-data:/loki
    networks:
      - openclaw-network
  # Optional: monitoring
  prometheus:
    image: prom/prometheus:latest
    container_name: openclaw-prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    networks:
      - openclaw-network
networks:
  openclaw-network:
    driver: bridge
volumes:
  loki-data:
  prometheus-data:
# Start all services
docker-compose up -d
# Follow the logs
docker-compose logs -f openclaw
# Stop all services
docker-compose down
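Compose fills `${ANTHROPIC_API_KEY}` from the shell environment or from a `.env` file in the project directory, and substitutes an empty string if neither provides it. A small pre-flight check, sketched here with a hypothetical function name, can catch that before any container starts.

```bash
# Fail early if the API key is missing, then validate the Compose file.
validate_compose() {
  # Look for the key in the environment first, then in ./.env.
  if [ -z "${ANTHROPIC_API_KEY:-}" ] && ! grep -qs '^ANTHROPIC_API_KEY=' .env; then
    echo "ANTHROPIC_API_KEY is not set in the environment or in .env" >&2
    return 1
  fi
  # "config --quiet" parses and validates the file without printing it.
  docker-compose config --quiet
}

# Usage: validate_compose && docker-compose up -d
```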
Kubernetes Deployment
Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: openclaw
  labels:
    app: openclaw
spec:
  replicas: 3
  selector:
    matchLabels:
      app: openclaw
  template:
    metadata:
      labels:
        app: openclaw
    spec:
      containers:
        - name: openclaw
          image: openclaw/openclaw:latest
          ports:
            - containerPort: 18789
              name: http
            - containerPort: 18788
              name: ws
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: openclaw-secrets
                  key: api-key
            - name: OPENCLAW_CONFIG
              value: "/config/openclaw.json"
            - name: NODE_ENV
              value: "production"
          volumeMounts:
            - name: config
              mountPath: /config
              readOnly: true
            - name: data
              mountPath: /data
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 18789
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: 18789
            initialDelaySeconds: 30
            periodSeconds: 10
      volumes:
        - name: config
          configMap:
            name: openclaw-config
        - name: data
          persistentVolumeClaim:
            claimName: openclaw-data
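The Deployment refers to a Secret named `openclaw-secrets` (key `api-key`) and a ConfigMap named `openclaw-config`, and both must exist before the pods can start. One way to create them is sketched below. The function name and the default config path are assumptions. The names and keys come from the manifest above.

```bash
# Create the Secret and ConfigMap that the Deployment references.
# Argument: path to the local openclaw.json (assumed default: ./config/openclaw.json).
create_openclaw_k8s_config() {
  local config_file="${1:-./config/openclaw.json}"
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY must be set" >&2
    return 1
  fi
  # Secret name and key match the secretKeyRef in the Deployment.
  kubectl create secret generic openclaw-secrets \
    --from-literal=api-key="$ANTHROPIC_API_KEY" || return 1
  # The ConfigMap key becomes the file name under /config in the pod.
  kubectl create configmap openclaw-config \
    --from-file=openclaw.json="$config_file"
}

# Usage: create_openclaw_k8s_config ./config/openclaw.json
```

The Deployment also expects a PersistentVolumeClaim named `openclaw-data`. Sharing one claim across three replicas generally requires storage that supports the ReadWriteMany access mode, unless all pods are scheduled onto the same node.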
Service
apiVersion: v1
kind: Service
metadata:
  name: openclaw
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 18789
      protocol: TCP
      name: http
  selector:
    app: openclaw
---
apiVersion: v1
kind: Service
metadata:
  name: openclaw-lb
spec:
  type: LoadBalancer
  ports:
    - port: 18789
      targetPort: 18789
      protocol: TCP
      name: http
  selector:
    app: openclaw
Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: openclaw-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: openclaw
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
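To get a feel for how these targets turn into replica counts, the documented HPA rule is desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to the min/max bounds. The helper below is a simplified sketch of that rule only. It ignores the controller's tolerance band and stabilization window, and the function name is made up for illustration.

```bash
# Approximate the HPA replica calculation for a single metric.
# Arguments: current replicas, current utilization %, target utilization %,
#            min replicas (default 2), max replicas (default 10).
desired_replicas() {
  local current="$1" usage="$2" target="$3" min="${4:-2}" max="${5:-10}"
  # ceil(current * usage / target) using integer arithmetic
  local desired=$(( (current * usage + target - 1) / target ))
  [ "$desired" -lt "$min" ] && desired="$min"
  [ "$desired" -gt "$max" ] && desired="$max"
  echo "$desired"
}

# Usage: desired_replicas 3 90 70   # 3 pods at 90% CPU against a 70% target
```

With both CPU and memory metrics configured, the autoscaler computes a proposal per metric and uses the larger one.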
High Availability Configuration
Multi-Replica Deployment
{
  "gateway": {
    "ha": {
      "enabled": true,
      "mode": "leader-election", // or "shared-state"
      "electionIntervalMs": 5000,
      "failoverTimeoutMs": 30000
    }
  }
}
Health Checks
{
  "health": {
    "enabled": true,
    "port": 18789,
    "endpoints": {
      "/health": { "type": "liveness" },
      "/ready": { "type": "readiness" },
      "/metrics": { "type": "readonly" }
    }
  }
}
Graceful Shutdown
{
  "gateway": {
    "gracefulShutdown": {
      "enabled": true,
      "timeoutMs": 30000,
      "drainConnections": true,
      "notifyUpstream": true
    }
  }
}
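A drain timeout only helps if the runtime waits at least that long before force-killing the process. `docker stop` waits 10 seconds by default, and Kubernetes defaults `terminationGracePeriodSeconds` to 30, so a 30000 ms drain leaves little or no headroom. The helper below is a hypothetical sketch that derives a stop timeout from `timeoutMs` plus a safety margin.

```bash
# Convert the drain timeout (milliseconds) into a stop grace period (seconds),
# rounding up and adding a safety margin (default 5 seconds).
stop_grace_seconds() {
  local timeout_ms="$1" margin_s="${2:-5}"
  echo $(( (timeout_ms + 999) / 1000 + margin_s ))
}

# Usage with Docker:
#   docker stop -t "$(stop_grace_seconds 30000)" openclaw
```

In Kubernetes the same number would go into `terminationGracePeriodSeconds` in the pod spec.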
Monitoring and Alerting
Prometheus Metrics
{
  "metrics": {
    "enabled": true,
    "port": 9090,
    "path": "/metrics",
    "collectors": [
      "requests_total",
      "request_duration_seconds",
      "active_sessions",
      "tokens_used",
      "api_cost",
      "errors_total"
    ]
  }
}
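For a quick check without Grafana, the metrics endpoint can be read directly. The sketch below sums every series of one metric from Prometheus text-format output on stdin. It assumes that metrics are exported with an `openclaw_` prefix (as in this guide's alert rules) and that samples carry no trailing timestamp. The function name is illustrative.

```bash
# Sum all series of one metric from Prometheus text-format input on stdin.
# Argument: metric name, without labels.
metric_sum() {
  awk -v name="$1" '
    /^#/ { next }                 # skip HELP and TYPE comment lines
    NF >= 2 {
      series = $1
      sub(/\{.*/, "", series)     # drop the label set, keep the metric name
      if (series == name) total += $NF
    }
    END { print total + 0 }
  '
}

# Usage (host and port are assumptions based on the config above):
#   curl -fsS http://localhost:9090/metrics | metric_sum openclaw_errors_total
```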
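Grafana Dashboards
The following dashboards are recommended:
- Request volume - QPS, request latency distribution
- Session state - active sessions, session creation/teardown rate
- Resource usage - CPU, memory, network
- Cost analysis - API call cost, token consumption
- Error rate - broken down by error type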
Alert Rules
groups:
  - name: openclaw
    rules:
      - alert: HighErrorRate
        expr: rate(openclaw_errors_total[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "OpenClaw error rate is too high"
      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(openclaw_request_duration_seconds_bucket[5m])) > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "OpenClaw response latency is too high"
      - alert: SessionLimit
        expr: openclaw_active_sessions / openclaw_max_sessions > 0.9
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Session count is approaching the limit"
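Security Hardening
🚨 Security settings that are mandatory in production
- API token authentication
- TLS/SSL encryption
- IP allowlist
- Rate limiting
- Audit logging
Security Configuration Example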
{
  "security": {
    "authentication": {
      "type": "token",
      "tokens": {
        "required": true,
        "header": "Authorization",
        "prefix": "Bearer"
      }
    },
    "tls": {
      "enabled": true,
      "certFile": "/path/to/cert.pem",
      "keyFile": "/path/to/key.pem"
    },
    "allowList": ["10.0.0.0/8", "172.16.0.0/12"],
    "rateLimit": {
      "enabled": true,
      "requestsPerMinute": 100,
      "burst": 20
    },
    "auditLog": {
      "enabled": true,
      "path": "/var/log/openclaw/audit.log"
    }
  }
}
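The `allowList` entries are CIDR ranges, and it is easy to misjudge which addresses a range like `172.16.0.0/12` covers. The sketch below assumes the allowlist is matched against the client's IPv4 address, which is an assumption about OpenClaw's behavior, and shows the membership test. All function names are illustrative, and the input is not validated.

```bash
# Convert a dotted IPv4 address to an integer.
ip_to_int() {
  # Split on dots by swapping them for spaces, then word-splitting.
  set -- $(printf '%s' "$1" | tr '.' ' ')
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if the address (arg 1) falls inside the CIDR range (arg 2).
in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}"
  local mask=0
  [ "$bits" -gt 0 ] && mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Succeed if the address (arg 1) matches any of the CIDR ranges that follow.
is_allowed() {
  local ip="$1" cidr
  shift
  for cidr in "$@"; do
    in_cidr "$ip" "$cidr" && return 0
  done
  return 1
}

# Usage: is_allowed 172.31.255.255 10.0.0.0/8 172.16.0.0/12 && echo allowed
```

With the example list, `172.31.255.255` is inside `172.16.0.0/12`, while `172.32.0.1` and `192.168.1.1` are not.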
Data Backup
Critical Data
- Configuration files (~/.openclaw/)
- Session data (/data/sessions/)
- Knowledge base index (/data/knowledge/)
- Log files (/var/log/openclaw/)
Backup Script
#!/bin/bash
# backup-openclaw.sh
BACKUP_DIR="/backup/openclaw"
DATE=$(date +%Y%m%d_%H%M%S)
# Create the backup directory
mkdir -p "$BACKUP_DIR/$DATE"
# Back up configuration
tar -czf "$BACKUP_DIR/$DATE/config.tar.gz" ~/.openclaw/
# Back up data
tar -czf "$BACKUP_DIR/$DATE/data.tar.gz" /data/
# Back up logs (optional)
# tar -czf "$BACKUP_DIR/$DATE/logs.tar.gz" /var/log/openclaw/
# Remove backups older than 30 days (only the dated top-level directories,
# never $BACKUP_DIR itself)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +
echo "Backup completed: $DATE"
Scheduled Backups (Cron)
# Run the backup every day at 3 a.m.
0 3 * * * /opt/scripts/backup-openclaw.sh >> /var/log/backup.log 2>&1
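A backup is only useful if it can be read back. As a minimal sketch, the function below checks that every `.tar.gz` file in one dated backup directory can be listed by `tar`. The function name is hypothetical, and listing an archive is a weaker guarantee than a full restore test.

```bash
# Verify that every .tar.gz archive in a backup directory is readable.
# Argument: path to one dated backup directory.
verify_backup() {
  local dir="$1" archive found=0
  for archive in "$dir"/*.tar.gz; do
    [ -e "$archive" ] || continue        # the glob matched nothing
    found=1
    # "tar -tzf" lists the archive and fails on truncated or corrupt files.
    if ! tar -tzf "$archive" >/dev/null 2>&1; then
      echo "CORRUPT: $archive" >&2
      return 1
    fi
  done
  [ "$found" -eq 1 ] || { echo "no archives in $dir" >&2; return 1; }
  echo "OK: $dir"
}

# Usage: verify_backup /backup/openclaw/20250101_030000
```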
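Troubleshooting Common Problems
Pod Fails to Start
- Check that the image was pulled successfully
- Check that the config ConfigMap is mounted
- Check that the Secret was created correctly
- Inspect the Events:
kubectl describe pod openclaw-xxx
Service Unreachable
- Check that the Service is configured correctly
- Check the Endpoints:
kubectl get endpoints openclaw
- Check network policies
- Check firewall rules
High Latency / Timeouts
- Check that the Pod resource limits are sufficient
- Use Prometheus metrics to locate the bottleneck
- Check the response times of backend services (API, database)
- Consider adding replicas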
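Most of the checks above map onto a handful of `kubectl` commands. The sketch below gathers them into one helper. The function name is made up, the `app=openclaw` selector comes from the manifests in this guide, and `kubectl top` only works if metrics-server is installed in the cluster.

```bash
# Collect basic diagnostics for the OpenClaw Deployment in the current namespace.
diagnose_openclaw() {
  local selector="app=openclaw"
  echo "== Pods =="
  kubectl get pods -l "$selector" -o wide
  echo "== Recent events =="
  kubectl get events --sort-by=.metadata.creationTimestamp | tail -n 20
  echo "== Service endpoints =="
  kubectl get endpoints openclaw
  echo "== Resource usage (needs metrics-server) =="
  kubectl top pods -l "$selector"
  echo "== Last 50 log lines per pod =="
  kubectl logs -l "$selector" --tail=50 --prefix
}

# Usage: diagnose_openclaw > openclaw-diagnostics.txt
# To add replicas manually: kubectl scale deployment openclaw --replicas=5
```

If the HorizontalPodAutoscaler is active, a manual `kubectl scale` is only temporary, because the autoscaler will adjust the replica count again.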
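🎓 Summary
Key points for enterprise deployment:
- Docker - the first choice at small scale: simple and fast
- Docker Compose - small to medium scale, multi-service orchestration
- Kubernetes - large-scale production, high availability and autoscaling
- Monitoring and alerting - mandatory; catch problems early, fix them early
- Security hardening - non-negotiable in production
- Data backup - back up regularly; better safe than sorry
Deploy well, and you get to sleep well.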