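The complete guide to enterprise OpenClaw deployment: Docker containerization, Kubernetes clusters, high availability, monitoring and alerting, and security hardening, everything a production environment needs.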

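🏭 The Complete Guide to Enterprise OpenClaw Deployment

It is 4:02 a.m. and the phone wakes you up: "Production is down again"...

This article shows you how to deploy OpenClaw the right way, so that your AI Agent stays rock-solid in production.

Deployment architecture overview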

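Option | Complexity | Suitable scale | Characteristics
Docker |  | Small | Simple to use, quick to deploy
Docker Compose | ⭐⭐ | Small to medium | Multi-service orchestration
Kubernetes | ⭐⭐⭐⭐⭐ | Large / production | High availability, autoscaling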

Docker deployment

Pulling the image

# Pull from Docker Hub
docker pull openclaw/openclaw:latest

# Or use a specific version
docker pull openclaw/openclaw:2.4.0

Basic run

docker run -d \
  --name openclaw \
  -p 18789:18789 \
  -v ~/.openclaw:/root/.openclaw \
  -e ANTHROPIC_API_KEY=your-api-key \
  openclaw/openclaw:latest

Production-grade configuration

# The config directory is mounted read-only at /config so that
# OPENCLAW_CONFIG=/config/openclaw.json resolves inside the container.
docker run -d \
  --name openclaw \
  --restart unless-stopped \
  -p 18789:18789 \
  -p 18788:18788 \
  -v ~/.openclaw:/config:ro \
  -v openclaw-logs:/var/log/openclaw \
  -v openclaw-data:/data \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY}" \
  -e OPENCLAW_CONFIG=/config/openclaw.json \
  -e LOG_LEVEL=info \
  -e NODE_ENV=production \
  --health-cmd="curl -f http://localhost:18789/health || exit 1" \
  --health-interval=30s \
  --health-timeout=10s \
  --health-retries=3 \
  openclaw/openclaw:latest
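
Because the command above registers a health check, Docker tracks a health status for the container. The sketch below polls that status until it reads "healthy", which is handy in deploy scripts. The function name wait_for_healthy and its default attempt count and interval are illustrative choices, not part of OpenClaw.

```shell
# Poll Docker's health status until the container reports "healthy".
# Usage: wait_for_healthy <container> [max_attempts] [interval_seconds]
wait_for_healthy() (
  container=$1
  max_attempts=${2:-20}
  interval=${3:-3}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Prints starting, healthy, or unhealthy; empty if the container is missing
    # or has no health check configured.
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null || true)
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      exit 0
    fi
    echo "attempt $attempt/$max_attempts: status=${status:-unknown}" >&2
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "$container did not become healthy" >&2
  exit 1
)
```

A typical call after docker run would be: wait_for_healthy openclaw 20 3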

Docker Compose deployment

docker-compose.yml

version: '3.8'

services:
  openclaw:
    image: openclaw/openclaw:latest
    container_name: openclaw
    restart: unless-stopped
    ports:
      - "18789:18789"
      - "18788:18788"
    volumes:
      - ./config:/config:ro
      - ./data:/data
      - ./logs:/var/log/openclaw
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENCLAW_CONFIG=/config/openclaw.json
      - LOG_LEVEL=info
      - NODE_ENV=production
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:18789/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    networks:
      - openclaw-network

  # Optional: log aggregation
  loki:
    image: grafana/loki:latest
    container_name: openclaw-loki
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yml:/etc/loki/local-config.yaml:ro
      - loki-data:/loki
    networks:
      - openclaw-network

  # Optional: monitoring
  prometheus:
    image: prom/prometheus:latest
    container_name: openclaw-prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    networks:
      - openclaw-network

networks:
  openclaw-network:
    driver: bridge

volumes:
  loki-data:
  prometheus-data:

# Start all services
docker-compose up -d

# Follow the logs
docker-compose logs -f openclaw

# Stop all services
docker-compose down

Kubernetes deployment

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: openclaw
  labels:
    app: openclaw
spec:
  replicas: 3
  selector:
    matchLabels:
      app: openclaw
  template:
    metadata:
      labels:
        app: openclaw
    spec:
      containers:
      - name: openclaw
        image: openclaw/openclaw:latest
        ports:
        - containerPort: 18789
          name: http
        - containerPort: 18788
          name: ws
        env:
        - name: ANTHROPIC_API_KEY
          valueFrom:
            secretKeyRef:
              name: openclaw-secrets
              key: api-key
        - name: OPENCLAW_CONFIG
          value: "/config/openclaw.json"
        - name: NODE_ENV
          value: "production"
        volumeMounts:
        - name: config
          mountPath: /config
          readOnly: true
        - name: data
          mountPath: /data
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "2Gi"
            cpu: "2000m"
        livenessProbe:
          httpGet:
            path: /health
            port: 18789
          initialDelaySeconds: 60
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /ready
            port: 18789
          initialDelaySeconds: 30
          periodSeconds: 10
      volumes:
      - name: config
        configMap:
          name: openclaw-config
      - name: data
        persistentVolumeClaim:
          claimName: openclaw-data
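
The Deployment above references a Secret named openclaw-secrets (key api-key), a ConfigMap named openclaw-config, and a PersistentVolumeClaim named openclaw-data, none of which are defined in this guide. The following is a minimal sketch for creating the first two with kubectl. The function name, the namespace argument, and the local config path are assumptions for illustration, and the PVC still has to be created separately to match your storage class.

```shell
# Create the Secret and ConfigMap that the Deployment references.
# Usage: create_openclaw_prereqs <namespace> <path-to-openclaw.json>
# Reads the API key from the ANTHROPIC_API_KEY environment variable.
create_openclaw_prereqs() (
  namespace=$1
  config_file=$2
  : "${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY must be set}"

  # Secret name and key must match secretKeyRef in the Deployment.
  kubectl -n "$namespace" create secret generic openclaw-secrets \
    --from-literal=api-key="$ANTHROPIC_API_KEY" || exit 1

  # Stored under the key openclaw.json, so the file shows up as /config/openclaw.json.
  kubectl -n "$namespace" create configmap openclaw-config \
    --from-file=openclaw.json="$config_file" || exit 1
)
```

Example: create_openclaw_prereqs default ./config/openclaw.json. Note that --from-literal exposes the key in the local process list, so on shared machines a file-based source may be preferable.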

Service

apiVersion: v1
kind: Service
metadata:
  name: openclaw
spec:
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 18789
    protocol: TCP
    name: http
  selector:
    app: openclaw
---
apiVersion: v1
kind: Service
metadata:
  name: openclaw-lb
spec:
  type: LoadBalancer
  ports:
  - port: 18789
    targetPort: 18789
    protocol: TCP
    name: http
  selector:
    app: openclaw

Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: openclaw-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: openclaw
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
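
To get a feel for how these targets translate into replica counts, the sketch below applies the formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), and clamps the result to minReplicas and maxReplicas. It is a simplification: the real controller also applies a tolerance band and stabilization windows, and when several metrics are configured it uses the largest result. The function name desired_replicas is illustrative.

```shell
# Estimate the replica count the HPA would aim for, for a single metric.
# Usage: desired_replicas <current_replicas> <current_util_%> <target_util_%> <min> <max>
desired_replicas() (
  current=$1
  utilization=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling of current * utilization / target.
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
)
```

For example, desired_replicas 3 90 70 2 10 models three pods averaging 90% CPU against the 70% target and prints 4.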

High availability configuration

Multi-replica deployment

{
  "gateway": {
    "ha": {
      "enabled": true,
      "mode": "leader-election",  // or "shared-state"
      "electionIntervalMs": 5000,
      "failoverTimeoutMs": 30000
    }
  }
}

Health checks

{
  "health": {
    "enabled": true,
    "port": 18789,
    "endpoints": {
      "/health": { "type": "liveness" },
      "/ready": { "type": "readiness" },
      "/metrics": { "type": "readonly" }
    }
  }
}

Graceful shutdown

{
  "gateway": {
    "gracefulShutdown": {
      "enabled": true,
      "timeoutMs": 30000,
      "drainConnections": true,
      "notifyUpstream": true
    }
  }
}
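
On Kubernetes, the kubelet sends SIGTERM and then waits terminationGracePeriodSeconds (30 by default) before killing the pod. With timeoutMs set to 30000, that default leaves no headroom for draining. The sketch below patches the Deployment so the grace period is the drain timeout plus a buffer. It assumes OpenClaw begins draining on SIGTERM, which this guide does not state, and the function name and the 15-second default buffer are illustrative.

```shell
# Set terminationGracePeriodSeconds to the gateway drain timeout plus a buffer.
# Usage: set_grace_period <deployment> <timeout_ms> [buffer_seconds]
set_grace_period() (
  deployment=$1
  timeout_ms=$2
  buffer_seconds=${3:-15}
  # Round the timeout up to whole seconds, then add the buffer.
  grace_seconds=$(( (timeout_ms + 999) / 1000 + buffer_seconds ))
  patch=$(printf '{"spec":{"template":{"spec":{"terminationGracePeriodSeconds":%d}}}}' "$grace_seconds")
  kubectl patch deployment "$deployment" -p "$patch"
)
```

Calling set_grace_period openclaw 30000 requests a 45-second grace period. Setting the same field directly in the Deployment manifest achieves the same result and keeps it under version control.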

Monitoring and alerting

Prometheus metrics

{
  "metrics": {
    "enabled": true,
    "port": 9090,
    "path": "/metrics",
    "collectors": [
      "requests_total",
      "request_duration_seconds",
      "active_sessions",
      "tokens_used",
      "api_cost",
      "errors_total"
    ]
  }
}

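Grafana dashboards

We recommend setting up the following dashboards:

  • Request volume - QPS, request latency distribution
  • Session state - active sessions, session creation/teardown rate
  • Resource usage - CPU, memory, network
  • Cost analysis - API call cost, token consumption
  • Error rate - broken down by error type

Alerting rules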

groups:
- name: openclaw
  rules:
  - alert: HighErrorRate
    expr: rate(openclaw_errors_total[5m]) > 0.1
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "OpenClaw error rate is too high"
      
  - alert: HighLatency
    expr: histogram_quantile(0.95, rate(openclaw_request_duration_seconds_bucket[5m])) > 5
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "OpenClaw response latency is too high"
      
  - alert: SessionLimit
    expr: openclaw_active_sessions / openclaw_max_sessions > 0.9
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Session count is approaching the limit"
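
Before loading rules like these into Prometheus, it is worth validating the file. The sketch below runs promtool check rules from the prom/prometheus image already used in the Compose setup, so nothing needs to be installed locally. It assumes the image ships promtool on its PATH, and the function name and the rules file name used in the example are hypothetical.

```shell
# Validate a Prometheus rules file with promtool from the prom/prometheus image.
# Usage: check_alert_rules <rules-file>
check_alert_rules() (
  rules_file=$1
  if [ ! -f "$rules_file" ]; then
    echo "rules file not found: $rules_file" >&2
    exit 1
  fi
  # Docker bind mounts need an absolute host path.
  rules_dir=$(cd "$(dirname "$rules_file")" && pwd) || exit 1
  rules_name=$(basename "$rules_file")
  docker run --rm \
    -v "$rules_dir:/rules:ro" \
    --entrypoint promtool \
    prom/prometheus:latest \
    check rules "/rules/$rules_name"
)
```

Example: check_alert_rules ./openclaw-alerts.yml. The function returns promtool's exit status, so it can gate a CI step.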

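Security hardening

🚨 Security items that must be configured in production
  • API token authentication
  • TLS/SSL encryption
  • IP allowlist
  • Rate limiting
  • Audit logging

Example security configuration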

{
  "security": {
    "authentication": {
      "type": "token",
      "tokens": {
        "required": true,
        "header": "Authorization",
        "prefix": "Bearer"
      }
    },
    "tls": {
      "enabled": true,
      "certFile": "/path/to/cert.pem",
      "keyFile": "/path/to/key.pem"
    },
    "allowList": ["10.0.0.0/8", "172.16.0.0/12"],
    "rateLimit": {
      "enabled": true,
      "requestsPerMinute": 100,
      "burst": 20
    },
    "auditLog": {
      "enabled": true,
      "path": "/var/log/openclaw/audit.log"
    }
  }
}
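
Once token authentication is switched on, a quick smoke test can confirm that it is actually enforced. The sketch below sends one anonymous request and one with the Authorization: Bearer header from the configuration above. It assumes the gateway answers anonymous requests with 401 or 403 and authorized ones with 200. Those status codes, the function names, and the URL in the example are assumptions, not documented OpenClaw behavior.

```shell
# Print the HTTP status code for a request, with or without a bearer token.
# Usage: http_status <url> [token]
http_status() (
  url=$1
  token=${2:-}
  if [ -n "$token" ]; then
    curl -s -o /dev/null -w '%{http_code}' -H "Authorization: Bearer $token" "$url"
  else
    curl -s -o /dev/null -w '%{http_code}' "$url"
  fi
)

# Succeed only if anonymous calls are rejected and the token is accepted.
# Usage: check_token_auth <url> <token>
check_token_auth() (
  url=$1
  token=$2
  anonymous=$(http_status "$url")
  authorized=$(http_status "$url" "$token")
  echo "anonymous=$anonymous authorized=$authorized"
  case "$anonymous" in
    401|403) ;;
    *) echo "anonymous request was not rejected" >&2; exit 1 ;;
  esac
  [ "$authorized" = "200" ] || { echo "token was not accepted" >&2; exit 1; }
)
```

Example: check_token_auth https://openclaw.example.internal:18789/ "$OPENCLAW_TOKEN", where both the host and the variable name are placeholders.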

Data backup

Critical data

Backup script

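#!/bin/bash
# backup-openclaw.sh
# Stop on the first error, on unset variables, and on failures inside pipelines
set -euo pipefail

BACKUP_DIR="/backup/openclaw"
DATE=$(date +%Y%m%d_%H%M%S)

# Create the backup directory
mkdir -p "$BACKUP_DIR/$DATE"

# Back up the configuration
tar -czf "$BACKUP_DIR/$DATE/config.tar.gz" ~/.openclaw/

# Back up the data
tar -czf "$BACKUP_DIR/$DATE/data.tar.gz" /data/

# Back up the logs (optional)
# tar -czf "$BACKUP_DIR/$DATE/logs.tar.gz" /var/log/openclaw/

# Remove backups older than 30 days (only the dated top-level directories,
# never $BACKUP_DIR itself)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +

echo "Backup completed: $DATE"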
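
A backup is only useful if it can be restored, so a matching restore routine is worth keeping next to backup-openclaw.sh. The sketch below unpacks config.tar.gz and data.tar.gz from one dated backup directory. It assumes GNU tar, which strips the leading / from member names at backup time, so extracting with a target root of / puts files back in their original locations. The function name and the optional target_root argument are illustrative.

```shell
# Restore one backup set created by backup-openclaw.sh.
# Usage: restore_openclaw_backup <backup_dir> [target_root]
restore_openclaw_backup() (
  backup_dir=$1
  target_root=${2:-/}
  # Check that both archives exist before touching the target.
  for archive in config.tar.gz data.tar.gz; do
    if [ ! -f "$backup_dir/$archive" ]; then
      echo "missing $backup_dir/$archive" >&2
      exit 1
    fi
  done
  mkdir -p "$target_root" || exit 1
  for archive in config.tar.gz data.tar.gz; do
    tar -xzf "$backup_dir/$archive" -C "$target_root" || exit 1
  done
  echo "Restored $backup_dir into $target_root"
)
```

Restoring into a scratch directory first, for example restore_openclaw_backup /backup/openclaw/20250101_030000 /tmp/restore-test (the date here is a made-up example), is a cheap way to verify a backup without overwriting live data.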

Scheduled backups (cron)

# Run the backup every day at 3 a.m.
0 3 * * * /opt/scripts/backup-openclaw.sh >> /var/log/backup.log 2>&1

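Troubleshooting common problems

Pod fails to start

Service is unreachable

High latency / timeouts

🎓 Summary

Key points for enterprise deployment:

  • Docker - first choice for small deployments, simple and fast
  • Docker Compose - small to medium deployments, multi-service orchestration
  • Kubernetes - large-scale production, high availability and autoscaling
  • Monitoring and alerting - mandatory; catch problems early and fix them early
  • Security hardening - non-negotiable in production
  • Data backup - back up regularly; better safe than sorry

Deploy well, and you get to sleep well.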