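Kubernetes Deployment
Deploy ai-fixer to a Kubernetes cluster with the Helm chart.
Important: the chart deploys only the ai-fixer application. PostgreSQL and Redis run as separate infrastructure services, so have their connection details ready before you start.
Prerequisites
- Kubernetes 1.24+
- Helm 3.0+
- kubectl configured for the target cluster
- PostgreSQL 14+ already deployed (external service)
- Redis 6.0+ already deployed (external service)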
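The version minimums above can be checked with a small preflight helper. This is a minimal sketch, not part of the project: the function names `version_ge` and `check_tool` are hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils or similar).

```bash
#!/usr/bin/env bash
# Preflight sketch: compare an installed version against a required minimum.

# version_ge A B: succeeds when version A >= version B.
# Sorts the two versions with `sort -V`; if B sorts first (or ties), A is new enough.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_tool NAME INSTALLED MINIMUM: prints a one-line verdict, returns 1 on failure.
check_tool() {
  if version_ge "$2" "$3"; then
    echo "OK   $1 $2 (>= $3)"
  else
    echo "FAIL $1 $2 (< $3)"
    return 1
  fi
}

# Example (requires helm on PATH):
#   check_tool helm "$(helm version --template '{{.Version}}' | sed 's/^v//')" 3.0
```

The Kubernetes server version can be passed in the same way once you have read it from `kubectl version`.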
Quick deployment
1. Clone the project
bash
git clone https://github.com/FLM210/ai-fixer.git
cd ai-fixer/deploy/helm/k8s-fixer

2. Create the namespace
bash
kubectl create namespace ai-fixer

3. Create the Secret
bash
kubectl create secret generic ai-fixer-secrets \
  --from-literal=database-url='postgresql+asyncpg://user:password@your-postgres-host:5432/fixer' \
  --from-literal=redis-url='redis://your-redis-host:6379/0' \
  --from-literal=llm-api-key='sk-xxxxxxxx' \
  -n ai-fixer

4. Configure values.yaml
bash
cp values.yaml my-values.yaml

Edit my-values.yaml:
yaml
# Image
image:
  repository: hahtangtang/ai-fixer
  tag: latest
# Number of replicas
replicaCount: 1
# Service
service:
  type: ClusterIP
  port: 8080
# Environment variables, read from the Secret created in step 3
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: ai-fixer-secrets
        key: database-url
  - name: REDIS_URL
    valueFrom:
      secretKeyRef:
        name: ai-fixer-secrets
        key: redis-url
  - name: LLM_API_KEY
    valueFrom:
      secretKeyRef:
        name: ai-fixer-secrets
        key: llm-api-key
# Application settings
config:
  LLM_PROVIDER: "anthropic"
  LLM_MODEL: "claude-3-5-sonnet-20241022"
  LOG_LEVEL: "info"
# Do not let the chart create a Secret (it was created manually in step 3)
secrets:
  create: false

5. Deploy
bash
helm install ai-fixer . -n ai-fixer -f my-values.yaml

6. Verify the deployment
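Before inspecting individual resources, it can help to block until the rollout has finished. The sketch below wraps `kubectl rollout status`. The function name `verify_release` and the 180s default timeout are illustrative assumptions, not part of the chart. The Deployment name `ai-fixer` is the one used by the commands in this guide.

```bash
#!/usr/bin/env bash
# verify_release NAMESPACE DEPLOYMENT [TIMEOUT]
# Waits for the Deployment rollout; on failure prints recent events and returns 1.
verify_release() {
  ns="$1"; deploy="$2"; timeout="${3:-180s}"
  # Blocks until all replicas are updated and available, or the timeout expires.
  if kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "rollout complete: $deploy"
  else
    echo "rollout failed or timed out: $deploy" >&2
    # Recent events usually explain image pull, scheduling, or config errors.
    kubectl get events -n "$ns" --sort-by=.lastTimestamp | tail -n 20
    return 1
  fi
}

# Example:
#   verify_release ai-fixer ai-fixer 180s
```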
bash
# Check Pod status
kubectl get pods -n ai-fixer
# Check the Service
kubectl get svc -n ai-fixer
# Follow the logs
kubectl logs -f deployment/ai-fixer -n ai-fixer

Detailed configuration
Full values.yaml
yaml
# Image
image:
  repository: hahtangtang/ai-fixer
  tag: latest
  pullPolicy: IfNotPresent
# Number of replicas
replicaCount: 1
# Service
service:
  type: ClusterIP
  port: 8080
# Ingress
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: ai-fixer.your-domain.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: ai-fixer-tls
      hosts:
        - ai-fixer.your-domain.com
# Environment variables
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: ai-fixer-secrets
        key: database-url
  - name: REDIS_URL
    valueFrom:
      secretKeyRef:
        name: ai-fixer-secrets
        key: redis-url
  - name: LLM_API_KEY
    valueFrom:
      secretKeyRef:
        name: ai-fixer-secrets
        key: llm-api-key
# Application settings
config:
  LLM_PROVIDER: "anthropic"
  LLM_MODEL: "claude-3-5-sonnet-20241022"
  LOG_LEVEL: "info"
# Resource requests and limits
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2000m
    memory: 2Gi
# Health checks
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
# Autoscaling
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 80
# RBAC
rbac:
  create: true
# ServiceAccount
serviceAccount:
  create: true
  name: ai-fixer

Database migration Job
The Helm chart includes a database migration Job that runs on first install and on upgrade:
yaml
# templates/migrate-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "ai-fixer.fullname" . }}-migrate
spec:
  template:
    spec:
      containers:
        - name: migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          command: ["alembic", "upgrade", "head"]
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: ai-fixer-secrets
                  key: database-url
      restartPolicy: OnFailure

Run the migration manually:
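bash
# The migration is a Job, not a CronJob, so `kubectl create job --from=cronjob/...` does not apply.
# Delete the finished Job, then re-render its template from the chart directory and apply it.
kubectl delete job ai-fixer-migrate -n ai-fixer --ignore-not-found
helm template ai-fixer . -n ai-fixer -f my-values.yaml \
  --show-only templates/migrate-job.yaml | kubectl apply -n ai-fixer -f -

Cleanup CronJob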
An hourly CronJob cleans up workflows that have timed out:
yaml
# templates/cleanup-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: {{ include "ai-fixer.fullname" . }}-cleanup
spec:
  schedule: "0 */1 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: cleanup
              image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
              command: ["python", "-m", "app.utils.cleanup"]
          # Job pods must set restartPolicy to OnFailure or Never
          restartPolicy: OnFailure

RBAC configuration
ServiceAccount
yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ai-fixer
  namespace: ai-fixer

ClusterRole
yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ai-fixer
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "endpoints", "events"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get"]

ClusterRoleBinding
yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ai-fixer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ai-fixer
subjects:
  - kind: ServiceAccount
    name: ai-fixer
    namespace: ai-fixer

Monitoring configuration
ServiceMonitor
yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ai-fixer
  namespace: ai-fixer
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ai-fixer
  endpoints:
    - port: http
      path: /metrics
      interval: 30s

Grafana Dashboard
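Import the bundled dashboard:
bash
# Print the dashboard JSON (path is relative to the repository root)
cat deploy/grafana/k8s-fixer-overview.json
# Import it in Grafana
# 1. Open Grafana
# 2. Dashboards → Import
# 3. Upload the JSON file
# 4. Select the Prometheus data source

Upgrade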
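One way to make upgrades safer is Helm's `--atomic` flag, which waits for the new revision to become ready and rolls back automatically if it does not. The following is a minimal sketch, assuming Helm 3 and that it is run from the chart directory. The wrapper name `safe_upgrade` and the 5 minute timeout are hypothetical choices.

```bash
#!/usr/bin/env bash
# safe_upgrade RELEASE NAMESPACE VALUES_FILE [extra helm flags...]
# Always passes the values file so earlier overrides are not dropped, and uses
# --atomic so a failed upgrade is rolled back to the previous revision.
safe_upgrade() {
  release="$1"; ns="$2"; values="$3"; shift 3
  helm upgrade "$release" . -n "$ns" -f "$values" --atomic --timeout 5m "$@"
}

# Example: upgrade the image tag while keeping everything in my-values.yaml
#   safe_upgrade ai-fixer ai-fixer my-values.yaml --set image.tag=v1.1.0
```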
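Update the configuration
bash
helm upgrade ai-fixer . -n ai-fixer -f my-values.yaml

Update the image
bash
# Pass -f again: with --set alone, Helm resets all other values to the chart defaults
helm upgrade ai-fixer . -n ai-fixer -f my-values.yaml \
  --set image.tag=v1.1.0

Run database migrations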
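Database migrations may need to be run after an upgrade:
bash
# The migration is a Job, not a CronJob, so it cannot be started with `--from=cronjob/...`.
# Delete the previous Job, then re-render its template from the chart directory and apply it.
kubectl delete job ai-fixer-migrate -n ai-fixer --ignore-not-found
helm template ai-fixer . -n ai-fixer -f my-values.yaml \
  --show-only templates/migrate-job.yaml | kubectl apply -n ai-fixer -f -

Rollback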
bash
# List the release history
helm history ai-fixer -n ai-fixer
# Roll back to a specific revision
helm rollback ai-fixer 1 -n ai-fixer

Uninstall
bash
helm uninstall ai-fixer -n ai-fixer
kubectl delete namespace ai-fixer

Troubleshooting
Pod fails to start
bash
# Describe the Pod
kubectl describe pod -l app.kubernetes.io/name=ai-fixer -n ai-fixer
# Show the logs
kubectl logs -l app.kubernetes.io/name=ai-fixer -n ai-fixer
# List events, oldest first
kubectl get events -n ai-fixer --sort-by='.lastTimestamp'

Database connection failure
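When the application cannot reach PostgreSQL or Redis, a quick first check is whether the host and port in the connection URL are reachable from inside the pod. The sketch below is an illustration under stated assumptions: the helper names `parse_host_port` and `check_tcp` are hypothetical, the URL parsing assumes the password contains no `@` or `/`, and the container is assumed to ship `python`, as the commands in this section already rely on.

```bash
#!/usr/bin/env bash
# parse_host_port URL: prints "HOST PORT" for URLs shaped like
# scheme://[user:password@]host:port/path
parse_host_port() {
  printf '%s\n' "$1" | sed -E 's#^[^:]+://([^@/]*@)?([^:/]+)(:([0-9]+))?.*#\2 \4#'
}

# check_tcp NAMESPACE DEPLOYMENT HOST PORT: opens a TCP connection from inside
# the application container, so cluster DNS and network policies are exercised.
check_tcp() {
  kubectl exec "deployment/$2" -n "$1" -- python -c \
    "import socket; socket.create_connection(('$3', $4), timeout=3); print('tcp ok')"
}

# Example: read the URL from the Secret, then test the connection
#   url="$(kubectl get secret ai-fixer-secrets -n ai-fixer \
#           -o jsonpath='{.data.database-url}' | base64 --decode)"
#   set -- $(parse_host_port "$url")
#   check_tcp ai-fixer ai-fixer "$1" "$2"
```

A successful TCP connection only shows that the network path is open. Wrong credentials or a missing database still surface in the application logs.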
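bash
# Inspect the Secret (values are base64-encoded)
kubectl get secret ai-fixer-secrets -n ai-fixer -o yaml
# Print the DATABASE_URL the application actually loaded (the output includes the password)
kubectl exec -it deployment/ai-fixer -n ai-fixer -- python -c "
from app.config.settings import settings
print(settings.DATABASE_URL)
"

Out of memory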
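bash
# Show current resource usage (requires metrics-server in the cluster)
kubectl top pods -n ai-fixer
# Raise the memory limit; pass -f again so the other values in my-values.yaml are kept
helm upgrade ai-fixer . -n ai-fixer -f my-values.yaml \
  --set resources.limits.memory=4Gi

Best practices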
1. Isolate the deployment in its own namespace
bash
kubectl create namespace ai-fixer

2. Set resource requests and limits
yaml
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2000m
    memory: 2Gi

3. Enable autoscaling
yaml
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 5

4. Configure health checks
yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080

5. Store sensitive values in a Secret
bash
kubectl create secret generic ai-fixer-secrets \
  --from-literal=database-url='postgresql+asyncpg://...' \
  --from-literal=redis-url='redis://...' \
  --from-literal=llm-api-key='sk-...' \
  -n ai-fixer
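
The values in this guide read three keys from the Secret (`database-url`, `redis-url`, `llm-api-key`), and a missing or misspelled key typically leaves the pod stuck in `CreateContainerConfigError`. The sketch below checks for that before deploying. The function name `missing_secret_keys` is hypothetical and not part of the project.

```bash
#!/usr/bin/env bash
# missing_secret_keys NAMESPACE SECRET KEY...
# Prints every KEY that is absent from (or empty in) the Secret; prints nothing if all exist.
missing_secret_keys() {
  ns="$1"; secret="$2"; shift 2
  for key in "$@"; do
    # jsonpath yields an empty string when the key does not exist.
    value="$(kubectl get secret "$secret" -n "$ns" -o "jsonpath={.data.$key}" 2>/dev/null || true)"
    [ -n "$value" ] || echo "$key"
  done
}

# Example:
#   missing_secret_keys ai-fixer ai-fixer-secrets database-url redis-url llm-api-key
```

Only the presence of each key is checked, so the decoded values never reach the terminal.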