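# CI/CD Integration Plan (Final Deployed Version)

## Overview

This document records the CI/CD setup for the Douyin comment management system: **Drone CI + private Docker Registry + automated Docker Compose deployment**.

Goals:

- Pull the latest `main` and deploy automatically every day at 00:00 Beijing time
- Build and deploy automatically when a tag (e.g. `v1.9.0210.7`) is created
- Do not trigger on every `push`
- Production keeps using `docker-compose.prod.yml`

---
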
## Infrastructure

| Role | Location | Description |
|------|----------|-------------|
| Gitea server | Tencent Cloud `git.internal.intelligrow.cn` | Code repository |
| Drone CI server | Ubuntu x64 `192.168.31.107` | Runs build jobs and hosts the private Docker Registry |
| Production server | Ubuntu x64 `192.168.31.48` | Runs the production Docker Compose services; SSH port 3141 |

---

## Architecture Flow

```text
Cron trigger or tag creation
  |
  v
Drone clones the repository
  |
  +-- Build backend image and push to Registry
  +-- Build frontend image and push to Registry
  v
SSH to the production server and run deploy-remote.sh
  |
  +-- pull new images
  +-- stop celery_beat
  +-- update backend + celery_worker
  +-- health check (/health inside the container)
  +-- alembic upgrade head (abort on failure)
  +-- start celery_beat + frontend
  +-- final health check
  v
Send WeCom notification (success/failure)
```

---

## File Structure

```text
.drone.yml                          # Drone CI pipeline
scripts/
  +-- deploy.sh                     # existing manual deployment script
  +-- deploy-remote.sh              # automated deployment script invoked remotely by Drone
  +-- server-setup.sh               # existing server initialization script
docker-compose.prod.yml             # production compose configuration
docs/
  +-- cicd_integration_updated.md   # this document
```

---

## Phase 1: Drone CI Server Preparation (One-Time)

### 1.1 Drone CI Service (docker-compose.yml)

On the Drone CI server, in `~/drone/docker-compose.yml`:

```yaml
services:
  drone-server:
    image: drone/drone:2
    container_name: drone-server
    restart: always
    ports:
      - "3080:80"
    environment:
      - DRONE_GITEA_SERVER=https://git.internal.intelligrow.cn
      - DRONE_GITEA_CLIENT_ID=<your-client-id>
      - DRONE_GITEA_CLIENT_SECRET=<your-client-secret>
      - DRONE_SERVER_HOST=drone.internal.intelligrow.cn
      - DRONE_SERVER_PROTO=https
      - DRONE_RPC_SECRET=<your-rpc-secret>
      - DRONE_USER_CREATE=username:<your-gitea-username>,admin:true
    volumes:
      - ./data:/data

  drone-runner:
    image: drone/drone-runner-docker:1
    container_name: drone-runner
    restart: always
    depends_on:
      - drone-server
    environment:
      - DRONE_RPC_PROTO=http
      - DRONE_RPC_HOST=drone-server
      - DRONE_RPC_SECRET=<your-rpc-secret>
      - DRONE_RUNNER_CAPACITY=2
      - DRONE_RUNNER_NAME=drone-runner-1
      - DRONE_RUNNER_PRIVILEGED_IMAGES=plugins/docker
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

Key configuration notes:

- **`DRONE_RPC_PROTO=http`**: the Runner connects directly to port 80 of the drone-server container over the Docker network, without HTTPS. HTTPS is only used for external access through the reverse proxy at `drone.internal.intelligrow.cn`.
- **`DRONE_USER_CREATE`**: `username` must exactly match the Gitea login name (not the email address); otherwise admin privileges do not take effect.
- **`DRONE_RUNNER_PRIVILEGED_IMAGES=plugins/docker`**: allows plugins/docker to run in privileged mode (the final setup does not use plugins/docker, but the setting is kept for later use).

### 1.2 Start the Private Registry

Run on the Drone CI server:

```bash
docker run -d --name registry \
  -p 5000:5000 \
  -v /opt/registry-data:/var/lib/registry \
  --restart always \
  registry:2
```

Verify:

```bash
curl http://localhost:5000/v2/_catalog
# Expected output: {"repositories":[]}
```

### 1.3 Configure the Insecure Registry

**Drone CI server** `/etc/docker/daemon.json`:

```json
{
  "registry-mirrors": [
    "https://docker.1panel.live",
    "https://docker.1panel.dev",
    "https://docker.1ms.run"
  ],
  "insecure-registries": [
    "docker.internal.intelligrow.cn:5000",
    "192.168.31.107:5000"
  ]
}
```

**Production server** `/etc/docker/daemon.json`:

```json
{
  "registry-mirrors": [
    "https://docker.1panel.live",
    "https://docker.1panel.dev",
    "https://docker.1ms.run"
  ],
  "insecure-registries": ["docker.internal.intelligrow.cn:5000"]
}
```

Restart Docker after editing:

```bash
sudo systemctl restart docker
```

> Note: entries in `insecure-registries` must not carry an `http://` prefix; write them as plain `host:port`.

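The `host:port` rule can be checked before restarting Docker. The following is a minimal sketch: the function name `is_valid_insecure_registry` is hypothetical, and the check only rejects empty entries and entries that contain a scheme such as `http://`. It does not validate anything else about the entry.

```bash
#!/usr/bin/env bash
# Hypothetical helper: returns 0 for an entry without a scheme prefix,
# and 1 for an empty entry or one that contains "://".
is_valid_insecure_registry() {
  case "$1" in
    ""|*://*) return 1 ;;
    *) return 0 ;;
  esac
}

for entry in "docker.internal.intelligrow.cn:5000" "http://docker.internal.intelligrow.cn:5000"; do
  if is_valid_insecure_registry "$entry"; then
    echo "ok:  $entry"
  else
    echo "bad: $entry"
  fi
done
```

The loop reports the first entry as acceptable and the second as malformed because of its scheme prefix.
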
### 1.4 Configure Passwordless SSH

On the Drone CI server, generate a key pair and install the public key on the production server:

```bash
ssh-keygen -t ed25519 -C "drone-ci" -f ~/.ssh/drone_deploy -N ""
ssh-copy-id -i ~/.ssh/drone_deploy.pub -p 3141 miaosi@192.168.31.48
```

Verify:

```bash
ssh -i ~/.ssh/drone_deploy -p 3141 miaosi@192.168.31.48 "echo ok"
```

---

## Phase 2: Drone Repository Settings

### 2.1 Enable Trusted Mode

In the Drone panel, go to the repository's Settings → General → Project Settings and check **Trusted**.

> This requires admin privileges. If the Trusted option is not visible, check that the username in `DRONE_USER_CREATE` matches the Gitea username.

### 2.2 Configure Secrets

In the Drone panel (repository Settings → Secrets), add:

| Secret name | Purpose | Example value |
|-------------|---------|---------------|
| `backend_repo` | Full repository address of the backend image | `docker.internal.intelligrow.cn:5000/douyin-backend` |
| `frontend_repo` | Full repository address of the frontend image | `docker.internal.intelligrow.cn:5000/douyin-frontend` |
| `deploy_host` | Production server IP | `192.168.31.48` |
| `deploy_user` | SSH user | `miaosi` |
| `deploy_ssh_key` | SSH private key content | `-----BEGIN OPENSSH PRIVATE KEY-----...-----END OPENSSH PRIVATE KEY-----` |
| `deploy_path` | Deployment directory on the production server | `/opt/docker/douyin_comments_management` |
| `wecom_webhook` | WeCom webhook (optional) | `https://qyapi.weixin.qq.com/...` |

> Note: the Registry address in `backend_repo` and `frontend_repo` must use the same host name as `DOCKER_REGISTRY` in the production server's `.env` (both the domain name, or both the IP). Otherwise Docker treats them as different images.

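The host-name match described in the note can be checked with plain string handling. This is a sketch under assumptions: `registry_of` and `same_registry` are hypothetical helper names, and the two sample values are copied from the example column above and from the `.env` example rather than read from Drone or the server.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the registry part (everything before the
# first "/") of an image repository such as host:port/name.
registry_of() {
  printf '%s\n' "${1%%/*}"
}

# Returns 0 when the repository's registry equals the expected value.
same_registry() {
  [ "$(registry_of "$1")" = "$2" ]
}

backend_repo="docker.internal.intelligrow.cn:5000/douyin-backend"  # Drone secret value
docker_registry="docker.internal.intelligrow.cn:5000"              # DOCKER_REGISTRY in .env

if same_registry "$backend_repo" "$docker_registry"; then
  echo "registry matches"
else
  echo "registry mismatch: $(registry_of "$backend_repo") vs $docker_registry"
fi
```

A repository written with the IP form (`192.168.31.107:5000/...`) would be reported as a mismatch against the domain form, which is the situation the note warns about.
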
### 2.3 Configure the Cron Job

In the Drone panel (repository Settings → Cron Jobs), add:

| Field | Value |
|-------|-------|
| Name | `nightly-build` |
| Branch | `main` |
| Schedule | `0 16 * * *` |

Notes:

- Drone interprets cron schedules in UTC by default
- `0 16 * * *` = 16:00 UTC = 00:00 Beijing time on the following day

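The time zone arithmetic behind the schedule can be written out as a small helper. This is an illustrative sketch only: `beijing_hour_to_utc` is a hypothetical name, and it assumes Beijing time is a fixed UTC+8 offset.

```bash
#!/usr/bin/env bash
# Hypothetical helper: converts an hour of day in Beijing time (UTC+8)
# to the corresponding hour of day in UTC.
beijing_hour_to_utc() {
  echo $(( ($1 - 8 + 24) % 24 ))
}

beijing_hour_to_utc 0   # prints 16
```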
---

## Phase 3: Configuration Files (Final Deployed Version)

### 3.1 `.drone.yml`

```yaml
kind: pipeline
type: docker
name: build-and-deploy

trigger:
  event:
    - tag
    - cron

volumes:
  - name: dockersock
    host:
      path: /var/run/docker.sock

steps:
  - name: build-backend
    image: docker:27-cli
    volumes:
      - name: dockersock
        path: /var/run/docker.sock
    environment:
      BACKEND_REPO:
        from_secret: backend_repo
    commands:
      - '[ -n "$BACKEND_REPO" ] || (echo "backend_repo secret is empty" && exit 1)'
      - echo "Building backend image tag:${DRONE_TAG:-latest}"
      - docker build -t "$BACKEND_REPO:${DRONE_TAG:-latest}" -t "$BACKEND_REPO:latest" ./backend
      - docker push "$BACKEND_REPO:${DRONE_TAG:-latest}"
      - docker push "$BACKEND_REPO:latest"

  - name: build-frontend
    image: docker:27-cli
    volumes:
      - name: dockersock
        path: /var/run/docker.sock
    environment:
      FRONTEND_REPO:
        from_secret: frontend_repo
    commands:
      - '[ -n "$FRONTEND_REPO" ] || (echo "frontend_repo secret is empty" && exit 1)'
      - echo "Building frontend image tag:${DRONE_TAG:-latest}"
      - docker build -t "$FRONTEND_REPO:${DRONE_TAG:-latest}" -t "$FRONTEND_REPO:latest" ./frontend
      - docker push "$FRONTEND_REPO:${DRONE_TAG:-latest}"
      - docker push "$FRONTEND_REPO:latest"

  - name: deploy
    image: appleboy/drone-ssh
    environment:
      DEPLOY_PATH:
        from_secret: deploy_path
    settings:
      host:
        from_secret: deploy_host
      username:
        from_secret: deploy_user
      key:
        from_secret: deploy_ssh_key
      port: 3141
      command_timeout: 1800s
      script_stop: true
      envs:
        - DRONE_TAG
        - DEPLOY_PATH
      script:
        - IMAGE_TAG="$DRONE_TAG"; [ -n "$IMAGE_TAG" ] || IMAGE_TAG="latest"
        - cd "$DEPLOY_PATH"
        - bash scripts/deploy-remote.sh "$IMAGE_TAG"

  - name: notify-success
    image: curlimages/curl
    environment:
      WECOM_WEBHOOK:
        from_secret: wecom_webhook
    commands:
      # Shell variables are written unbraced ($VAR) so that Drone's ${...}
      # pre-substitution does not blank them (see pitfall 5).
      - |
        if [ -n "$WECOM_WEBHOOK" ]; then
          VERSION="$DRONE_TAG"
          [ -n "$VERSION" ] || VERSION="nightly-$(date +%Y%m%d)"
          curl -sS -X POST "$WECOM_WEBHOOK" \
            -H "Content-Type: application/json" \
            -d "{\"msgtype\":\"text\",\"text\":{\"content\":\"✅ Deployment succeeded\\nVersion: $VERSION\\nRepo: ${DRONE_REPO}\\nTime: $(date '+%Y-%m-%d %H:%M:%S')\"}}"
        fi
    when:
      status:
        - success

  - name: notify-failure
    image: curlimages/curl
    environment:
      WECOM_WEBHOOK:
        from_secret: wecom_webhook
    commands:
      # Same convention as notify-success: unbraced shell variables.
      - |
        if [ -n "$WECOM_WEBHOOK" ]; then
          VERSION="$DRONE_TAG"
          [ -n "$VERSION" ] || VERSION="nightly-$(date +%Y%m%d)"
          curl -sS -X POST "$WECOM_WEBHOOK" \
            -H "Content-Type: application/json" \
            -d "{\"msgtype\":\"text\",\"text\":{\"content\":\"❌ Deployment failed\\nVersion: $VERSION\\nRepo: ${DRONE_REPO}\\nBuild: ${DRONE_BUILD_LINK}\"}}"
        fi
    when:
      status:
        - failure
```

### 3.2 `scripts/deploy-remote.sh`

```bash
#!/usr/bin/env bash
# ========================================
# Remote deployment script, invoked by Drone CI over SSH
# ========================================
# Usage: bash scripts/deploy-remote.sh [image_tag]

set -euo pipefail

GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
NC='\033[0m'

log_info() { echo -e "${GREEN}[INFO]${NC} $(date '+%Y-%m-%d %H:%M:%S') $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $(date '+%Y-%m-%d %H:%M:%S') $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $(date '+%Y-%m-%d %H:%M:%S') $1"; }

DEPLOY_PATH="${DEPLOY_PATH:-/opt/docker/douyin_comments_management}"
COMPOSE_FILE="${COMPOSE_FILE:-docker-compose.prod.yml}"
IMAGE_TAG="${1:-latest}"
LOCK_FILE="/tmp/douyin-deploy.lock"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-30}"

cleanup_lock() {
  rm -f "$LOCK_FILE"
}

# Refuse to run if another deployment holds the lock; remove stale locks.
acquire_lock() {
  if [ -f "$LOCK_FILE" ]; then
    local old_pid
    old_pid=$(cat "$LOCK_FILE" 2>/dev/null || true)
    if [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null; then
      log_error "Another deployment is already running (PID: $old_pid)"
      exit 1
    fi
    rm -f "$LOCK_FILE"
  fi

  echo "$$" > "$LOCK_FILE"
  trap cleanup_lock EXIT
}

compose() {
  docker compose -f "$COMPOSE_FILE" "$@"
}

# Poll /health from inside the backend container, every 3 seconds,
# up to MAX_ATTEMPTS times.
wait_backend_healthy() {
  local attempt=1

  while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
    if compose exec -T backend python -c "import sys,urllib.request;urllib.request.urlopen('http://127.0.0.1:8000/health', timeout=3);sys.exit(0)" >/dev/null 2>&1; then
      log_info "Backend health check passed"
      return 0
    fi

    log_info "Waiting for backend to become ready... (${attempt}/${MAX_ATTEMPTS})"
    sleep 3
    attempt=$((attempt + 1))
  done

  return 1
}

main() {
  acquire_lock

  [ -d "$DEPLOY_PATH" ] || { log_error "Deployment directory does not exist: $DEPLOY_PATH"; exit 1; }
  cd "$DEPLOY_PATH"

  [ -f "$COMPOSE_FILE" ] || { log_error "Compose file does not exist: $COMPOSE_FILE"; exit 1; }

  export IMAGE_TAG
  export VERSION="$IMAGE_TAG"

  log_info "Starting deployment, image version: $IMAGE_TAG"

  log_info "Pulling latest images..."
  compose pull backend celery_worker celery_beat frontend

  log_info "Stopping celery_beat..."
  compose stop celery_beat || true

  log_info "Updating backend and celery_worker..."
  compose up -d --no-deps backend celery_worker

  if ! wait_backend_healthy; then
    log_error "Backend did not become ready in time"
    compose logs --tail=200 backend || true
    exit 1
  fi

  # With set -e, a failed migration aborts the deployment here.
  log_info "Running database migrations..."
  compose exec -T backend alembic upgrade head

  log_info "Starting celery_beat..."
  compose up -d --no-deps celery_beat

  log_info "Updating frontend..."
  compose up -d --no-deps frontend

  log_info "Removing orphan containers..."
  compose up -d --remove-orphans

  if ! wait_backend_healthy; then
    log_error "Health check failed after deployment"
    compose logs --tail=200 backend || true
    exit 1
  fi

  docker image prune -f >/dev/null 2>&1 || log_warn "Image cleanup failed, skipped"

  log_info "Deployment complete! Version: $IMAGE_TAG"
  compose ps
}

main "$@"
```

> Note: the script exports both `IMAGE_TAG` and `VERSION` because the production `docker-compose.prod.yml` uses `${VERSION:-latest}` as the image tag variable.

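The polling pattern used by `wait_backend_healthy` can be sketched in a generic form that takes the probe as a command. The name `retry_until_ok` and its argument order are hypothetical and not part of the deployed script; the probe used below is a placeholder, where the real one would be the in-container health request.

```bash
#!/usr/bin/env bash
# Hypothetical generic form of the polling loop: run a probe command until it
# succeeds, up to max_attempts times, sleeping delay seconds between failures.
retry_until_ok() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    [ "$attempt" -lt "$max_attempts" ] && sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}

# Placeholder probe that always succeeds.
retry_until_ok 3 0 true && echo "probe succeeded"
```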
---

## Production Server `.env` Configuration

Make sure `/opt/docker/douyin_comments_management/.env` contains at least:

```bash
DOCKER_REGISTRY=docker.internal.intelligrow.cn:5000

POSTGRES_PASSWORD=xxx
REDIS_PASSWORD=xxx
RABBITMQ_PASSWORD=xxx
# Keep the existing values for all other production settings
```

> The value of `DOCKER_REGISTRY` must match the Registry host name used in the Drone secrets `backend_repo` / `frontend_repo`.

---

## Rollback

### Option 1: Push a tag pointing at the old version to trigger a redeploy

```bash
git tag v1.8.1-rollback v1.8.1
git push origin v1.8.1-rollback
```

### Option 2: Manual rollback on the production server

```bash
ssh -p 3141 miaosi@192.168.31.48
cd /opt/docker/douyin_comments_management
bash scripts/deploy-remote.sh v1.8.1
```

### Option 3: Manual rollback with compose

```bash
cd /opt/docker/douyin_comments_management
export VERSION=v1.8.1
docker compose -f docker-compose.prod.yml pull backend celery_worker celery_beat frontend
docker compose -f docker-compose.prod.yml up -d --no-deps backend celery_worker celery_beat frontend
```

---

## Verification

### Verify the Registry

```bash
# On the Drone CI server
curl http://localhost:5000/v2/_catalog
curl http://localhost:5000/v2/douyin-backend/tags/list
curl http://localhost:5000/v2/douyin-frontend/tags/list
```

### Verify Build and Deployment Triggers

```bash
# Option 1: trigger by tag
git tag v1.9.0210.8
git push origin v1.9.0210.8

# Option 2: trigger the cron job manually in the Drone panel, or click NEW BUILD
```

### Verify the Production Services

```bash
# Check container status
ssh -p 3141 miaosi@192.168.31.48 "cd /opt/docker/douyin_comments_management && docker compose -f docker-compose.prod.yml ps"

# Backend health check (inside the container)
ssh -p 3141 miaosi@192.168.31.48 "cd /opt/docker/douyin_comments_management && docker compose -f docker-compose.prod.yml exec -T backend python -c \"import urllib.request;urllib.request.urlopen('http://127.0.0.1:8000/health', timeout=3);print('ok')\""

# Database migration status
ssh -p 3141 miaosi@192.168.31.48 "cd /opt/docker/douyin_comments_management && docker compose -f docker-compose.prod.yml exec -T backend alembic current"
```

---

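## Troubleshooting

| Problem | How to investigate |
|---------|--------------------|
| YAML parse error | Check the `.drone.yml` syntax; Drone is sensitive to the format of `volumes` / `environment` |
| Build fails | Read the pipeline logs in the Drone panel |
| Image push fails (HTTPS) | Confirm `insecure-registries` is configured correctly on both servers (no `http://` prefix) |
| Secret is empty | Use `environment: { VAR: { from_secret: name } }` instead of the `secrets` field |
| Drone variable substitution conflict | `${DRONE_TAG}` is a Drone variable and can be used directly; escape custom shell variables as `$$VAR` |
| Tag does not trigger a build | Check that the Gitea webhook has the "Create" event enabled; do not add `cron: [name]` to the trigger in `.drone.yml` |
| Step is pending | Check that the Runner can reach the Server and that the repository has Trusted enabled |
| DinD fails to start | Switch to `docker:27-cli` with the host Docker socket mounted (the approach used here) |
| deploy cannot find the script | Confirm `deploy-remote.sh` has been copied to `scripts/` under the deployment directory on the production server |
| Insufficient Docker permissions | On the production server run `sudo usermod -aG docker <user>`, then log in again |
| Image name: invalid reference | Check that `DOCKER_REGISTRY` is set correctly in the production server's `.env` |
| Database migration fails | `docker compose -f docker-compose.prod.yml logs -f backend` |
| Cron does not fire | Verify the Drone cron name, that the branch is `main`, and that the schedule is correct |

---
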
## Pitfalls Encountered

The following problems came up during the actual rollout, with their resolutions, for future reference:

### 1. Drone YAML parse error (`cannot unmarshal !!map into string`)

**Cause**: the Drone Docker pipeline has specific requirements for the `from_secret` syntax under `environment` and for the `volumes` format. An early version that used `volumes` together with `environment: from_secret` triggered the parse error.

**Fix**: once Trusted mode is enabled for the repository, `volumes` and `environment: from_secret` can coexist.

### 2. Pushing a tag does not trigger a build

**Cause**: `.drone.yml` configured both `event: [tag, cron]` and `cron: [nightly-build]`. Drone combines trigger conditions with AND, and a tag event can never satisfy the cron condition, so the build never fired.

**Fix**: remove the `cron: [nightly-build]` filter and keep only `event: [tag, cron]`.

### 3. plugins/docker DinD fails to start

**Cause**: the `plugins/docker` plugin starts its own Docker daemon (Docker-in-Docker), which may fail to start because of cgroup or storage driver compatibility problems.

**Fix**: drop `plugins/docker` and build images with the `docker:27-cli` image plus the host Docker socket mounted.

### 4. plugins/docker requires semver-formatted tags

**Cause**: `auto_tag: true` requires the Git tag to follow semantic versioning (e.g. `v1.0.0`); non-standard formats (e.g. `v1.9.0210.1`) fail to parse.

**Fix**: remove `auto_tag` and set the image tag manually with `${DRONE_TAG:-latest}`.

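The difference between the two tag shapes can be illustrated with a pattern check. This is a simplified sketch, not the plugin's actual parser: `is_three_part_version` is a hypothetical name, and the pattern only accepts an optional `v` followed by three dot-separated numbers, ignoring pre-release and build-metadata suffixes.

```bash
#!/usr/bin/env bash
# Hypothetical check: returns 0 if a Git tag looks like vMAJOR.MINOR.PATCH
# (leading "v" optional), and 1 otherwise.
is_three_part_version() {
  printf '%s\n' "$1" | grep -Eq '^v?[0-9]+\.[0-9]+\.[0-9]+$'
}

for tag in v1.0.0 v1.9.0210.1; do
  if is_three_part_version "$tag"; then
    echo "$tag: three-part version"
  else
    echo "$tag: not a three-part version"
  fi
done
```

The four-part tag used by this project falls into the second branch, which is why the pipeline sets the image tag itself.
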
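### 5. Drone variable substitution conflicts with shell variables

**Cause**: before execution, Drone performs its own substitution on `${VAR}` syntax. In a custom shell variable (e.g. `TAG="xxx"; echo ${TAG}`), `${TAG}` is replaced by Drone with an empty string.

**Fix**: use the built-in Drone variable `${DRONE_TAG:-latest}` directly and avoid intermediate shell variables.
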
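Braced references that are at risk can be listed with a small lint. This is a sketch under assumptions: `find_unescaped_shell_vars` is a hypothetical helper, it treats every braced name that does not start with `DRONE_` as a candidate, and it skips references escaped as `$${...}`. The sample file is generated on the fly for demonstration.

```bash
#!/usr/bin/env bash
# Hypothetical lint: prints braced variable names in a file that do not start
# with DRONE_, i.e. candidates for being blanked by Drone's pre-substitution.
find_unescaped_shell_vars() {
  # Drop escaped "$${" occurrences first, then extract "${NAME" tokens.
  sed 's/\$\$[{]//g' "$1" \
    | grep -oE '\$[{][A-Za-z_][A-Za-z0-9_]*' \
    | sed 's/^\$[{]//' \
    | grep -v '^DRONE_' \
    | sort -u
}

sample="$(mktemp)"
printf '%s\n' 'echo "${DRONE_TAG:-latest}"' 'echo "${VERSION}"' 'echo "$${ESCAPED}"' > "$sample"
find_unescaped_shell_vars "$sample"   # prints VERSION
rm -f "$sample"
```

Running the same function against a real `.drone.yml` gives a list of names to review by hand.
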
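### 6. The `secrets` field does not inject environment variables

**Cause**: in some cases the step-level `secrets:` field in Drone does not inject the secret as an environment variable.

**Fix**: declare it explicitly with `environment: { VAR: { from_secret: name } }`.

### 7. Runner fails to connect to the Server

**Cause**: the Runner was configured with `DRONE_RPC_PROTO=https`, but the drone-server container only listens on port 80 internally (HTTPS is terminated by the external reverse proxy).

**Fix**: set `DRONE_RPC_PROTO=http` and `DRONE_RPC_HOST=drone-server` so the Runner connects directly to the drone-server container over the Docker network, as in section 1.1. HTTPS is only used for external access through the reverse proxy at `drone.internal.intelligrow.cn`.

### 8. Admin privileges do not take effect (Trusted option not visible)

**Cause**: the username in `DRONE_USER_CREATE=username:zhanghuayu@intelligrow.ai,admin:true` was an email address instead of the Gitea login name.

**Fix**: change it to `username:zhanghuayu,admin:true` (exactly matching the Gitea username).

### 9. Invalid reference for image names on the production server

**Cause**: `docker-compose.prod.yml` uses the `${DOCKER_REGISTRY}` variable, but it was not set in `.env`, or the variable name did not match (it had been set as `REGISTRY_HOST` by mistake).

**Fix**: add `DOCKER_REGISTRY=docker.internal.intelligrow.cn:5000` to `.env` and make sure the variable name matches the compose file.

### 10. Wrong insecure-registries format

**Cause**: `insecure-registries` in `/etc/docker/daemon.json` was set to `http://docker.internal.intelligrow.cn:5000` (with a protocol prefix), which Docker does not recognize.

**Fix**: remove the `http://` prefix and write `docker.internal.intelligrow.cn:5000`.

---
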
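## Notes

1. **Registry has no authentication**: recommended for internal networks only; on a public network add TLS and authentication (or use Harbor).
2. **Docker socket mount**: the build steps mount the host Docker socket, which requires Trusted mode on the repository.
3. **Concurrent deployment protection**: `deploy-remote.sh` uses `/tmp/douyin-deploy.lock` to prevent concurrent deployments.
4. **Migration failure aborts the deployment**: a failed `alembic upgrade head` fails the whole deployment, which prevents a false success.
5. **Health check does not depend on curl**: it requests `/health` with Python inside the container, which matches what the current image provides.
6. **deploy-remote.sh must be synced manually**: the script lives on the production server; after it changes in the repository, copy it over manually or sync it through the deployment process.
7. **VERSION and IMAGE_TAG**: `deploy-remote.sh` exports both variables to stay compatible with the naming used by different compose files.
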
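The lock in `deploy-remote.sh` checks for the file and then writes it, which leaves a short window between the two steps. One possible tightening, shown here as a sketch and not as the deployed behavior, is to create the file atomically with the shell's `noclobber` option. The name `try_lock` is hypothetical, and stale-lock cleanup is left out for brevity.

```bash
#!/usr/bin/env bash
# Hypothetical variant of the lock: create the lock file atomically with
# noclobber so two processes cannot both pass a check-then-write window.
try_lock() {
  local lock_file="$1"
  # With noclobber set, ">" fails if the file already exists.
  if ( set -o noclobber; echo "$$" > "$lock_file" ) 2>/dev/null; then
    return 0
  fi
  return 1
}

lock="$(mktemp -u)"   # unused temporary path, for demonstration only
try_lock "$lock" && echo "first acquire: ok"
try_lock "$lock" || echo "second acquire: refused"
rm -f "$lock"
```

The second call is refused because the file already exists; the script's existing PID check could still be applied to the file's contents to detect stale locks.
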