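Your Name 9cdb99505c Add TDD implementation assessment and plan document

featuredoc/tdd_plan.md (V1.0):
- Project status diagnosis: starting from zero code, well suited to TDD
- Test pyramid: 75% unit + 20% integration + 5% E2E
- Backend test strategy: pytest + TestContainers + table-driven tests
- Frontend test strategy: Vitest + Testing Library + MSW + Playwright
- AI model test strategy: labeled-set validation + threshold gates + regression tests
- 11-week implementation roadmap
- Coverage targets: backend 80%, frontend 70%, AI modules 70%
- Toolchain configuration and CI/CD integration
- Team conventions and training plan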

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-02 16:58:00 +08:00


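TDD Implementation Assessment and Plan

Document type	Testing Strategy (test-driven development implementation plan)
Project name	SmartAudit (AI compliance review platform for marketing content)
Version	V1.0
Release date	2026-02-02
Related documents	tasks.md, DevelopmentPlan.md, FeatureSummary.md

Version History

Version	Date	Author	Changes
V1.0	2026-02-02	Claude	Initial draft: project diagnosis, TDD feasibility assessment, implementation plan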

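1. Project Status Diagnosis

1.1 Codebase Status

Dimension	Current state	Assessment
Source code	None; requirements phase only	✅ Ideal entry point for TDD
Documentation completeness	5,796 lines covering PRD/RD/technical architecture/UI	✅ Requirements are clear
Technology choices	Decided: FastAPI + Next.js	✅ Mature testing ecosystem
Task breakdown	77 development tasks with clear priorities	✅ Granularity suits TDD
Acceptance criteria	Every feature has quantified metrics	✅ Can be turned directly into test cases
CI/CD	Planned, not yet implemented	⚠️ Must be set up in parallel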

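1.2 Testing Ecosystem of the Tech Stack

Technology	Test framework support	Ecosystem maturity	Mock/stub support
FastAPI	pytest + httpx	⭐⭐⭐⭐⭐	Built-in TestClient
Celery	pytest-celery	⭐⭐⭐⭐	Eager mode supported
PostgreSQL	TestContainers	⭐⭐⭐⭐⭐	Containerized isolation
Next.js/React	Vitest + RTL	⭐⭐⭐⭐⭐	MSW interception
Zustand	Native test support	⭐⭐⭐⭐	No special handling needed
Socket.io	jest-socket.io-mock	⭐⭐⭐	Manual mocking required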

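1.3 Project Complexity Analysis

Complexity heatmap:

┌────────────────────────────────────────────────────────────────────────┐
│ Module                         │ Business complexity │ Test difficulty │
├────────────────────────────────────────────────────────────────────────┤
│ Auth and permissions (RBAC)    │        ██░░░        │      ██░░░      │
│ Brief parsing (LLM)            │        ████░        │      ████░      │
│ Rule engine                    │        ███░░        │      ██░░░      │
│ Video upload (Tus)             │        ██░░░        │      ███░░      │
│ ASR/OCR/CV pipeline            │        █████        │      █████      │  ← highest
│ Multimodal timestamp alignment │        █████        │      █████      │  ← highest
│ WebSocket progress push        │        ██░░░        │      ███░░      │
│ Review decision workflow       │        ███░░        │      ██░░░      │
│ Data dashboard                 │        ██░░░        │      ██░░░      │
│ Mobile H5                      │        ██░░░        │      ███░░      │
└────────────────────────────────────────────────────────────────────────┘

Diagnosis:

✅ Green-light project: starting from zero code is the ideal time to adopt TDD
✅ The tech stack's testing ecosystem is mature, with no obvious blockers
⚠️ The AI pipeline (ASR/OCR/CV) needs a dedicated testing strategy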

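2. TDD Feasibility Assessment

2.1 Overall Score

Dimension	Score	Notes
Requirement clarity	⭐⭐⭐⭐⭐	Detailed PRD/RD, complete user stories
Feature granularity	⭐⭐⭐⭐⭐	77 tasks with clear boundaries
Technical testability	⭐⭐⭐⭐	Mainstream frameworks, mature ecosystem
Team size	⭐⭐⭐⭐	Lean team of 8, efficient communication
Schedule slack	⭐⭐⭐⭐	11-week schedule, not extremely compressed
Quantified acceptance criteria	⭐⭐⭐⭐⭐	Every feature has explicit KPIs

Overall assessment: 🟢 Highly feasible (95/100)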

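2.2 TDD Suitability Analysis

Module type	TDD suitability	Recommended strategy
Pure business logic	⭐⭐⭐⭐⭐	Strict TDD (tests first)
API endpoints	⭐⭐⭐⭐⭐	Contract tests + TDD
Data models	⭐⭐⭐⭐	TDD + schema validation
Rule engine	⭐⭐⭐⭐⭐	Table-driven tests + TDD
AI model calls	⭐⭐⭐	Hybrid mode (input/output validation)
AI prompts	⭐⭐	Validation against a labeled test set
UI components	⭐⭐⭐⭐	Component-level TDD
E2E flows	⭐⭐	BDD + E2E tests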

2.3 Choice of TDD Model

The recommended model is "layered hybrid TDD":

┌──────────────────────────────────────────────────────────────────┐
│                     Layered Hybrid TDD Model                     │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ┌──────────────────────────────────────────────────────────┐   │
│   │  Layer 1: Strict TDD (100% coverage)                     │   │
│   │  • Utility functions (utils)                             │   │
│   │  • Data validators (validators)                          │   │
│   │  • Rule engine                                           │   │
│   │  • Business logic services (services)                    │   │
│   └──────────────────────────────────────────────────────────┘   │
│                                ↓                                 │
│   ┌──────────────────────────────────────────────────────────┐   │
│   │  Layer 2: Contract-first                                 │   │
│   │  • API endpoints → define OpenAPI first                  │   │
│   │  • Data models → define the schema first                 │   │
│   │  • WebSocket messages → define the message format first  │   │
│   └──────────────────────────────────────────────────────────┘   │
│                                ↓                                 │
│   ┌──────────────────────────────────────────────────────────┐   │
│   │  Layer 3: Labeled-set validation (AI models)             │   │
│   │  • ASR/OCR/CV → labeled test set + threshold checks      │   │
│   │  • LLM prompts → few-shot examples + periodic regression │   │
│   │  • Vector retrieval → recall/precision evaluation        │   │
│   └──────────────────────────────────────────────────────────┘   │
│                                ↓                                 │
│   ┌──────────────────────────────────────────────────────────┐   │
│   │  Layer 4: Behavior-driven (BDD + E2E)                    │   │
│   │  • User stories → Playwright E2E                         │   │
│   │  • Critical paths → smoke tests                          │   │
│   └──────────────────────────────────────────────────────────┘   │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

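3. Test Pyramid Architecture

3.1 Distribution Across Test Levels

                        ┌─────────────┐
                        │  E2E tests  │  5%
                        │ (Playwright)│
                        └──────┬──────┘
                               │
                    ┌──────────┴──────────┐
                    │  Integration tests  │  20%
                    │(API + DB + external)│
                    └──────────┬──────────┘
                               │
       ┌───────────────────────┴───────────────────────┐
       │                  Unit tests                   │  75%
       │ (functions, classes, components, pure logic)  │
       └───────────────────────────────────────────────┘

3.2 Responsibilities of Each Level

Level	Share	Scope	Run frequency	Run time
Unit tests	75%	Functions/classes/components/pure logic	Every commit	< 30 seconds
Integration tests	20%	API/DB/message queue/external services	Every PR	< 5 minutes
E2E tests	5%	Complete user flows	Daily/before release	< 15 minutes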

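3.3 SmartAudit Test Layers in Detail

┌────────────────────────────────────────────────────────────────────────┐
│                         SmartAudit Test Layers                         │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  [E2E layer] Playwright                                                │
│  ├─ Creator flow: upload video → await review → view result → appeal   │
│  ├─ Agency flow: configure Brief → review video → reject/approve       │
│  └─ Brand flow: view dashboard → configure rules → approve forced pass │
│                                                                        │
│  [Integration layer] pytest + TestContainers                           │
│  ├─ API endpoint tests (httpx TestClient)                              │
│  ├─ Database integration tests (PostgreSQL + pgvector)                 │
│  ├─ Redis cache tests                                                  │
│  ├─ Celery task tests (eager mode)                                     │
│  ├─ File upload tests (OSS mock)                                       │
│  └─ WebSocket push tests                                               │
│                                                                        │
│  [Unit layer]                                                          │
│  │                                                                     │
│  │  Backend (pytest)                    Frontend (Vitest)              │
│  │  ├─ Utility functions                ├─ Utility functions           │
│  │  ├─ Data validators                  ├─ Formatting functions        │
│  │  ├─ Rule engine logic                ├─ State management (Zustand)  │
│  │  ├─ Timestamp alignment algorithm    ├─ React Hooks                 │
│  │  ├─ Brief parsing logic              ├─ UI components               │
│  │  ├─ Business service methods         └─ Form validation logic       │
│  │  └─ Pydantic models                                                 │
│  │                                                                     │
│  │  AI models (labeled test sets)                                      │
│  │  ├─ ASR output format validation                                    │
│  │  ├─ OCR output format validation                                    │
│  │  ├─ CV detection result validation                                  │
│  │  ├─ LLM output parsing validation                                   │
│  │  └─ Vector similarity computation validation                        │
│  │                                                                     │
└────────────────────────────────────────────────────────────────────────┘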

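4. Backend Test Strategy

4.1 Test Framework Selection

Purpose	Tool	Notes
Test framework	pytest	De facto Python standard, rich plugin ecosystem
Async tests	pytest-asyncio	Async support for FastAPI
Coverage	pytest-cov	Coverage reports
Mock	unittest.mock / pytest-mock	Dependency mocking
Fixture	pytest fixtures	Test data management
Parametrization	@pytest.mark.parametrize	Table-driven tests
Containerized tests	TestContainers	DB/Redis/MQ isolation
API tests	httpx + TestClient	Built into FastAPI
Celery tests	celery.contrib.testing	Task testing
Snapshot tests	syrupy	JSON output validation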
4.2 Directory Structure

backend/
├── tests/
│   ├── __init__.py
│   ├── conftest.py              # Global fixtures
│   │
│   ├── unit/                    # Unit tests (75%)
│   │   ├── __init__.py
│   │   ├── test_validators.py   # Data validators
│   │   ├── test_utils.py        # Utility functions
│   │   ├── test_rule_engine.py  # Rule engine
│   │   ├── test_timestamp.py    # Timestamp alignment
│   │   ├── test_brief_parser.py # Brief parsing logic
│   │   └── services/
│   │       ├── test_auth.py
│   │       ├── test_brief.py
│   │       ├── test_video.py
│   │       └── test_report.py
│   │
│   ├── integration/             # Integration tests (20%)
│   │   ├── __init__.py
│   │   ├── conftest.py          # DB/Redis fixtures
│   │   ├── test_api_auth.py     # Auth API
│   │   ├── test_api_brief.py    # Brief API
│   │   ├── test_api_video.py    # Video API
│   │   ├── test_api_report.py   # Report API
│   │   ├── test_db_models.py    # Database models
│   │   ├── test_celery_tasks.py # Async tasks
│   │   └── test_websocket.py    # WebSocket
│   │
│   ├── ai/                      # AI model tests
│   │   ├── __init__.py
│   │   ├── conftest.py          # Test set loading
│   │   ├── test_asr.py          # ASR output validation
│   │   ├── test_ocr.py          # OCR output validation
│   │   ├── test_cv.py           # CV detection validation
│   │   ├── test_llm.py          # LLM output parsing
│   │   └── test_embedding.py    # Embedding generation validation
│   │
│   ├── e2e/                     # End-to-end tests (5%)
│   │   ├── __init__.py
│   │   └── test_workflows.py    # Complete workflows
│   │
│   └── fixtures/                # Test data
│       ├── briefs/              # Test Brief files
│       ├── videos/              # Test video files
│       ├── rules/               # Test rule sets
│       └── snapshots/           # Snapshot data
│
├── pytest.ini                   # pytest configuration
└── pyproject.toml               # Project configuration

4.3 Core Test Case Design

4.3.1 Rule Engine Tests (Table-Driven)

# tests/unit/test_rule_engine.py

import pytest
from app.services.rule_engine import RuleEngine

class TestProhibitedWordDetection:
    """Prohibited-word detection tests (table-driven)."""

    @pytest.mark.parametrize("text,expected_violations,context", [
        # Prohibited claims in an advertising context: should be detected
        ("这是全网销量第一的产品", ["全网第一"], "advertisement"),
        ("我们是行业领导者", ["行业领导者"], "advertisement"),
        ("史上最低价促销", ["史上最低价"], "advertisement"),

        # Similar words in an everyday context: should not be detected
        ("今天是我最开心的一天", [], "daily_conversation"),
        ("这是我第一次来这里", [], "daily_conversation"),

        # Edge cases
        ("", [], "advertisement"),
        ("这是一个普通的产品介绍", [], "advertisement"),

        # Multiple prohibited claims in one text
        ("全网销量第一，史上最低价", ["全网第一", "史上最低价"], "advertisement"),
    ])
    def test_prohibited_word_detection(self, text, expected_violations, context):
        """Verify the accuracy of prohibited-word detection."""
        engine = RuleEngine()
        result = engine.detect_prohibited_words(text, context=context)

        # Compare as sets: the order of reported violations does not matter
        assert set(result.violations) == set(expected_violations)
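
The table above expects canonical rule labels rather than literal substrings: "这是全网销量第一的产品" must yield "全网第一" even though that exact string never occurs in the text. One way to satisfy this is sketched below, assuming regex rules keyed by label and a simple context gate. `PROHIBITED_PATTERNS`, `DetectionResult`, and the free function `detect_prohibited_words` are hypothetical names for illustration, not the project's actual `RuleEngine`.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rule table: canonical label -> pattern that tolerates filler words.
PROHIBITED_PATTERNS = {
    "全网第一": re.compile(r"全网.{0,4}第一"),
    "行业领导者": re.compile(r"行业领导者"),
    "史上最低价": re.compile(r"史上最低价"),
}


@dataclass
class DetectionResult:
    violations: list = field(default_factory=list)


def detect_prohibited_words(text: str, context: str) -> DetectionResult:
    """Return the canonical labels of all rules matched in an advertising context."""
    if context != "advertisement":
        # Assumption for this sketch: rules apply to advertising copy only.
        return DetectionResult()
    hits = [label for label, pattern in PROHIBITED_PATTERNS.items() if pattern.search(text)]
    return DetectionResult(violations=hits)
```

Keeping the label separate from the pattern lets the test table stay stable while patterns are tuned; a real engine would likely need richer context handling than a single string comparison.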

4.3.2 Timestamp Alignment Algorithm Tests

# tests/unit/test_timestamp.py

import pytest
from app.utils.timestamp_align import TimestampAligner

class TestMultiModalAlignment:
    """Multimodal timestamp alignment tests."""

    @pytest.fixture
    def aligner(self):
        return TimestampAligner(tolerance_ms=500)

    @pytest.mark.parametrize("asr_ts,ocr_ts,cv_ts,expected_merged", [
        # Perfectly aligned
        (1000, 1000, 1000, 1000),
        # Aligned within tolerance
        (1000, 1200, 1100, 1100),  # take the median
        # Outside tolerance
        (1000, 2000, 3000, None),  # not merged
    ])
    def test_timestamp_alignment(self, aligner, asr_ts, ocr_ts, cv_ts, expected_merged):
        """Verify the timestamp alignment logic."""
        events = [
            {"source": "asr", "timestamp_ms": asr_ts, "content": "test"},
            {"source": "ocr", "timestamp_ms": ocr_ts, "content": "test"},
            {"source": "cv", "timestamp_ms": cv_ts, "content": "logo_detected"},
        ]

        merged = aligner.merge_events(events)

        # Compare against None explicitly so that a merged timestamp of 0 is not treated as "unmerged"
        if expected_merged is not None:
            assert len(merged) == 1
            assert merged[0]["timestamp_ms"] == expected_merged
        else:
            assert len(merged) == 3  # not merged

    def test_duration_calculation_accuracy(self, aligner):
        """Verify that the duration error is ≤ 0.5 seconds."""
        events = [
            {"timestamp_ms": 0, "type": "product_appear"},
            {"timestamp_ms": 5500, "type": "product_disappear"},
        ]

        duration = aligner.calculate_duration(events)

        # Error must be ≤ 500 ms
        assert abs(duration - 5500) <= 500
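
The parametrized cases above pin down the merge behavior: events within the tolerance collapse into one event at the median timestamp, and events farther apart stay separate. A minimal sketch of one algorithm that satisfies them follows, assuming greedy clustering on sorted timestamps. The free function `merge_events` and the `sources`/`events` keys on merged events are illustrative assumptions, not the actual `TimestampAligner` API.

```python
from statistics import median


def merge_events(events, tolerance_ms=500):
    """Greedily cluster events whose timestamps lie within tolerance_ms of the cluster start."""
    clusters = []
    for event in sorted(events, key=lambda e: e["timestamp_ms"]):
        if clusters and event["timestamp_ms"] - clusters[-1][0]["timestamp_ms"] <= tolerance_ms:
            clusters[-1].append(event)
        else:
            clusters.append([event])

    merged = []
    for cluster in clusters:
        if len(cluster) == 1:
            merged.append(cluster[0])  # nothing to merge; keep the event unchanged
            continue
        merged.append({
            "timestamp_ms": median(e["timestamp_ms"] for e in cluster),
            "sources": [e["source"] for e in cluster],
            "events": cluster,
        })
    return merged
```

Anchoring each cluster to its first event keeps the span of a merged group bounded by the tolerance; chaining on the previous event instead would let long runs of closely spaced events drift into one cluster.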

4.3.3 Brief Parsing Tests

# tests/unit/test_brief_parser.py

import pytest
from app.services.brief_parser import BriefParser

class TestBriefParsing:
    """Brief parsing logic tests."""

    @pytest.fixture
    def parser(self):
        return BriefParser()

    def test_extract_selling_points(self, parser):
        """Verify selling-point extraction."""
        brief_content = """
        产品核心卖点：
        1. 24小时持妆
        2. 天然成分
        3. 敏感肌适用
        """

        result = parser.extract_selling_points(brief_content)

        assert "24小时持妆" in result.selling_points
        assert "天然成分" in result.selling_points
        assert "敏感肌适用" in result.selling_points

    def test_extract_prohibited_words(self, parser):
        """Verify extraction of words the Brief forbids."""
        brief_content = """
        禁止使用的词汇：
        - 药用
        - 治疗
        - 根治
        """

        result = parser.extract_prohibited_words(brief_content)

        assert set(result.prohibited_words) == {"药用", "治疗", "根治"}

    def test_conflict_detection(self, parser):
        """Verify detection of conflicts between the Brief and platform rules."""
        brief_rules = {"allowed_words": ["最佳效果"]}
        platform_rules = {"prohibited_words": ["最佳"]}

        conflicts = parser.detect_conflicts(brief_rules, platform_rules)

        assert len(conflicts) == 1
        assert "最佳效果" in conflicts[0]["conflicting_term"]
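
The conflict test above implies a containment check: a term the Brief allows conflicts with platform rules when it contains a platform-prohibited word. A minimal sketch under that assumption follows; the free function and the `prohibited_word` key are illustrative, and only `conflicting_term` comes from the test.

```python
def detect_conflicts(brief_rules: dict, platform_rules: dict) -> list:
    """Flag Brief-allowed terms that contain a platform-prohibited word."""
    conflicts = []
    for term in brief_rules.get("allowed_words", []):
        for banned in platform_rules.get("prohibited_words", []):
            # Skip empty strings: "" is a substring of every term
            if banned and banned in term:
                conflicts.append({"conflicting_term": term, "prohibited_word": banned})
    return conflicts
```

Reporting the prohibited word alongside the conflicting term would let a reviewer see why the Brief entry was flagged without re-running the check.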

4.4 Integration Test Strategy

4.4.1 Database Integration Tests

# tests/integration/conftest.py

import pytest
from testcontainers.postgres import PostgresContainer
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from app.db.base import Base  # assumed location of the SQLAlchemy declarative Base

@pytest.fixture(scope="session")
def postgres_container():
    """Start a PostgreSQL test container."""
    with PostgresContainer("postgres:15-alpine") as postgres:
        yield postgres

@pytest.fixture(scope="function")
def db_session(postgres_container):
    """Provide an independent database session for each test function."""
    engine = create_engine(postgres_container.get_connection_url())
    Session = sessionmaker(bind=engine)
    session = Session()

    # Create tables (a no-op when they already exist)
    Base.metadata.create_all(engine)

    yield session

    # Cleanup: discard uncommitted changes, then release the connections
    session.rollback()
    session.close()
    engine.dispose()

4.4.2 API Integration Tests

# tests/integration/test_api_brief.py

import pytest
import pytest_asyncio
from httpx import ASGITransport, AsyncClient
from app.main import app

@pytest.mark.asyncio
class TestBriefAPI:
    """Brief API integration tests."""

    @pytest_asyncio.fixture
    async def client(self):
        # httpx 0.28 removed the `app=` shortcut; route requests through ASGITransport instead
        async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as ac:
            yield ac

    @pytest_asyncio.fixture
    async def auth_headers(self, client):
        """Log in and return the authorization header."""
        response = await client.post("/auth/login", json={
            "username": "test_agency",
            "password": "password"
        })
        token = response.json()["access_token"]
        return {"Authorization": f"Bearer {token}"}

    async def test_upload_brief_pdf(self, client, auth_headers, tmp_path):
        """Test uploading a Brief PDF."""
        # Prepare the test file
        test_pdf = tmp_path / "test_brief.pdf"
        test_pdf.write_bytes(b"%PDF-1.4 test content")

        with open(test_pdf, "rb") as f:
            response = await client.post(
                "/api/v1/briefs/upload",
                files={"file": ("test.pdf", f, "application/pdf")},
                headers=auth_headers
            )

        assert response.status_code == 202
        assert "task_id" in response.json()

    async def test_get_brief_parsing_result(self, client, auth_headers):
        """Test fetching the Brief parsing result."""
        # Assumes a Brief with this ID has already been parsed
        brief_id = "test-brief-id"

        response = await client.get(
            f"/api/v1/briefs/{brief_id}",
            headers=auth_headers
        )

        assert response.status_code == 200
        result = response.json()
        assert "selling_points" in result
        assert "prohibited_words" in result

4.5 Celery Async Task Tests

# tests/integration/test_celery_tasks.py

import pytest
from unittest.mock import patch, MagicMock
from app.tasks.video_auditing import audit_video_task

class TestVideoAuditingTask:
    """Video auditing async task tests."""

    @pytest.fixture
    def mock_ai_services(self):
        """Mock all AI services."""
        with patch("app.tasks.video_auditing.ASRService") as mock_asr, \
             patch("app.tasks.video_auditing.OCRService") as mock_ocr, \
             patch("app.tasks.video_auditing.CVService") as mock_cv:

            mock_asr.return_value.transcribe.return_value = {
                "text": "这是测试文本",
                "timestamps": [{"start": 0, "end": 1000, "text": "这是测试文本"}]
            }

            mock_ocr.return_value.extract.return_value = {
                "frames": [{"timestamp": 500, "text": "字幕内容"}]
            }

            mock_cv.return_value.detect.return_value = {
                "logos": [],
                "objects": [{"timestamp": 500, "object": "product"}]
            }

            yield {"asr": mock_asr, "ocr": mock_ocr, "cv": mock_cv}

    def test_video_audit_task_success(self, mock_ai_services, db_session):
        """Test that the video auditing task completes successfully."""
        task_id = "test-task-id"
        video_url = "https://test.oss.com/test.mp4"
        brief_id = "test-brief-id"

        # apply() runs the task locally and synchronously, so no worker or broker is needed
        result = audit_video_task.apply(
            args=[task_id, video_url, brief_id]
        ).get()

        assert result["status"] == "completed"
        assert "report" in result
        assert "risk_items" in result["report"]

    def test_video_audit_task_with_violations(self, mock_ai_services, db_session):
        """Test handling when a violation is detected."""
        # Override the mock return value to simulate a prohibited claim in the transcript
        mock_ai_services["asr"].return_value.transcribe.return_value = {
            "text": "这是全网销量第一的产品",
            "timestamps": [{"start": 0, "end": 1000, "text": "这是全网销量第一的产品"}]
        }

        result = audit_video_task.apply(
            args=["test-task", "url", "brief-id"]
        ).get()

        assert result["status"] == "completed"
        assert len(result["report"]["risk_items"]) > 0
        assert any(
            item["type"] == "prohibited_word"
            for item in result["report"]["risk_items"]
        )

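5. Frontend Test Strategy

5.1 Test Framework Selection

Purpose	Tool	Notes
Unit test framework	Vitest	Vite-native, very fast
Component tests	@testing-library/react	Driven by user behavior
DOM assertions	@testing-library/jest-dom	Extended matchers
Mock service	MSW (Mock Service Worker)	API interception
E2E tests	Playwright	Cross-browser
Visual regression	Percy / Chromatic	Screenshot comparison
Coverage	@vitest/coverage-v8	Coverage reports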
5.2 Directory Structure

frontend/
├── src/
│   ├── components/
│   │   ├── Button/
│   │   │   ├── Button.tsx
│   │   │   ├── Button.test.tsx        # Component tests
│   │   │   └── Button.stories.tsx     # Storybook (optional)
│   │   └── ...
│   │
│   ├── hooks/
│   │   ├── useAuth.ts
│   │   ├── useAuth.test.ts            # Hook tests
│   │   └── ...
│   │
│   ├── services/
│   │   ├── api.ts
│   │   ├── api.test.ts                # Service tests
│   │   └── ...
│   │
│   ├── store/
│   │   ├── auth.ts
│   │   ├── auth.test.ts               # State tests
│   │   └── ...
│   │
│   └── lib/
│       ├── utils.ts
│       ├── utils.test.ts              # Utility function tests
│       └── ...
│
├── tests/
│   ├── setup.ts                       # Global test setup
│   ├── mocks/
│   │   ├── handlers.ts                # MSW handlers
│   │   └── server.ts                  # MSW server
│   │
│   ├── integration/                   # Integration tests
│   │   ├── BriefUpload.test.tsx
│   │   ├── VideoUpload.test.tsx
│   │   └── ReviewDashboard.test.tsx
│   │
│   └── e2e/                           # Playwright E2E
│       ├── creator-flow.spec.ts
│       ├── agency-flow.spec.ts
│       └── brand-flow.spec.ts
│
├── vitest.config.ts                   # Vitest configuration
├── playwright.config.ts               # Playwright configuration
└── package.json

5.3 Unit Test Examples

5.3.1 Utility Function Tests

// src/lib/utils.test.ts

import { describe, it, expect } from 'vitest'
import {
  formatDuration,
  formatTimestamp,
  truncateText,
  validateVideoFile
} from './utils'

describe('formatDuration', () => {
  it('formats seconds as mm:ss', () => {
    expect(formatDuration(65)).toBe('01:05')
    expect(formatDuration(3661)).toBe('61:01')
    expect(formatDuration(0)).toBe('00:00')
  })

  it('clamps negative values to 00:00', () => {
    expect(formatDuration(-10)).toBe('00:00')
  })
})

describe('formatTimestamp', () => {
  it('formats milliseconds as HH:MM:SS.mmm', () => {
    expect(formatTimestamp(1500)).toBe('00:00:01.500')
    expect(formatTimestamp(3661500)).toBe('01:01:01.500')
  })
})

describe('validateVideoFile', () => {
  it('accepts a valid MP4 file', () => {
    const file = new File([''], 'test.mp4', { type: 'video/mp4' })
    Object.defineProperty(file, 'size', { value: 50 * 1024 * 1024 }) // 50MB

    const result = validateVideoFile(file)

    expect(result.valid).toBe(true)
  })

  it('rejects files larger than 100MB', () => {
    const file = new File([''], 'test.mp4', { type: 'video/mp4' })
    Object.defineProperty(file, 'size', { value: 150 * 1024 * 1024 }) // 150MB

    const result = validateVideoFile(file)

    expect(result.valid).toBe(false)
    expect(result.error).toContain('100MB')
  })

  it('rejects non-video formats', () => {
    const file = new File([''], 'test.pdf', { type: 'application/pdf' })

    const result = validateVideoFile(file)

    expect(result.valid).toBe(false)
    // The error message is a user-facing Chinese string containing "格式" (format)
    expect(result.error).toContain('格式')
  })
})

5.3.2 React Hook Tests

// src/hooks/useAuth.test.ts

import { renderHook, act, waitFor } from '@testing-library/react'
import { describe, it, expect, beforeEach } from 'vitest'
import { useAuth } from './useAuth'
// tests/ sits next to src/ in the frontend layout, so it is two levels up from src/hooks/
import { server } from '../../tests/mocks/server'
// MSW v1 API; MSW v2 replaces `rest` with `http` and `HttpResponse`
import { rest } from 'msw'

describe('useAuth', () => {
  beforeEach(() => {
    localStorage.clear()
  })

  it('starts in the logged-out state', () => {
    const { result } = renderHook(() => useAuth())

    expect(result.current.isAuthenticated).toBe(false)
    expect(result.current.user).toBeNull()
  })

  it('updates state after a successful login', async () => {
    const { result } = renderHook(() => useAuth())

    await act(async () => {
      await result.current.login('test@example.com', 'password')
    })

    await waitFor(() => {
      expect(result.current.isAuthenticated).toBe(true)
      expect(result.current.user?.email).toBe('test@example.com')
    })
  })

  it('throws when login fails', async () => {
    // Override the default handler so the API returns an error
    server.use(
      rest.post('/api/auth/login', (req, res, ctx) => {
        return res(ctx.status(401), ctx.json({ error: '密码错误' }))
      })
    )

    const { result } = renderHook(() => useAuth())

    await expect(
      act(async () => {
        await result.current.login('test@example.com', 'wrong')
      })
    ).rejects.toThrow('密码错误')
  })

  it('clears state after logout', async () => {
    const { result } = renderHook(() => useAuth())

    // Log in first
    await act(async () => {
      await result.current.login('test@example.com', 'password')
    })

    // Then log out
    act(() => {
      result.current.logout()
    })

    expect(result.current.isAuthenticated).toBe(false)
    expect(result.current.user).toBeNull()
  })
})

5.3.3 Zustand State Tests

// src/store/upload.test.ts

import { describe, it, expect, beforeEach } from 'vitest'
import { useUploadStore } from './upload'

describe('useUploadStore', () => {
  beforeEach(() => {
    // Reset the store so tests do not leak state into each other
    useUploadStore.setState({
      files: [],
      uploadProgress: {},
      isUploading: false,
    })
  })

  it('adds a file to the upload queue', () => {
    const file = new File(['test'], 'test.mp4', { type: 'video/mp4' })

    useUploadStore.getState().addFile(file)

    expect(useUploadStore.getState().files).toHaveLength(1)
    expect(useUploadStore.getState().files[0].name).toBe('test.mp4')
  })

  it('updates upload progress', () => {
    const fileId = 'file-123'

    useUploadStore.getState().updateProgress(fileId, 50)

    expect(useUploadStore.getState().uploadProgress[fileId]).toBe(50)
  })

  it('removes a completed file', () => {
    const file = new File(['test'], 'test.mp4', { type: 'video/mp4' })
    useUploadStore.getState().addFile(file)

    const fileId = useUploadStore.getState().files[0].id
    useUploadStore.getState().removeFile(fileId)

    expect(useUploadStore.getState().files).toHaveLength(0)
  })
})

5.3.4 Component Tests

// src/components/video/VideoUpload.test.tsx

import { render, screen, fireEvent, waitFor } from '@testing-library/react'
import { describe, it, expect, vi } from 'vitest'
import { VideoUpload } from './VideoUpload'

// The regular expressions below match the component's user-facing Chinese copy

describe('VideoUpload', () => {
  it('renders the upload area', () => {
    render(<VideoUpload onUpload={vi.fn()} />)

    expect(screen.getByText(/拖拽视频到此处/)).toBeInTheDocument()
    expect(screen.getByText(/支持 MP4、MOV 格式/)).toBeInTheDocument()
  })

  it('triggers an upload when a file is dropped', async () => {
    const onUpload = vi.fn()
    render(<VideoUpload onUpload={onUpload} />)

    const dropzone = screen.getByTestId('dropzone')
    const file = new File(['video content'], 'test.mp4', { type: 'video/mp4' })

    fireEvent.drop(dropzone, {
      dataTransfer: { files: [file] }
    })

    await waitFor(() => {
      expect(onUpload).toHaveBeenCalledWith(file)
    })
  })

  it('rejects oversized files and shows an error', async () => {
    const onUpload = vi.fn()
    render(<VideoUpload onUpload={onUpload} maxSize={100 * 1024 * 1024} />)

    const file = new File([''], 'large.mp4', { type: 'video/mp4' })
    Object.defineProperty(file, 'size', { value: 150 * 1024 * 1024 })

    const dropzone = screen.getByTestId('dropzone')
    fireEvent.drop(dropzone, {
      dataTransfer: { files: [file] }
    })

    await waitFor(() => {
      expect(screen.getByText(/文件大小不能超过 100MB/)).toBeInTheDocument()
      expect(onUpload).not.toHaveBeenCalled()
    })
  })

  it('shows upload progress', async () => {
    render(<VideoUpload onUpload={vi.fn()} initialProgress={45} />)

    expect(screen.getByRole('progressbar')).toHaveAttribute('aria-valuenow', '45')
    expect(screen.getByText('45%')).toBeInTheDocument()
  })
})

5.4 MSW Mock Service Configuration

// tests/mocks/handlers.ts

// Written against the MSW v1 API (`rest`, `res`, `ctx`);
// MSW v2 replaces these with `http` and `HttpResponse`
import { rest } from 'msw'

export const handlers = [
  // Auth API
  rest.post('/api/auth/login', async (req, res, ctx) => {
    const { email, password } = await req.json()

    if (password === 'password') {
      return res(ctx.json({
        access_token: 'mock-token',
        user: { id: '1', email, role: 'agency' }
      }))
    }

    return res(ctx.status(401), ctx.json({ error: '密码错误' }))
  }),

  // Brief API
  rest.get('/api/v1/briefs/:id', (req, res, ctx) => {
    return res(ctx.json({
      id: req.params.id,
      selling_points: ['24小时持妆', '天然成分'],
      prohibited_words: ['药用', '治疗'],
      status: 'parsed'
    }))
  }),

  // Video upload API
  rest.post('/api/v1/videos/upload', async (req, res, ctx) => {
    return res(ctx.json({
      task_id: 'mock-task-id',
      status: 'processing'
    }))
  }),

  // WebSocket
  // Note: these HTTP handlers do not cover WebSocket traffic (MSW v1 has no
  // WebSocket support), so it needs a separate mock
]

// tests/mocks/server.ts

import { setupServer } from 'msw/node'
import { handlers } from './handlers'

export const server = setupServer(...handlers)

// tests/setup.ts

import { beforeAll, afterEach, afterAll } from 'vitest'
import { server } from './mocks/server'
// The /vitest entry point registers the matchers on Vitest's own `expect`
import '@testing-library/jest-dom/vitest'

beforeAll(() => server.listen())
afterEach(() => server.resetHandlers())
afterAll(() => server.close())

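5.5 Summary of Frontend Test Automation

Test type	Tool	Scope	When it runs
Unit tests	Vitest	Utility functions, Hooks, Store	Every commit
Component tests	Vitest + RTL	UI component behavior	Every commit
Integration tests	Vitest + MSW	Page-level interaction	Every PR
E2E tests	Playwright	Complete user flows	Daily/before release
Visual regression	Percy/Chromatic	Changes in UI appearance	Every PR
Compatibility tests	BrowserStack	Cross-browser/device	Before release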

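6. AI Model Test Strategy

6.1 What Makes AI Testing Different

AI model testing differs fundamentally from traditional unit testing:

Dimension	Traditional testing	AI model testing
Output determinism	Deterministic output	Probabilistic output
Verification method	Exact match	Threshold check
Test data	A few hand-built cases	Large labeled sets
Regression detection	Assertion failure	Metric decline
Maintenance cost	Low	Requires continuous updates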

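6.2 AI Test Layers

┌────────────────────────────────────────────────────────────────┐
│                      AI Model Test Layers                      │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│   [Layer 1: Interface contract tests]                          │
│   • Input format validation                                    │
│   • Output structure validation                                │
│   • Error handling validation                                  │
│   → pytest + JSON Schema validation                            │
│                                                                │
│   [Layer 2: Functional correctness tests]                      │
│   • Labeled test set validation                                │
│   • Edge-case coverage                                         │
│   • Threshold compliance checks                                │
│   → labeled data + metric computation                          │
│                                                                │
│   [Layer 3: Regression tests]                                  │
│   • Metric comparison after model updates                      │
│   • Behavior verification after prompt changes                 │
│   • Continuous coverage of newly added cases                   │
│   → MLflow + version comparison                                │
│                                                                │
│   [Layer 4: Adversarial tests]                                 │
│   • Edge inputs (long text, special characters, empty input)   │
│   • Adversarial samples (deliberate detection evasion)         │
│   • Stress tests (high concurrency, large files)               │
│   → fuzzing + manually designed cases                          │
│                                                                │
└────────────────────────────────────────────────────────────────┘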
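
Layer 3 treats a regression as a metric decline rather than a failed assertion. The comparison step can be sketched in plain Python, assuming baseline and current metrics are available as dictionaries (for example, exported from whatever tracking store is used). `find_regressions`, the `higher_is_better` map, and the default tolerance are hypothetical choices for illustration, not part of MLflow or the project.

```python
def find_regressions(baseline, current, higher_is_better, tolerance=0.01):
    """Return the names of metrics that got worse than the baseline by more than `tolerance`."""
    regressions = []
    for name, base_value in baseline.items():
        delta = current[name] - base_value
        if not higher_is_better[name]:
            delta = -delta  # for error rates, an increase is the regression
        if delta < -tolerance:
            regressions.append(name)
    return regressions


# Illustrative numbers only: recall dropped by 3 points, error rate rose by 0.5 points
baseline = {"recall": 0.96, "error_rate": 0.08}
current = {"recall": 0.93, "error_rate": 0.085}
direction = {"recall": True, "error_rate": False}
flagged = find_regressions(baseline, current, direction)
```

A tolerance band keeps run-to-run noise from failing the build, while the absolute thresholds in the per-module tests still guard the floor.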

6.3 Test Strategy per AI Module

6.3.1 ASR Speech Recognition Tests

# tests/ai/test_asr.py

import pytest
from app.ai.asr import ASRService
# Assumed project helpers; the module path is illustrative
from tests.ai.helpers import calculate_wer, get_audio_duration, load_labeled_dataset

class TestASRService:
    """ASR speech recognition tests."""

    @pytest.fixture
    def asr(self):
        return ASRService()

    @pytest.fixture
    def test_audio_samples(self):
        """Load the labeled test set."""
        return load_labeled_dataset("tests/fixtures/asr_samples/")

    def test_output_format(self, asr):
        """Verify the output format contract."""
        result = asr.transcribe("tests/fixtures/sample.wav")

        # Required fields
        assert "text" in result
        assert "timestamps" in result
        assert isinstance(result["timestamps"], list)

        # Timestamp format
        for ts in result["timestamps"]:
            assert "start" in ts
            assert "end" in ts
            assert "text" in ts
            assert ts["end"] >= ts["start"]

    def test_character_error_rate(self, asr, test_audio_samples):
        """Verify the character error rate is ≤ 10%."""
        total_errors = 0
        total_chars = 0

        for sample in test_audio_samples:
            result = asr.transcribe(sample["audio_path"])
            # calculate_wer is assumed to return a character-level rate: Chinese text has no
            # whitespace word boundaries, so str.split() would count a transcript as one word
            rate = calculate_wer(result["text"], sample["ground_truth"])
            ref_len = len(sample["ground_truth"])
            total_errors += rate * ref_len
            total_chars += ref_len

        overall_rate = total_errors / total_chars

        assert overall_rate <= 0.10, f"Error rate {overall_rate:.2%} exceeds the 10% threshold"

    def test_timestamp_accuracy(self, asr, test_audio_samples):
        """Verify timestamp accuracy."""
        for sample in test_audio_samples:
            result = asr.transcribe(sample["audio_path"])

            # The last segment should end close to the audio duration
            audio_duration = get_audio_duration(sample["audio_path"])
            last_timestamp = result["timestamps"][-1]["end"]

            # Allow a 500 ms error (duration is in seconds, timestamps in milliseconds)
            assert abs(last_timestamp - audio_duration * 1000) <= 500
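
The error-rate helper used by the ASR test is not defined in this document. A minimal sketch of one possible implementation follows, assuming character-level scoring because Chinese transcripts have no whitespace word boundaries. `edit_distance` and `character_error_rate` are hypothetical names, and a production setup might use an established ASR metrics package instead.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences, using a single DP row."""
    row = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        prev_diag, row[0] = row[0], i
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            # deletion, insertion, substitution (or match)
            prev_diag, row[j] = row[j], min(row[j] + 1, row[j - 1] + 1, prev_diag + cost)
    return row[-1]


def character_error_rate(samples):
    """Corpus-level CER over (reference, hypothesis) pairs: total edits / total reference chars."""
    errors = sum(edit_distance(list(ref), list(hyp)) for ref, hyp in samples)
    total = sum(len(ref) for ref, _ in samples)
    return errors / total if total else 0.0
```

Summing edits and reference lengths across the corpus, rather than averaging per-sample rates, keeps short clips from dominating the overall figure.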

6.3.2 违禁词检测测试

# tests/ai/test_prohibited_words.py

import pytest
from app.ai.nlp import ProhibitedWordDetector

class TestProhibitedWordDetector:
    """Tests for prohibited word detection."""

    @pytest.fixture
    def detector(self):
        return ProhibitedWordDetector()

    @pytest.fixture
    def labeled_dataset(self):
        """
        Labeled dataset format (the sample text reads "This is the
        best-selling product on the entire internet"):
        {
            "text": "这是全网销量第一的产品",
            "context": "advertisement",
            "expected_violations": ["全网第一"],
            "should_block": true
        }
        """
        return load_labeled_dataset("tests/fixtures/prohibited_words/")

    def test_recall_rate(self, detector, labeled_dataset):
        """Verify recall >= 95%."""
        true_positives = 0
        false_negatives = 0

        for sample in labeled_dataset:
            # Recall is computed over samples that contain violations
            if not sample["expected_violations"]:
                continue

            result = detector.detect(sample["text"], sample["context"])
            detected = set(result.violations)
            expected = set(sample["expected_violations"])

            true_positives += len(detected & expected)
            false_negatives += len(expected - detected)

        assert true_positives + false_negatives > 0, "No positive samples in the dataset"
        recall = true_positives / (true_positives + false_negatives)

        assert recall >= 0.95, f"Recall {recall:.2%} is below the 95% threshold"

    def test_false_positive_rate(self, detector, labeled_dataset):
        """Verify false positive rate <= 5%."""
        false_positives = 0
        true_negatives = 0

        # Only test samples that should contain no violations
        negative_samples = [
            s for s in labeled_dataset
            if not s["expected_violations"]
        ]
        assert negative_samples, "No negative samples in the dataset"

        for sample in negative_samples:
            result = detector.detect(sample["text"], sample["context"])

            if result.violations:
                false_positives += 1
            else:
                true_negatives += 1

        fpr = false_positives / (false_positives + true_negatives)

        assert fpr <= 0.05, f"False positive rate {fpr:.2%} exceeds the 5% threshold"

    def test_context_awareness(self, detector):
        """Verify context awareness."""
        # "This is the happiest day of my life": a superlative in everyday speech
        text = "这是我最开心的一天"

        # Advertising context: must not be flagged
        result_ad = detector.detect(text, context="advertisement")
        assert len(result_ad.violations) == 0, "Everyday phrase flagged in advertising context"

        # Everyday context: must not be flagged
        result_daily = detector.detect(text, context="daily_conversation")
        assert len(result_daily.violations) == 0, "Everyday phrase flagged"
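
The recall and false-positive tests above share the same counting logic, so it may be worth factoring into one helper. The sketch below is illustrative only: `KeywordDetector`, `DetectionResult`, and `recall_and_fpr` are hypothetical names, and the substring matcher merely stands in for the real `ProhibitedWordDetector` so that the metric code can be exercised in isolation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DetectionResult:
    """Hypothetical result object exposing a `violations` list."""
    violations: List[str] = field(default_factory=list)


class KeywordDetector:
    """Stand-in detector: plain case-sensitive substring matching."""

    def __init__(self, keywords):
        self.keywords = list(keywords)

    def detect(self, text, context="advertisement"):
        return DetectionResult([k for k in self.keywords if k in text])


def recall_and_fpr(detector, samples) -> Tuple[Optional[float], Optional[float]]:
    """Return (recall, false positive rate); None when a rate is undefined."""
    tp = fn = fp = tn = 0
    for s in samples:
        detected = set(detector.detect(s["text"], s["context"]).violations)
        expected = set(s["expected_violations"])
        if expected:            # positive sample: count per expected term
            tp += len(detected & expected)
            fn += len(expected - detected)
        elif detected:          # negative sample that was flagged
            fp += 1
        else:                   # negative sample correctly left alone
            tn += 1
    recall = tp / (tp + fn) if tp + fn else None
    fpr = fp / (fp + tn) if fp + tn else None
    return recall, fpr


SAMPLES = [
    {"text": "the best-selling product online", "context": "advertisement",
     "expected_violations": ["best-selling"]},
    {"text": "this cream cures acne", "context": "advertisement",
     "expected_violations": ["cures"]},
    {"text": "the happiest day of my life", "context": "daily_conversation",
     "expected_violations": []},
    {"text": "guaranteed to arrive on Monday", "context": "daily_conversation",
     "expected_violations": []},
]
```

Returning `None` for an undefined rate lets the calling test fail with a clear message about a missing sample class instead of a `ZeroDivisionError`.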

6.3.3 Logo Vector Retrieval Tests

# tests/ai/test_logo_detection.py

import pytest
from app.ai.cv import LogoDetector

class TestLogoDetector:
    """Tests for logo detection."""

    @pytest.fixture
    def detector(self):
        return LogoDetector()

    @pytest.fixture
    def logo_test_set(self):
        """
        The test set contains:
        - 200+ competitor logo images
        - Occluded, blurred, and rotated variants
        - Negative samples (images without a logo)
        """
        return load_labeled_dataset("tests/fixtures/logos/")

    def test_f1_score(self, detector, logo_test_set):
        """Verify F1 >= 0.85."""
        predictions = []
        ground_truths = []

        for sample in logo_test_set:
            result = detector.detect(sample["image_path"])
            predictions.append(result.detected_logos)
            ground_truths.append(sample["ground_truth_logos"])

        f1 = calculate_f1(predictions, ground_truths)

        assert f1 >= 0.85, f"F1 {f1:.2f} is below the 0.85 threshold"

    def test_partial_occlusion(self, detector, logo_test_set):
        """Verify detection when at least 30% of the logo is occluded."""
        occluded_samples = [
            s for s in logo_test_set
            if s.get("occlusion_rate", 0) >= 0.3
        ]
        assert occluded_samples, "No occluded samples in the test set"

        correct = 0
        for sample in occluded_samples:
            result = detector.detect(sample["image_path"])
            # Compare as sets so that detection order does not matter
            if set(sample["ground_truth_logos"]) == set(result.detected_logos):
                correct += 1

        accuracy = correct / len(occluded_samples)

        # A slightly lower accuracy is acceptable for occluded samples
        assert accuracy >= 0.75, f"Accuracy on occluded samples {accuracy:.2%} is too low"

    def test_new_logo_instant_detection(self, detector):
        """Verify that a newly uploaded logo is detectable immediately."""
        # Upload a new logo
        new_logo_path = "tests/fixtures/new_competitor_logo.png"
        detector.add_logo(new_logo_path, brand="New Competitor")

        # Run detection right away
        test_frame = "tests/fixtures/frame_with_new_logo.jpg"
        result = detector.detect(test_frame)

        assert "New Competitor" in result.detected_logos
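
`test_f1_score` above relies on a `calculate_f1` helper whose definition is not given. One plausible reading is a micro-averaged F1 over per-image label sets; the sketch below assumes exactly that, and also assumes that `detected_logos` and `ground_truth_logos` are collections of brand-name strings.

```python
from typing import Iterable, Sequence


def calculate_f1(predictions: Sequence[Iterable[str]],
                 ground_truths: Sequence[Iterable[str]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all images, then compute once."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")

    tp = fp = fn = 0
    for pred, truth in zip(predictions, ground_truths):
        p, t = set(pred), set(truth)
        tp += len(p & t)   # logos correctly detected
        fp += len(p - t)   # logos reported but not present
        fn += len(t - p)   # logos present but missed

    if tp == 0:
        # No correct detections at all; report 0.0 rather than dividing by zero
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Micro-averaging weights every logo instance equally. If rare brands matter as much as common ones, a per-brand (macro) average may be the better gate; that is a design decision the plan leaves open.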

6.4 LLM Prompt Tests

# tests/ai/test_llm.py

import pytest
from app.ai.llm import LLMService

class TestLLMPrompts:
    """Tests for LLM prompts."""

    @pytest.fixture
    def llm(self):
        return LLMService()

    @pytest.fixture
    def few_shot_examples(self):
        """Few-shot example set."""
        return load_few_shot_examples("tests/fixtures/llm_examples/")

    def test_brief_parsing_output_format(self, llm):
        """Verify the output format of Brief parsing."""
        # Selling point: "24-hour wear"; prohibited terms: "medicinal", "treatment"
        brief_content = """
        产品卖点：24小时持妆
        禁止使用：药用、治疗
        """

        result = llm.parse_brief(brief_content)

        # Verify the output structure
        assert "selling_points" in result
        assert "prohibited_words" in result
        assert isinstance(result["selling_points"], list)
        assert isinstance(result["prohibited_words"], list)

    def test_context_understanding(self, llm, few_shot_examples):
        """Verify context understanding."""
        context_examples = [
            e for e in few_shot_examples
            if e["type"] == "context_understanding"
        ]
        assert context_examples, "No context_understanding examples found"

        correct = 0
        for example in context_examples:
            result = llm.classify_context(example["text"])
            if result["context"] == example["expected_context"]:
                correct += 1

        accuracy = correct / len(context_examples)

        assert accuracy >= 0.90, f"Context understanding accuracy {accuracy:.2%} is too low"

    def test_sentiment_analysis(self, llm):
        """Verify public-opinion risk detection."""
        test_cases = [
            # "This product is too greasy"
            {"text": "这个产品太油腻了", "expected_risk": "greasy"},
            # "A normal product introduction"
            {"text": "正常的产品介绍", "expected_risk": None},
            # "That's how a man should be"
            {"text": "男人就该这样", "expected_risk": "gender_bias"},
        ]

        for case in test_cases:
            result = llm.analyze_sentiment(case["text"])

            if case["expected_risk"]:
                assert result.risk_type == case["expected_risk"]
            else:
                assert result.risk_type is None
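
The structural assertions in `test_brief_parsing_output_format` could also be applied at runtime, before parsed Briefs reach the rule engine. The following is a minimal sketch under the assumption that the LLM returns either a JSON string or an already-decoded dict; `validate_brief_result` and `REQUIRED_LIST_FIELDS` are hypothetical names, not part of `LLMService`.

```python
import json

# Fields that the Brief parsing test above expects to be lists
REQUIRED_LIST_FIELDS = ("selling_points", "prohibited_words")


def validate_brief_result(raw):
    """Check structure only, never exact wording, and return the decoded dict.

    Raises ValueError on malformed JSON (json.JSONDecodeError is a
    ValueError subclass) or on a missing or mistyped field.
    """
    data = json.loads(raw) if isinstance(raw, str) else raw
    if not isinstance(data, dict):
        raise ValueError("brief result must be a JSON object")

    for name in REQUIRED_LIST_FIELDS:
        value = data.get(name)
        if not isinstance(value, list):
            raise ValueError(f"field {name!r} must be a list")
        if not all(isinstance(item, str) for item in value):
            raise ValueError(f"field {name!r} must contain only strings")
    return data
```

Validating shape rather than content keeps the check stable across prompt revisions, which matches the "verify structure, not exact content" strategy the plan recommends for non-deterministic LLM output.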

6.5 AI Test Dataset Management

tests/fixtures/
├── asr_samples/                    # ASR test set
│   ├── manifest.json               # Data manifest
│   ├── audio/
│   │   ├── sample_001.wav
│   │   └── ...
│   └── transcripts/
│       ├── sample_001.json         # Labeled transcript
│       └── ...
│
├── prohibited_words/               # Prohibited word test set
│   ├── positive_samples.json       # Samples that should be flagged
│   ├── negative_samples.json       # Samples that should not be flagged
│   └── context_samples.json        # Context-sensitivity samples
│
├── logos/                          # Logo test set
│   ├── manifest.json
│   ├── images/
│   │   ├── logo_001.jpg
│   │   └── ...
│   └── annotations/
│       ├── logo_001.json
│       └── ...
│
├── llm_examples/                   # LLM test set
│   ├── brief_parsing.json
│   ├── context_understanding.json
│   └── sentiment_analysis.json
│
└── README.md                       # Test set documentation
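
The AI tests call `load_labeled_dataset` with a fixture directory but never define it. A minimal sketch is shown below, assuming the flat layout of `prohibited_words/`, where each `*.json` file holds a JSON array of sample objects. The `source_file` key is a hypothetical addition for traceability, and manifest-based sets such as `logos/` would need a different loader.

```python
import json
from pathlib import Path


def load_labeled_dataset(directory):
    """Load every *.json file in `directory` into one flat list of samples."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"fixture directory not found: {root}")

    samples = []
    # Sort for a deterministic order so failures are reproducible
    for path in sorted(root.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            content = json.load(f)
        if not isinstance(content, list):
            raise ValueError(f"{path.name} must contain a JSON array of samples")
        for item in content:
            # Record where each sample came from to simplify debugging
            samples.append({**item, "source_file": path.name})
    return samples
```

Failing loudly on a missing directory or a malformed file avoids the quieter failure mode where a threshold test passes vacuously on an empty dataset.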

7. End-to-End Testing Strategy

7.1 E2E Testing Framework

Tool	Purpose
Playwright	Cross-browser E2E testing
@playwright/test	Test runner
playwright-report	Test reports
BrowserStack	Real-device cloud testing

7.2 Core User Flow Tests

// tests/e2e/creator-flow.spec.ts

import { test, expect } from '@playwright/test'

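test.describe('Creator end-to-end flow', () => {
  test.beforeEach(async ({ page }) => {
    // Log in with a creator account
    await page.goto('/auth/login')
    await page.fill('[name="email"]', 'creator@test.com')
    await page.fill('[name="password"]', 'password')
    await page.click('button[type="submit"]')
    await expect(page).toHaveURL('/creator/tasks')
  })

  test('upload video → wait for review → view result', async ({ page }) => {
    // The review wait below can take up to 5 minutes, which exceeds the
    // 60 s per-test timeout in playwright.config.ts, so raise it here
    test.setTimeout(6 * 60 * 1000)

    // 1. Open the upload page ("上传" = Upload)
    await page.click('text=上传')
    await expect(page).toHaveURL('/creator/upload')

    // 2. Upload the video
    const fileInput = page.locator('input[type="file"]')
    await fileInput.setInputFiles('tests/fixtures/test_video.mp4')

    // 3. Wait for the upload to finish
    await expect(page.locator('.upload-progress')).toHaveText(/100%/)

    // 4. Submit for review ("提交审核" = Submit for review)
    await page.click('button:has-text("提交审核")')

    // 5. Verify the task is in the "under review" state ("审核中")
    await expect(page.locator('.audit-status')).toHaveText(/审核中/)

    // 6. Wait for the review to finish: "已通过" (approved) or
    //    "需修改" (needs revision), up to 5 minutes
    await expect(page.locator('.audit-status')).toHaveText(
      /已通过|需修改/,
      { timeout: 300000 }
    )

    // 7. Verify the result page ("查看结果" = View result)
    await page.click('text=查看结果')
    await expect(page.locator('.result-banner')).toBeVisible()
  })

  test('appeal flow', async ({ page }) => {
    // Assumes a task in the "needs revision" state already exists
    await page.goto('/creator/tasks?status=needs_revision')
    await page.click('.task-card >> nth=0')

    // 1. Click the appeal button ("申诉" = Appeal)
    await page.click('button:has-text("申诉")')

    // 2. Fill in the reason: "This is not ad copy, it is everyday language"
    await page.fill('textarea[name="reason"]', '这不是广告用语，是日常表达')

    // 3. Submit the appeal ("提交申诉" = Submit appeal)
    await page.click('button:has-text("提交申诉")')

    // 4. Verify the success toast ("申诉已提交" = Appeal submitted)
    await expect(page.locator('.toast')).toHaveText(/申诉已提交/)
  })
})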

// tests/e2e/agency-flow.spec.ts

import { test, expect } from '@playwright/test'

test.describe('Agency end-to-end flow', () => {
  test.beforeEach(async ({ page }) => {
    // Log in with an agency account
    await page.goto('/auth/login')
    await page.fill('[name="email"]', 'agency@test.com')
    await page.fill('[name="password"]', 'password')
    await page.click('button[type="submit"]')
    await expect(page).toHaveURL('/agency/dashboard')
  })

  test('configure Brief → review video → approve', async ({ page }) => {
    // Brief parsing alone may take up to 60 s, so extend the 60 s
    // per-test timeout from playwright.config.ts
    test.setTimeout(3 * 60 * 1000)

    // 1. Upload a Brief ("Brief 管理" = Brief management, "上传 Brief" = Upload Brief)
    await page.click('text=Brief 管理')
    await page.click('button:has-text("上传 Brief")')

    const fileInput = page.locator('input[type="file"]')
    await fileInput.setInputFiles('tests/fixtures/test_brief.pdf')

    // 2. Wait for parsing to finish ("解析完成" = Parsing complete)
    await expect(page.locator('.parsing-status')).toHaveText(/解析完成/, {
      timeout: 60000
    })

    // 3. Confirm the rules ("确认规则" = Confirm rules)
    await page.click('button:has-text("确认规则")')

    // 4. Open the review console ("审核台")
    await page.click('text=审核台')
    await page.click('.pending-task >> nth=0')

    // 5. Check that the video and the checklist are shown
    await expect(page.locator('.video-player')).toBeVisible()
    await expect(page.locator('.checklist')).toBeVisible()

    // 6. Approve the review ("通过" = Approve, "确认" = Confirm)
    await page.click('button:has-text("通过")')
    await page.click('button:has-text("确认")')

    // 7. Verify the status update ("已通过" = Approved)
    await expect(page.locator('.task-status')).toHaveText(/已通过/)
  })

  test('reject video', async ({ page }) => {
    await page.goto('/agency/review')
    await page.click('.pending-task >> nth=0')

    // Tick the issues found
    await page.check('input[name="issue_0"]')
    await page.check('input[name="issue_1"]')

    // Reject ("驳回" = Reject, "确认" = Confirm)
    await page.click('button:has-text("驳回")')
    await page.click('button:has-text("确认")')

    // "已驳回" = Rejected
    await expect(page.locator('.task-status')).toHaveText(/已驳回/)
  })
})

7.3 Mobile E2E Tests

// tests/e2e/mobile-creator.spec.ts

import { test, expect, devices } from '@playwright/test'

test.use({
  ...devices['iPhone 13'],
})

test.describe('Creator H5 mobile tests', () => {
  test('upload video on mobile', async ({ page }) => {
    await page.goto('/creator/upload')

    // Verify the mobile layout
    await expect(page.locator('.bottom-nav')).toBeVisible()

    // Verify the keep-screen-awake hint
    await expect(page.locator('.wakelock-hint')).toBeVisible()

    // Simulate an upload
    const fileInput = page.locator('input[type="file"]')
    await fileInput.setInputFiles('tests/fixtures/test_video.mp4')

    // Verify the progress indicator
    await expect(page.locator('.circular-progress')).toBeVisible()
  })
})

7.4 Playwright Configuration

// playwright.config.ts

import { defineConfig, devices } from '@playwright/test'

export default defineConfig({
  testDir: './tests/e2e',
  timeout: 60000,
  retries: 2,
  workers: 4,

  reporter: [
    ['html', { outputFolder: 'playwright-report' }],
    ['junit', { outputFile: 'test-results/junit.xml' }],
  ],

  use: {
    baseURL: process.env.TEST_BASE_URL || 'http://localhost:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'on-first-retry',
  },

  projects: [
    // Desktop browsers
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },

    // Mobile
    { name: 'mobile-chrome', use: { ...devices['Pixel 5'] } },
    { name: 'mobile-safari', use: { ...devices['iPhone 13'] } },
  ],

  // Local development server
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
})

8. Implementation Roadmap

8.1 Phased Implementation Plan

┌──────────────────────────────────────────────────────────────────────┐
│                TDD Implementation Roadmap (11 weeks)                 │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Phase 0: Foundations (Week 0, in parallel)                          │
│  ┌─────────────────────────────────────────────────────────────────┐ │
│  │ • Set up the pytest/Vitest test frameworks                      │ │
│  │ • Configure the CI/CD test pipeline                             │ │
│  │ • Establish code coverage gates                                 │ │
│  │ • Prepare AI test datasets (initially 100+ samples)             │ │
│  │ • Team TDD training                                             │ │
│  └─────────────────────────────────────────────────────────────────┘ │
│                              ↓                                       │
│  Phase 1: Infrastructure + Brief Engine (Weeks 1-2)                  │
│  ┌─────────────────────────────────────────────────────────────────┐ │
│  │ Backend TDD (100% coverage):                                    │ │
│  │ • Data model tests → data model implementation                  │ │
│  │ • Validator tests → validator implementation                    │ │
│  │ • Rule engine tests → rule engine implementation                │ │
│  │ • Brief parsing tests → Brief parsing implementation            │ │
│  │                                                                 │ │
│  │ Frontend TDD (base components):                                 │ │
│  │ • Utility function tests → utility function implementation      │ │
│  │ • Base component tests → base component implementation          │ │
│  │                                                                 │ │
│  │ API mock service setup                                          │ │
│  └─────────────────────────────────────────────────────────────────┘ │
│                              ↓                                       │
│  Phase 2: Core AI Pipeline (Weeks 3-6)                               │
│  ┌─────────────────────────────────────────────────────────────────┐ │
│  │ AI model tests (labeled-set validation):                        │ │
│  │ • Build ASR/OCR/CV test sets (≥500 samples)                     │ │
│  │ • Interface contract tests → AI service wrappers                │ │
│  │ • Threshold validation tests → model tuning                     │ │
│  │                                                                 │ │
│  │ Core algorithm TDD:                                             │ │
│  │ • Timestamp alignment tests → alignment algorithm               │ │
│  │ • Multimodal fusion tests → fusion logic                        │ │
│  │                                                                 │ │
│  │ Integration tests:                                              │ │
│  │ • Celery task tests                                             │ │
│  │ • WebSocket push tests                                          │ │
│  └─────────────────────────────────────────────────────────────────┘ │
│                              ↓                                       │
│  Phase 3: UI Development (Weeks 7-9)                                 │
│  ┌─────────────────────────────────────────────────────────────────┐ │
│  │ Frontend component TDD:                                         │ │
│  │ • Component tests → component implementation                    │ │
│  │ • Hook tests → Hook implementation                              │ │
│  │ • Store tests → Store implementation                            │ │
│  │                                                                 │ │
│  │ Page integration tests:                                         │ │
│  │ • MSW mocks + page interaction tests                            │ │
│  │                                                                 │ │
│  │ E2E test skeleton:                                              │ │
│  │ • Core user-flow E2E tests (Playwright)                         │ │
│  └─────────────────────────────────────────────────────────────────┘ │
│                              ↓                                       │
│  Phase 4: Integration and Acceptance (Weeks 10-11)                   │
│  ┌─────────────────────────────────────────────────────────────────┐ │
│  │ Test completion:                                                │ │
│  │ • Complete the E2E suite                                        │ │
│  │ • Performance tests (Locust)                                    │ │
│  │ • Compatibility tests (BrowserStack)                            │ │
│  │                                                                 │ │
│  │ AI model acceptance:                                            │ │
│  │ • Run the full test set                                         │ │
│  │ • Verify that metrics meet their targets                        │ │
│  │                                                                 │ │
│  │ Regression tests:                                               │ │
│  │ • Full regression run                                           │ │
│  │ • Automated smoke tests                                         │ │
│  └─────────────────────────────────────────────────────────────────┘ │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

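8.2 Weekly Test Deliverables

Week	Test deliverables	Coverage target
Week 1	Backend framework tests, data model tests	Backend 80%
Week 2	Brief parsing tests, rule engine tests	Backend 80%
Week 3	ASR/OCR interface tests	AI module 60%
Week 4	CV detection tests, vector retrieval tests	AI module 70%
Week 5	Timestamp alignment tests, multimodal fusion tests	AI module 80%
Week 6	Celery task tests, WebSocket tests	Backend 85%
Week 7	Frontend utility function tests, Hook tests	Frontend 60%
Week 8	Frontend component tests, Store tests	Frontend 70%
Week 9	Page integration tests, E2E skeleton	Frontend 75%
Week 10	E2E completion, performance tests	E2E core paths 100%
Week 11	Compatibility tests, full regression	Overall 75%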

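9. Test Coverage Targets

9.1 Coverage Gates

Layer	Target coverage	Gate policy
Backend unit tests	≥ 80%	Blocks PR
Frontend unit tests	≥ 70%	Blocks PR
AI module tests	≥ 70%	Blocks PR
Integration tests	≥ 60%	PR warning
E2E tests	100% of core paths	Blocks release

9.2 Coverage Exemptions

The following kinds of code may be exempted from the coverage requirement:

Code type	Reason
Third-party SDK wrappers	Upstream is trusted
Environment configuration code	Validated at runtime
Logging/monitoring code	Not core logic
Migration scripts	Run once

9.3 Coverage Reporting

# .github/workflows/test.yml (coverage reporting section)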

- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v3
  with:
    files: ./coverage/coverage.xml
    fail_ci_if_error: true

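- name: Coverage Gate
  run: |
    # Read the last column of the TOTAL row: with branch coverage enabled,
    # the percentage is no longer in the fourth column
    COVERAGE=$(grep "TOTAL" coverage/coverage.txt | awk '{print $NF}' | tr -d '%')
    if [ "$COVERAGE" -lt "75" ]; then
      echo "Coverage $COVERAGE% is below threshold 75%"
      exit 1
    fi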
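
The text-scraping gate above depends on the column layout of the terminal report. As an alternative sketch, the same check can read the Cobertura-style XML that `--cov-report=xml` already produces, whose root `<coverage>` element carries a `line-rate` attribute. The script name `coverage_gate.py`, the default path, and the function names below are assumptions for illustration.

```python
import sys
import xml.etree.ElementTree as ET


def read_line_coverage(xml_path):
    """Return total line coverage in percent from a coverage.py XML report."""
    root = ET.parse(xml_path).getroot()
    # The root <coverage> element stores line coverage as a 0..1 fraction
    return float(root.attrib["line-rate"]) * 100


def check_gate(percent, threshold=75.0):
    """True when coverage meets the threshold."""
    return percent >= threshold


def main(argv):
    """Entry point for CI, e.g. `python coverage_gate.py coverage.xml 75`."""
    path = argv[1] if len(argv) > 1 else "coverage/coverage.xml"
    threshold = float(argv[2]) if len(argv) > 2 else 75.0
    percent = read_line_coverage(path)
    if not check_gate(percent, threshold):
        print(f"Coverage {percent:.1f}% is below threshold {threshold:.0f}%")
        return 1
    print(f"Coverage {percent:.1f}% meets threshold {threshold:.0f}%")
    return 0

# In a CI step this could be invoked as: sys.exit(main(sys.argv))
```

Because `--cov-fail-under=75` in the pytest configuration already enforces the same threshold, a separate gate like this is mainly useful when different modules need different thresholds.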

10. Toolchain Configuration

10.1 Backend Toolchain

# pyproject.toml

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
addopts = [
    "-v",
    "--tb=short",
    "--strict-markers",
    "-ra",
    "--cov=app",
    "--cov-report=xml",
    "--cov-report=html",
    "--cov-fail-under=75",
]
asyncio_mode = "auto"
markers = [
    "slow: marks slow tests",
    "integration: integration tests",
    "ai: AI model tests",
]

[tool.coverage.run]
branch = true
source = ["app"]
omit = [
    "*/migrations/*",
    "*/tests/*",
    "*/__init__.py",
]

[tool.coverage.report]
exclude_lines = [
    "pragma: no cover",
    "def __repr__",
    "raise NotImplementedError",
    "if TYPE_CHECKING:",
]

10.2 Frontend Toolchain

// vitest.config.ts

import { defineConfig } from 'vitest/config'
import react from '@vitejs/plugin-react'
import path from 'path'

export default defineConfig({
  plugins: [react()],
  test: {
    globals: true,
    environment: 'jsdom',
    setupFiles: ['./tests/setup.ts'],
    include: ['**/*.test.{ts,tsx}'],
    coverage: {
      provider: 'v8',
      reporter: ['text', 'json', 'html'],
      exclude: [
        'node_modules/',
        'tests/',
        '**/*.d.ts',
        '**/*.config.*',
      ],
      thresholds: {
        lines: 70,
        functions: 70,
        branches: 70,
        statements: 70,
      },
    },
  },
  resolve: {
    alias: {
      '@': path.resolve(__dirname, './src'),
    },
  },
})

10.3 CI/CD Configuration

# .github/workflows/test.yml

name: Test Suite

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main, develop]

jobs:
  backend-test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
          # Create the "test" database that DATABASE_URL points to
          POSTGRES_DB: test
        ports:
          - 5432:5432
      redis:
        image: redis:7
        ports:
          - 6379:6379

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dependencies
        run: |
          cd backend
          pip install -r requirements.txt
          pip install -r requirements-dev.txt

      - name: Run linting
        run: |
          cd backend
          ruff check .
          mypy app

      - name: Run tests
        run: |
          cd backend
          pytest --cov --cov-report=xml
        env:
          DATABASE_URL: postgresql://postgres:test@localhost/test
          REDIS_URL: redis://localhost:6379

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: backend/coverage.xml

  frontend-test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: frontend/package-lock.json

      - name: Install dependencies
        run: |
          cd frontend
          npm ci

      - name: Run linting
        run: |
          cd frontend
          npm run lint

      - name: Run tests
        run: |
          cd frontend
          npm run test:coverage

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: frontend/coverage/coverage-final.json

  e2e-test:
    runs-on: ubuntu-latest
    needs: [backend-test, frontend-test]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install Playwright
        run: |
          cd frontend
          npm ci
          npx playwright install --with-deps

      - name: Run E2E tests
        run: |
          cd frontend
          npm run test:e2e

      - name: Upload Playwright report
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-report
          path: frontend/playwright-report/

11. Team Conventions and Training

11.1 TDD Workflow

┌────────────────────────────────────────────────────────────────────┐
│                    TDD Red-Green-Refactor Cycle                    │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│   ┌─────────────┐                                                  │
│   │   🔴 RED    │  1. Write a failing test                         │
│   │   (fail)    │     • The test must be runnable                  │
│   │             │     • The test must fail                         │
│   └──────┬──────┘     • It fails because the feature is missing    │
│          │                                                         │
│          ▼                                                         │
│   ┌─────────────┐                                                  │
│   │  🟢 GREEN   │  2. Write the minimum code to pass the test      │
│   │   (pass)    │     • Do not over-engineer                       │
│   │             │     • Write only enough code to pass             │
│   └──────┬──────┘     • "Cheating" (hard-coding) is allowed        │
│          │                                                         │
│          ▼                                                         │
│   ┌─────────────┐                                                  │
│   │ 🔄 REFACTOR │  3. Refactor the code                            │
│   │  (clean up) │     • Remove duplication                         │
│   │             │     • Improve the design                         │
│   └──────┬──────┘     • Tests must still pass                      │
│          │                                                         │
│          └──────────────────────────────────────────────────────►  │
│                            loop                                    │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘
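
The cycle above can be illustrated with one small pass through all three steps. This is a toy sketch, not project code: `find_prohibited` is a hypothetical function, and plain `assert` tests are used so that the example runs without pytest.

```python
# RED: these tests are written first. Before find_prohibited exists,
# calling them fails with NameError, which is the expected "red" state.
def test_flags_text_containing_a_prohibited_word():
    assert find_prohibited("clinically proven to cure acne", ["cure"]) == ["cure"]


def test_returns_empty_list_for_clean_text():
    assert find_prohibited("a gentle daily cleanser", ["cure"]) == []


# GREEN (first pass, kept as a comment): the least code that passes,
# hard-coding included.
#
#     def find_prohibited(text, words):
#         return ["cure"] if "cure" in text else []


# REFACTOR: generalize once the tests are green, then rerun them to
# confirm that behavior has not changed.
def find_prohibited(text, words):
    """Return the prohibited words that occur in text (case-insensitive)."""
    lowered = text.lower()
    return [w for w in words if w.lower() in lowered]
```

The hard-coded GREEN version looks wasteful, but it shows that the tests can pass at all; the next failing test (for example, a second prohibited word) is what forces the generalization.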

11.2 Test Naming Conventions

# Python (pytest)

class TestCalculator:
    def test_add_two_positive_numbers_returns_sum(self):
        """Adding two positive numbers returns the correct sum."""
        pass

    def test_divide_by_zero_raises_error(self):
        """Dividing by zero raises an error."""
        pass

// TypeScript (Vitest)

describe('Calculator', () => {
  it('should return sum when adding two positive numbers', () => {
    // ...
  })

  it('should throw error when dividing by zero', () => {
    // ...
  })
})

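11.3 Test File Organization

Rule	Description
Keep test files next to source files	`utils.ts` → `utils.test.ts`
`__tests__` directory	Complex modules may use a dedicated directory
Naming pattern `*.test.ts` / `test_*.py`	Easy to identify and auto-discover (matches `python_files` in the pytest configuration)
One module per test file	Single responsibility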

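11.4 Team Training Plan

Stage	Duration	Content	Deliverable
TDD basics	2h	TDD concepts, the red-green-refactor cycle	Training slides
pytest in practice	2h	pytest usage, fixtures, parametrization	Sample code
Vitest in practice	2h	Vitest usage, RTL, MSW	Sample code
AI testing	2h	Labeled-set management, threshold validation	Test templates
Code review	Ongoing	Check test quality in PRs	Review checklist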

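11.5 Code Review Checklist

## PR Test Review Checklist

### Required
- [ ] New features have corresponding unit tests
- [ ] Tests cover both the happy path and error paths
- [ ] Test names are clear and describe the expected behavior
- [ ] Tests run independently and do not rely on execution order
- [ ] Coverage is not below the gate threshold

### Recommended
- [ ] Parametrized tests cover multiple inputs
- [ ] External dependencies are mocked to avoid flaky tests
- [ ] Each unit test runs in under 1 second
- [ ] No hard-coded test data (use fixtures)

### AI module specifics
- [ ] A corresponding labeled test set exists
- [ ] Output format is verified
- [ ] Threshold metrics are verified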

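12. Risks and Challenges

12.1 Risk Matrix

Risk	Likelihood	Impact	Mitigation
AI model hallucination	🔴 High	🔴 High	Complete labeled sets + manual spot checks + continuous monitoring
Insufficient test data	🟡 Medium	🔴 High	Keep collecting real data + data augmentation
Flaky E2E tests	🟡 Medium	🟡 Medium	Retries + better wait strategies
Limited team TDD experience	🟡 Medium	🟡 Medium	Training + pair programming + code review
High test maintenance cost	🟡 Medium	🟡 Medium	Test refactoring + shared fixtures
Slow CI/CD runs	🟢 Low	🟡 Medium	Parallel execution + incremental testing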

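12.2 Challenges Specific to AI Testing

Challenge	Response
Non-deterministic LLM output	Verify structure rather than exact content + sample multiple times and take the consensus
Prompt changes have wide impact	Version prompts + regression tests
High labeling cost	Cover high-risk scenarios first + active-learning sampling
Regressions after model updates	Establish baselines + automated metric comparison
Edge cases are hard to enumerate	Adversarial sample generation + continuous bad-case collection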
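
The "sample multiple times and take the consensus" strategy for non-deterministic LLM output can be sketched as a majority vote. In the sketch below, `classify` stands for any callable that returns a hashable label (for example, a wrapper around an LLM call); `sample_consensus`, `Consensus`, and the default values of `n` and `min_agreement` are illustrative assumptions.

```python
from collections import Counter
from typing import Callable, Hashable, NamedTuple, Optional


class Consensus(NamedTuple):
    label: Optional[Hashable]  # most frequent label across the samples
    agreement: float           # fraction of samples that returned it
    reached: bool              # True when agreement >= min_agreement


def sample_consensus(classify: Callable[[str], Hashable], text: str,
                     n: int = 5, min_agreement: float = 0.6) -> Consensus:
    """Call classify(text) n times and majority-vote the results."""
    if n < 1:
        raise ValueError("n must be at least 1")
    votes = Counter(classify(text) for _ in range(n))
    label, count = votes.most_common(1)[0]
    agreement = count / n
    return Consensus(label, agreement, agreement >= min_agreement)
```

A test would then assert on `result.reached` and `result.label` rather than on a single call. Keeping `reached` separate from `label` matters because `None` can itself be a legitimate label (no risk detected).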

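12.3 Common TDD Pitfalls

Pitfall	Correct practice
Writing code first and adding tests later	Strictly follow red-green-refactor
Chasing 100% coverage	Focus on meaningful tests
Testing implementation details	Test behavior and outcomes
Over-mocking	Mock only genuine external dependencies
Leaving test code unmaintained	Test code needs refactoring too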

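13. Appendix

13.1 Related Documents

Document	Description
tasks.md	Development task list
DevelopmentPlan.md	Technical architecture and development plan
FeatureSummary.md	Feature list and acceptance criteria
User_Role_Interfaces.md	User roles and interface specifications

13.2 References

Resource	Link
pytest documentation	https://docs.pytest.org/
Vitest documentation	https://vitest.dev/
Testing Library	https://testing-library.com/
Playwright documentation	https://playwright.dev/
MSW documentation	https://mswjs.io/
TDD by Example (book)	Kent Beck

13.3 Glossary

Term	Definition
TDD	Test-Driven Development
BDD	Behavior-Driven Development
SUT	System Under Test
Fixture	A fixed setup used to prepare the test environment
Mock	A simulated object that replaces a real dependency
Stub	A simplified implementation that returns preset values
Coverage	Code coverage
Regression	Regression testing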

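14. Summary

14.1 Key Conclusions

SmartAudit is highly suitable for TDD
- Starting from zero code is the best entry point
- Requirements are clear and acceptance criteria are quantified
- The tech stack has a mature testing ecosystem
Adopt a layered, hybrid TDD strategy
- Business logic: strict TDD
- AI models: labeled-set validation
- E2E: BDD + automation
Frontend test automation
- Unit tests: Vitest + Testing Library
- Component tests: RTL + MSW
- E2E tests: Playwright
- Visual regression: Percy/Chromatic
Key success factors
- Test frameworks are in place by the end of Week 0
- CI/CD gates apply from the first line of code
- AI test sets keep growing
- Team training and adherence to conventions

14.2 Next Steps

Priority	Action item	Owner	When
P0	Create the backend/tests/ directory structure	Backend Lead	Week 0
P0	Configure pytest + CI/CD	Backend Lead	Week 0
P0	Create the frontend/tests/ directory structure	Frontend Lead	Week 0
P0	Configure Vitest + Playwright	Frontend Lead	Week 0
P0	Team TDD training	Tech Lead	Week 0
P1	Set up the AI test dataset framework	AI Engineer	Week 1
P1	Write the testing guidelines for core modules	Tech Lead	Week 1

Document status: ✅ Complete. Next review: 2 weeks after development starts.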
