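# TDD Implementation Assessment and Plan

| Document Type | **Testing Strategy (TDD Implementation Plan)** |
| --- | --- |
| **Project Name** | SmartAudit (AI compliance review platform for marketing content) |
| **Version** | V1.0 |
| **Release Date** | 2026-02-02 |
| **Related Documents** | tasks.md, DevelopmentPlan.md, FeatureSummary.md |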

---

## Version History

| Version | Date | Author | Change Notes |
| --- | --- | --- | --- |
| V1.0 | 2026-02-02 | Claude | Initial draft: project diagnosis, TDD feasibility assessment, implementation plan |

---

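## Table of Contents

1. [Project Status Diagnosis](#1-project-status-diagnosis)
2. [TDD Feasibility Assessment](#2-tdd-feasibility-assessment)
3. [Test Pyramid Architecture](#3-test-pyramid-architecture)
4. [Backend Testing Strategy](#4-backend-testing-strategy)
5. [Frontend Testing Strategy](#5-frontend-testing-strategy)
6. [AI Model Testing Strategy](#6-ai-model-testing-strategy)
7. [End-to-End Testing Strategy](#7-end-to-end-testing-strategy)
8. [Implementation Roadmap](#8-implementation-roadmap)
9. [Test Coverage Targets](#9-test-coverage-targets)
10. [Toolchain Configuration](#10-toolchain-configuration)
11. [Team Conventions and Training](#11-team-conventions-and-training)
12. [Risks and Challenges](#12-risks-and-challenges)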

---

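## 1. Project Status Diagnosis

### 1.1 Codebase Status

| Dimension | Current State | Assessment |
| --- | --- | --- |
| **Source code** | No code yet; requirements phase only | ✅ Ideal entry point for TDD |
| **Documentation completeness** | 5,796 lines covering PRD/RD/technical architecture/UI | ✅ Requirements are clear |
| **Technology choices** | Decided: FastAPI + Next.js | ✅ Mature testing ecosystem |
| **Task breakdown** | 77 development tasks with clear priorities | ✅ Granularity suits TDD |
| **Acceptance criteria** | Every feature has quantified metrics | ✅ Can be turned directly into test cases |
| **CI/CD** | Planned, not yet implemented | ⚠️ Must be set up in parallel |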

### 1.2 Testing Ecosystem of the Tech Stack

| Technology | Test Framework Support | Ecosystem Maturity | Mock/Stub Support |
| --- | --- | --- | --- |
| **FastAPI** | pytest + httpx | ⭐⭐⭐⭐⭐ | Built-in TestClient |
| **Celery** | pytest-celery | ⭐⭐⭐⭐ | Eager mode supported |
| **PostgreSQL** | TestContainers | ⭐⭐⭐⭐⭐ | Containerized isolation |
| **Next.js/React** | Vitest + RTL | ⭐⭐⭐⭐⭐ | MSW interception |
| **Zustand** | Testable natively | ⭐⭐⭐⭐ | No special handling needed |
| **Socket.io** | jest-socket.io-mock | ⭐⭐⭐ | Manual mocking required |

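### 1.3 Project Complexity Analysis

```
Complexity heat map:

┌────────────────────────────────────────────────────────────────────────┐
│ Module                         │ Business complexity │ Test difficulty │
├────────────────────────────────────────────────────────────────────────┤
│ Auth & permissions (RBAC)      │        ██░░░        │      ██░░░      │
│ Brief parsing (LLM)            │        ████░        │      ████░      │
│ Rule engine                    │        ███░░        │      ██░░░      │
│ Video upload (Tus)             │        ██░░░        │      ███░░      │
│ ASR/OCR/CV pipeline            │        █████        │      █████      │  ← highest
│ Multimodal timestamp alignment │        █████        │      █████      │  ← highest
│ WebSocket progress push        │        ██░░░        │      ███░░      │
│ Review decision workflow       │        ███░░        │      ██░░░      │
│ Data dashboard                 │        ██░░░        │      ██░░░      │
│ Mobile H5                      │        ██░░░        │      ███░░      │
└────────────────────────────────────────────────────────────────────────┘
```

**Diagnosis**:
- ✅ **Green-light project**: starting from zero code makes this the ideal time to adopt TDD
- ✅ The tech stack's testing ecosystem is mature, with no obvious blockers
- ⚠️ The AI pipeline (ASR/OCR/CV) needs a dedicated testing strategy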

---

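## 2. TDD Feasibility Assessment

### 2.1 Overall Score

| Dimension | Score | Notes |
| --- | --- | --- |
| **Requirement clarity** | ⭐⭐⭐⭐⭐ | PRD/RD are detailed; user stories are complete |
| **Feature granularity** | ⭐⭐⭐⭐⭐ | 77 tasks with clear boundaries |
| **Technical testability** | ⭐⭐⭐⭐ | Mainstream frameworks, mature ecosystem |
| **Team size** | ⭐⭐⭐⭐ | Lean team of 8, efficient communication |
| **Schedule slack** | ⭐⭐⭐⭐ | 11-week schedule, not extremely compressed |
| **Quantified acceptance criteria** | ⭐⭐⭐⭐⭐ | Every feature has explicit KPIs |

**Overall assessment: 🟢 Highly feasible (95/100)**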

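### 2.2 TDD Suitability Analysis

| Module Type | TDD Suitability | Recommended Strategy |
| --- | --- | --- |
| **Pure business logic** | ⭐⭐⭐⭐⭐ | Strict TDD (tests first) |
| **API endpoints** | ⭐⭐⭐⭐⭐ | Contract tests + TDD |
| **Data models** | ⭐⭐⭐⭐ | TDD + schema validation |
| **Rule engine** | ⭐⭐⭐⭐⭐ | Table-driven tests + TDD |
| **AI model calls** | ⭐⭐⭐ | Hybrid mode (input/output validation) |
| **AI prompts** | ⭐⭐ | Validation against a labeled test set |
| **UI components** | ⭐⭐⭐⭐ | Component-level TDD |
| **E2E flows** | ⭐⭐ | BDD + E2E tests |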

### 2.3 Choosing a TDD Implementation Model

The recommended model is **"layered hybrid TDD"**:

```
┌────────────────────────────────────────────────────────────────────┐
│                      Layered Hybrid TDD Model                      │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │  Layer 1: Strict TDD (100% coverage)                       │   │
│   │  • Utility functions (utils)                               │   │
│   │  • Data validators (validators)                            │   │
│   │  • Rule engine                                             │   │
│   │  • Business logic services (services)                      │   │
│   └────────────────────────────────────────────────────────────┘   │
│                                 ↓                                  │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │  Layer 2: Contract-First                                   │   │
│   │  • API endpoints → define OpenAPI first                    │   │
│   │  • Data models → define schemas first                      │   │
│   │  • WebSocket messages → define message format first        │   │
│   └────────────────────────────────────────────────────────────┘   │
│                                 ↓                                  │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │  Layer 3: Labeled-set validation (AI models)               │   │
│   │  • ASR/OCR/CV → labeled test set + threshold checks        │   │
│   │  • LLM prompts → few-shot examples + periodic regression   │   │
│   │  • Vector retrieval → recall/precision evaluation          │   │
│   └────────────────────────────────────────────────────────────┘   │
│                                 ↓                                  │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │  Layer 4: Behavior-driven (BDD + E2E)                      │   │
│   │  • User stories → Playwright E2E                           │   │
│   │  • Critical paths → smoke tests                            │   │
│   └────────────────────────────────────────────────────────────┘   │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘
```

---

## 3. Test Pyramid Architecture

### 3.1 Distribution Across Test Layers

```
                        ┌──────────────┐
                        │  E2E tests   │  5%
                        │ (Playwright) │
                        └──────┬───────┘
                               │
                    ┌──────────┴───────────┐
                    │  Integration tests   │  20%
                    │ (API + DB + external)│
                    └──────────┬───────────┘
                               │
       ┌───────────────────────┴───────────────────────┐
       │                  Unit tests                   │  75%
       │ (functions, classes, components, pure logic)  │
       └───────────────────────────────────────────────┘
```

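### 3.2 Responsibilities by Layer

| Layer | Share | Scope | Run Frequency | Time Budget |
| --- | --- | --- | --- | --- |
| **Unit tests** | 75% | Functions/classes/components/pure logic | Every commit | < 30 s |
| **Integration tests** | 20% | API/DB/message queue/external services | Every PR | < 5 min |
| **E2E tests** | 5% | Complete user flows | Daily / before release | < 15 min |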
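
The shares in the table can be checked mechanically once tests exist. The following is a minimal sketch, not part of the project: the function names and the 5-percentage-point `tolerance` are assumptions chosen for illustration, and the per-layer counts would come from whatever the team uses to tally tests (for example, pytest markers).

```python
from typing import Dict

# Target shares taken from the table above (unit / integration / E2E).
TARGET_SHARES: Dict[str, float] = {"unit": 0.75, "integration": 0.20, "e2e": 0.05}


def layer_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw test counts per layer into fractions of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tests counted")
    return {layer: n / total for layer, n in counts.items()}


def pyramid_deviations(counts: Dict[str, int], tolerance: float = 0.05) -> Dict[str, float]:
    """Return layers whose share differs from the target by more than `tolerance`.

    The tolerance is a hypothetical value, not one defined by this plan.
    """
    shares = layer_shares(counts)
    deviations = {}
    for layer, target in TARGET_SHARES.items():
        diff = shares.get(layer, 0.0) - target
        if abs(diff) > tolerance:
            deviations[layer] = diff
    return deviations
```

A suite with 150 unit, 40 integration, and 10 E2E tests matches the targets and yields no deviations; a suite dominated by integration tests would be reported as unit-light.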

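### 3.3 SmartAudit Test Layers in Detail

```
┌────────────────────────────────────────────────────────────────────────────┐
│                           SmartAudit Test Layers                           │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                            │
│  [E2E test layer] Playwright                                               │
│  ├─ Creator flow: upload video → wait for review → view result → appeal    │
│  ├─ Agency flow: configure Brief → review video → reject/approve           │
│  └─ Brand flow: view dashboard → configure rules → approve forced passes   │
│                                                                            │
│  [Integration test layer] pytest + TestContainers                          │
│  ├─ API endpoint tests (httpx TestClient)                                  │
│  ├─ Database integration tests (PostgreSQL + pgvector)                     │
│  ├─ Redis cache tests                                                      │
│  ├─ Celery task tests (eager mode)                                         │
│  ├─ File upload tests (OSS mock)                                           │
│  └─ WebSocket push tests                                                   │
│                                                                            │
│  [Unit test layer]                                                         │
│  │                                                                         │
│  │  Backend (pytest)                   Frontend (Vitest)                   │
│  │  ├─ Utility functions               ├─ Utility functions                │
│  │  ├─ Data validators                 ├─ Formatting functions             │
│  │  ├─ Rule engine logic               ├─ State management (Zustand)       │
│  │  ├─ Timestamp alignment algorithm   ├─ React Hooks                      │
│  │  ├─ Brief parsing logic             ├─ UI components                    │
│  │  ├─ Business service methods        └─ Form validation logic            │
│  │  └─ Pydantic models                                                     │
│  │                                                                         │
│  │  AI models (labeled test set)                                           │
│  │  ├─ ASR output format validation                                        │
│  │  ├─ OCR output format validation                                        │
│  │  ├─ CV detection result validation                                      │
│  │  ├─ LLM output parsing validation                                       │
│  │  └─ Vector similarity computation validation                            │
│  │                                                                         │
└────────────────────────────────────────────────────────────────────────────┘
```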

---

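## 4. Backend Testing Strategy

### 4.1 Test Framework Selection

| Purpose | Tool | Notes |
| --- | --- | --- |
| **Test framework** | pytest | De facto Python standard, rich plugin ecosystem |
| **Async tests** | pytest-asyncio | Async support for FastAPI |
| **Coverage** | pytest-cov | Coverage reports |
| **Mocking** | unittest.mock / pytest-mock | Dependency mocking |
| **Fixtures** | pytest fixtures | Test data management |
| **Parametrization** | @pytest.mark.parametrize | Table-driven tests |
| **Containerized tests** | TestContainers | DB/Redis/MQ isolation |
| **API tests** | httpx + TestClient | TestClient ships with FastAPI and is built on httpx |
| **Celery tests** | celery.contrib.testing | Task tests |
| **Snapshot tests** | syrupy | JSON output validation |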

### 4.2 Directory Structure

```
backend/
├── tests/
│   ├── __init__.py
│   ├── conftest.py              # Global fixtures
│   │
│   ├── unit/                    # Unit tests (75%)
│   │   ├── __init__.py
│   │   ├── test_validators.py   # Data validators
│   │   ├── test_utils.py        # Utility functions
│   │   ├── test_rule_engine.py  # Rule engine
│   │   ├── test_timestamp.py    # Timestamp alignment
│   │   ├── test_brief_parser.py # Brief parsing logic
│   │   └── services/
│   │       ├── test_auth.py
│   │       ├── test_brief.py
│   │       ├── test_video.py
│   │       └── test_report.py
│   │
│   ├── integration/             # Integration tests (20%)
│   │   ├── __init__.py
│   │   ├── conftest.py          # DB/Redis fixtures
│   │   ├── test_api_auth.py     # Auth API
│   │   ├── test_api_brief.py    # Brief API
│   │   ├── test_api_video.py    # Video API
│   │   ├── test_api_report.py   # Report API
│   │   ├── test_db_models.py    # Database models
│   │   ├── test_celery_tasks.py # Async tasks
│   │   └── test_websocket.py    # WebSocket
│   │
│   ├── ai/                      # AI model tests
│   │   ├── __init__.py
│   │   ├── conftest.py          # Test set loading
│   │   ├── test_asr.py          # ASR output validation
│   │   ├── test_ocr.py          # OCR output validation
│   │   ├── test_cv.py           # CV detection validation
│   │   ├── test_llm.py          # LLM output parsing
│   │   └── test_embedding.py    # Embedding generation validation
│   │
│   ├── e2e/                     # End-to-end tests (5%)
│   │   ├── __init__.py
│   │   └── test_workflows.py    # Complete workflows
│   │
│   └── fixtures/                # Test data
│       ├── briefs/              # Test Brief files
│       ├── videos/              # Test video files
│       ├── rules/               # Test rule sets
│       └── snapshots/           # Snapshot data
│
├── pytest.ini                   # pytest configuration
└── pyproject.toml               # Project configuration
```

### 4.3 Core Test Case Design

#### 4.3.1 Rule Engine Tests (Table-Driven)

```python
# tests/unit/test_rule_engine.py

import pytest
from app.services.rule_engine import RuleEngine

class TestProhibitedWordDetection:
    """Prohibited-word detection tests (table-driven)."""

    # Sample texts stay in Chinese because the audited content is Chinese.
    @pytest.mark.parametrize("text,expected_violations,context", [
        # Prohibited claims in an advertising context: must be flagged
        ("这是全网销量第一的产品", ["全网第一"], "advertisement"),
        ("我们是行业领导者", ["行业领导者"], "advertisement"),
        ("史上最低价促销", ["史上最低价"], "advertisement"),

        # Similar superlatives in everyday conversation: must not be flagged
        ("今天是我最开心的一天", [], "daily_conversation"),
        ("这是我第一次来这里", [], "daily_conversation"),

        # Edge cases: empty text and a neutral product description
        ("", [], "advertisement"),
        ("这是一个普通的产品介绍", [], "advertisement"),

        # Several prohibited claims in one sentence
        ("全网销量第一，史上最低价", ["全网第一", "史上最低价"], "advertisement"),
    ])
    def test_prohibited_word_detection(self, text, expected_violations, context):
        """Verify that prohibited-word detection is accurate."""
        engine = RuleEngine()
        result = engine.detect_prohibited_words(text, context=context)

        assert set(result.violations) == set(expected_violations)
```
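
In the TDD cycle, the table above is the "red" step. One possible "green" step is sketched below, purely as an illustration: the `AD_RULES` table, its regular expressions, and the `DetectionResult` type are assumptions and are not taken from the project's design documents. In particular, the sketch trusts the caller to supply `context`; how the real system classifies context (for example with an LLM) is outside its scope.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Pattern

# Hypothetical rule table: canonical violation label -> pattern that triggers it.
# "全网.{0,4}第一" lets a few characters (e.g. "销量") sit between the two parts.
AD_RULES: Dict[str, Pattern[str]] = {
    "全网第一": re.compile(r"全网.{0,4}第一"),
    "行业领导者": re.compile(r"行业领导者"),
    "史上最低价": re.compile(r"史上最低价"),
}


@dataclass
class DetectionResult:
    """Hypothetical result type exposing the `violations` attribute the test reads."""
    violations: List[str] = field(default_factory=list)


class RuleEngine:
    def detect_prohibited_words(self, text: str, context: str) -> DetectionResult:
        # Rules only apply to advertising copy; other contexts pass through.
        if context != "advertisement":
            return DetectionResult()
        hits = [label for label, pattern in AD_RULES.items() if pattern.search(text)]
        return DetectionResult(violations=hits)
```

Reporting a canonical label rather than the matched substring is what allows "全网销量第一" to be reported as "全网第一", which is the behavior the table expects.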

#### 4.3.2 Timestamp Alignment Algorithm Tests

```python
# tests/unit/test_timestamp.py

import pytest
from app.utils.timestamp_align import TimestampAligner

class TestMultiModalAlignment:
    """Multimodal timestamp alignment tests."""

    @pytest.fixture
    def aligner(self):
        return TimestampAligner(tolerance_ms=500)

    @pytest.mark.parametrize("asr_ts,ocr_ts,cv_ts,expected_merged", [
        # Perfectly aligned
        (1000, 1000, 1000, 1000),
        # Aligned within tolerance
        (1000, 1200, 1100, 1100),  # median is used
        # Outside tolerance
        (1000, 2000, 3000, None),  # not merged
    ])
    def test_timestamp_alignment(self, aligner, asr_ts, ocr_ts, cv_ts, expected_merged):
        """Verify the timestamp alignment logic."""
        events = [
            {"source": "asr", "timestamp_ms": asr_ts, "content": "test"},
            {"source": "ocr", "timestamp_ms": ocr_ts, "content": "test"},
            {"source": "cv", "timestamp_ms": cv_ts, "content": "logo_detected"},
        ]

        merged = aligner.merge_events(events)

        # Compare against None explicitly so a merged timestamp of 0 is not
        # mistaken for "not merged".
        if expected_merged is not None:
            assert len(merged) == 1
            assert merged[0]["timestamp_ms"] == expected_merged
        else:
            assert len(merged) == 3  # left unmerged

    def test_duration_calculation_accuracy(self, aligner):
        """Verify that the duration error is ≤ 0.5 s."""
        events = [
            {"timestamp_ms": 0, "type": "product_appear"},
            {"timestamp_ms": 5500, "type": "product_disappear"},
        ]

        duration = aligner.calculate_duration(events)

        # Error must be ≤ 500 ms
        assert abs(duration - 5500) <= 500
```
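
A merge routine that would satisfy the parametrized cases above can be sketched as follows. This is an assumption about how `merge_events` might work, not the project's algorithm: it groups events greedily by distance from the first event of the current cluster and uses the median timestamp for merged clusters. The `sources` and `contents` keys on merged events are hypothetical.

```python
from statistics import median
from typing import Any, Dict, List

Event = Dict[str, Any]


class TimestampAligner:
    def __init__(self, tolerance_ms: int = 500) -> None:
        self.tolerance_ms = tolerance_ms

    def merge_events(self, events: List[Event]) -> List[Event]:
        """Greedily cluster events whose timestamps lie within the tolerance
        of the cluster's earliest event, then collapse each cluster."""
        clusters: List[List[Event]] = []
        for event in sorted(events, key=lambda e: e["timestamp_ms"]):
            starts_within_tolerance = (
                clusters
                and event["timestamp_ms"] - clusters[-1][0]["timestamp_ms"] <= self.tolerance_ms
            )
            if starts_within_tolerance:
                clusters[-1].append(event)
            else:
                clusters.append([event])

        merged: List[Event] = []
        for cluster in clusters:
            if len(cluster) == 1:
                merged.append(cluster[0])  # nothing to merge, keep as-is
            else:
                merged.append({
                    "timestamp_ms": median(e["timestamp_ms"] for e in cluster),
                    "sources": [e["source"] for e in cluster],
                    "contents": [e["content"] for e in cluster],
                })
        return merged
```

For clusters with an even number of events, `statistics.median` returns the mean of the two middle values, so the merged timestamp may be a float; whether that is acceptable would need to be decided alongside the real schema.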

#### 4.3.3 Brief Parsing Tests

```python
# tests/unit/test_brief_parser.py

import pytest
from app.services.brief_parser import BriefParser

class TestBriefParsing:
    """Brief parsing logic tests."""

    @pytest.fixture
    def parser(self):
        return BriefParser()

    def test_extract_selling_points(self, parser):
        """Verify selling-point extraction."""
        # Brief text stays in Chinese: a "core selling points" list with three items.
        brief_content = """
        产品核心卖点：
        1. 24小时持妆
        2. 天然成分
        3. 敏感肌适用
        """

        result = parser.extract_selling_points(brief_content)

        assert "24小时持妆" in result.selling_points
        assert "天然成分" in result.selling_points
        assert "敏感肌适用" in result.selling_points

    def test_extract_prohibited_words(self, parser):
        """Verify prohibited-word extraction."""
        # A "words that must not be used" list with three items.
        brief_content = """
        禁止使用的词汇：
        - 药用
        - 治疗
        - 根治
        """

        result = parser.extract_prohibited_words(brief_content)

        assert set(result.prohibited_words) == {"药用", "治疗", "根治"}

    def test_conflict_detection(self, parser):
        """Verify detection of conflicts between the Brief and platform rules."""
        brief_rules = {"allowed_words": ["最佳效果"]}
        platform_rules = {"prohibited_words": ["最佳"]}

        conflicts = parser.detect_conflicts(brief_rules, platform_rules)

        assert len(conflicts) == 1
        assert "最佳效果" in conflicts[0]["conflicting_term"]
```
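
The conflict test above implies a simple rule: a Brief-approved phrase conflicts with the platform when it contains a platform-prohibited term. The sketch below shows that check as a standalone function; it is an assumption about the intended behavior, and the `platform_rule` key in the returned dictionaries is hypothetical (only `conflicting_term` appears in the test).

```python
from typing import Dict, List


def detect_conflicts(
    brief_rules: Dict[str, List[str]],
    platform_rules: Dict[str, List[str]],
) -> List[Dict[str, str]]:
    """Report Brief-allowed phrases that contain a platform-prohibited term."""
    conflicts: List[Dict[str, str]] = []
    for allowed in brief_rules.get("allowed_words", []):
        for banned in platform_rules.get("prohibited_words", []):
            # Skip empty strings, which would match every phrase.
            if banned and banned in allowed:
                conflicts.append({
                    "conflicting_term": allowed,
                    "platform_rule": banned,  # hypothetical key
                })
    return conflicts
```

With this design, a phrase that contains two prohibited terms produces two entries, which keeps each platform rule traceable in the report.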

### 4.4 Integration Testing Strategy

#### 4.4.1 Database Integration Tests

```python
# tests/integration/conftest.py

import pytest
from testcontainers.postgres import PostgresContainer
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from app.db.base import Base  # assumed location of the project's declarative Base

@pytest.fixture(scope="session")
def postgres_container():
    """Start a PostgreSQL test container."""
    with PostgresContainer("postgres:15-alpine") as postgres:
        yield postgres

@pytest.fixture(scope="function")
def db_session(postgres_container):
    """Provide an independent database session for each test function."""
    engine = create_engine(postgres_container.get_connection_url())
    Session = sessionmaker(bind=engine)
    session = Session()

    # Create tables (a no-op when they already exist)
    Base.metadata.create_all(engine)

    yield session

    # Clean up: discard uncommitted changes and release connections
    session.rollback()
    session.close()
    engine.dispose()
```

#### 4.4.2 API Integration Tests

```python
# tests/integration/test_api_brief.py
# Assumes `asyncio_mode = auto` in the pytest configuration; in strict mode,
# use @pytest_asyncio.fixture and @pytest.mark.asyncio instead.

import pytest
from httpx import ASGITransport, AsyncClient
from app.main import app

class TestBriefAPI:
    """Brief API integration tests."""

    @pytest.fixture
    async def client(self):
        # Recent httpx versions no longer accept `app=`; route requests
        # to the ASGI app through an explicit transport.
        transport = ASGITransport(app=app)
        async with AsyncClient(transport=transport, base_url="http://test") as ac:
            yield ac

    @pytest.fixture
    async def auth_headers(self, client):
        """Obtain authentication headers."""
        response = await client.post("/auth/login", json={
            "username": "test_agency",
            "password": "password"
        })
        token = response.json()["access_token"]
        return {"Authorization": f"Bearer {token}"}

    async def test_upload_brief_pdf(self, client, auth_headers, tmp_path):
        """Test uploading a Brief PDF."""
        # Prepare the test file
        test_pdf = tmp_path / "test_brief.pdf"
        test_pdf.write_bytes(b"%PDF-1.4 test content")

        with open(test_pdf, "rb") as f:
            response = await client.post(
                "/api/v1/briefs/upload",
                files={"file": ("test.pdf", f, "application/pdf")},
                headers=auth_headers
            )

        assert response.status_code == 202
        assert "task_id" in response.json()

    async def test_get_brief_parsing_result(self, client, auth_headers):
        """Test fetching the Brief parsing result."""
        # Assumes a Brief with this ID has already been parsed
        brief_id = "test-brief-id"

        response = await client.get(
            f"/api/v1/briefs/{brief_id}",
            headers=auth_headers
        )

        assert response.status_code == 200
        result = response.json()
        assert "selling_points" in result
        assert "prohibited_words" in result
```

### 4.5 Celery Async Task Tests

```python
# tests/integration/test_celery_tasks.py

import pytest
from unittest.mock import patch, MagicMock
from app.tasks.video_auditing import audit_video_task

class TestVideoAuditingTask:
    """Video auditing async task tests."""

    @pytest.fixture
    def mock_ai_services(self):
        """Mock all AI services."""
        with patch("app.tasks.video_auditing.ASRService") as mock_asr, \
             patch("app.tasks.video_auditing.OCRService") as mock_ocr, \
             patch("app.tasks.video_auditing.CVService") as mock_cv:

            mock_asr.return_value.transcribe.return_value = {
                "text": "这是测试文本",
                "timestamps": [{"start": 0, "end": 1000, "text": "这是测试文本"}]
            }

            mock_ocr.return_value.extract.return_value = {
                "frames": [{"timestamp": 500, "text": "字幕内容"}]
            }

            mock_cv.return_value.detect.return_value = {
                "logos": [],
                "objects": [{"timestamp": 500, "object": "product"}]
            }

            yield {"asr": mock_asr, "ocr": mock_ocr, "cv": mock_cv}

    def test_video_audit_task_success(self, mock_ai_services, db_session):
        """Test that the video auditing task completes successfully."""
        task_id = "test-task-id"
        video_url = "https://test.oss.com/test.mp4"
        brief_id = "test-brief-id"

        # .apply() runs the task synchronously in the current process,
        # so no worker or broker is needed.
        result = audit_video_task.apply(
            args=[task_id, video_url, brief_id]
        ).get()

        assert result["status"] == "completed"
        assert "report" in result
        assert "risk_items" in result["report"]

    def test_video_audit_task_with_violations(self, mock_ai_services, db_session):
        """Test handling when a violation is detected."""
        # Override the mock so the transcript contains a prohibited claim
        mock_ai_services["asr"].return_value.transcribe.return_value = {
            "text": "这是全网销量第一的产品",
            "timestamps": [{"start": 0, "end": 1000, "text": "这是全网销量第一的产品"}]
        }

        result = audit_video_task.apply(
            args=["test-task", "url", "brief-id"]
        ).get()

        assert result["status"] == "completed"
        assert len(result["report"]["risk_items"]) > 0
        assert any(
            item["type"] == "prohibited_word"
            for item in result["report"]["risk_items"]
        )
```

---

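## 5. Frontend Testing Strategy

### 5.1 Test Framework Selection

| Purpose | Tool | Notes |
| --- | --- | --- |
| **Unit test framework** | Vitest | Vite-native, very fast |
| **Component tests** | @testing-library/react | Driven by user behavior |
| **DOM assertions** | @testing-library/jest-dom | Extended matchers |
| **Mock service** | MSW (Mock Service Worker) | API interception |
| **E2E tests** | Playwright | Cross-browser |
| **Visual regression** | Percy / Chromatic | Screenshot comparison |
| **Coverage** | @vitest/coverage-v8 | Coverage reports |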

### 5.2 Directory Structure

```
frontend/
├── src/
│   ├── components/
│   │   ├── Button/
│   │   │   ├── Button.tsx
│   │   │   ├── Button.test.tsx        # Component tests
│   │   │   └── Button.stories.tsx     # Storybook (optional)
│   │   └── ...
│   │
│   ├── hooks/
│   │   ├── useAuth.ts
│   │   ├── useAuth.test.ts            # Hook tests
│   │   └── ...
│   │
│   ├── services/
│   │   ├── api.ts
│   │   ├── api.test.ts                # Service tests
│   │   └── ...
│   │
│   ├── store/
│   │   ├── auth.ts
│   │   ├── auth.test.ts               # State tests
│   │   └── ...
│   │
│   └── lib/
│       ├── utils.ts
│       ├── utils.test.ts              # Utility function tests
│       └── ...
│
├── tests/
│   ├── setup.ts                       # Global test setup
│   ├── mocks/
│   │   ├── handlers.ts                # MSW handlers
│   │   └── server.ts                  # MSW server
│   │
│   ├── integration/                   # Integration tests
│   │   ├── BriefUpload.test.tsx
│   │   ├── VideoUpload.test.tsx
│   │   └── ReviewDashboard.test.tsx
│   │
│   └── e2e/                           # Playwright E2E
│       ├── creator-flow.spec.ts
│       ├── agency-flow.spec.ts
│       └── brand-flow.spec.ts
│
├── vitest.config.ts                   # Vitest configuration
├── playwright.config.ts               # Playwright configuration
└── package.json
```

### 5.3 Unit Test Examples

#### 5.3.1 Utility Function Tests

```typescript
// src/lib/utils.test.ts

import { describe, it, expect } from 'vitest'
import {
  formatDuration,
  formatTimestamp,
  truncateText,
  validateVideoFile
} from './utils'

describe('formatDuration', () => {
  it('formats seconds as mm:ss', () => {
    expect(formatDuration(65)).toBe('01:05')
    expect(formatDuration(3661)).toBe('61:01')
    expect(formatDuration(0)).toBe('00:00')
  })

  it('clamps negative values to 00:00', () => {
    expect(formatDuration(-10)).toBe('00:00')
  })
})

describe('formatTimestamp', () => {
  it('formats milliseconds as HH:MM:SS.mmm', () => {
    expect(formatTimestamp(1500)).toBe('00:00:01.500')
    expect(formatTimestamp(3661500)).toBe('01:01:01.500')
  })
})

describe('validateVideoFile', () => {
  it('accepts a valid MP4 file', () => {
    const file = new File([''], 'test.mp4', { type: 'video/mp4' })
    Object.defineProperty(file, 'size', { value: 50 * 1024 * 1024 }) // 50MB

    const result = validateVideoFile(file)

    expect(result.valid).toBe(true)
  })

  it('rejects files larger than 100MB', () => {
    const file = new File([''], 'test.mp4', { type: 'video/mp4' })
    Object.defineProperty(file, 'size', { value: 150 * 1024 * 1024 }) // 150MB

    const result = validateVideoFile(file)

    expect(result.valid).toBe(false)
    expect(result.error).toContain('100MB')
  })

  it('rejects non-video formats', () => {
    const file = new File([''], 'test.pdf', { type: 'application/pdf' })

    const result = validateVideoFile(file)

    expect(result.valid).toBe(false)
    // The UI error message is in Chinese; '格式' means "format"
    expect(result.error).toContain('格式')
  })
})
```

#### 5.3.2 React Hook Tests

```typescript
// src/hooks/useAuth.test.ts

import { renderHook, act, waitFor } from '@testing-library/react'
import { describe, it, expect, beforeEach, vi } from 'vitest'
import { useAuth } from './useAuth'
// tests/ sits next to src/ under frontend/, hence two levels up
import { server } from '../../tests/mocks/server'
// MSW v1 API (`rest`); MSW v2 replaces it with `http` and `HttpResponse`
import { rest } from 'msw'

describe('useAuth', () => {
  beforeEach(() => {
    localStorage.clear()
  })

  it('starts in the logged-out state', () => {
    const { result } = renderHook(() => useAuth())

    expect(result.current.isAuthenticated).toBe(false)
    expect(result.current.user).toBeNull()
  })

  it('updates state after a successful login', async () => {
    const { result } = renderHook(() => useAuth())

    await act(async () => {
      await result.current.login('test@example.com', 'password')
    })

    await waitFor(() => {
      expect(result.current.isAuthenticated).toBe(true)
      expect(result.current.user?.email).toBe('test@example.com')
    })
  })

  it('throws when login fails', async () => {
    // Simulate an API error ('密码错误' means "wrong password")
    server.use(
      rest.post('/api/auth/login', (req, res, ctx) => {
        return res(ctx.status(401), ctx.json({ error: '密码错误' }))
      })
    )

    const { result } = renderHook(() => useAuth())

    await expect(
      act(async () => {
        await result.current.login('test@example.com', 'wrong')
      })
    ).rejects.toThrow('密码错误')
  })

  it('clears state after logout', async () => {
    const { result } = renderHook(() => useAuth())

    // Log in first
    await act(async () => {
      await result.current.login('test@example.com', 'password')
    })

    // Then log out
    act(() => {
      result.current.logout()
    })

    expect(result.current.isAuthenticated).toBe(false)
    expect(result.current.user).toBeNull()
  })
})
```

#### 5.3.3 Zustand Store Tests

```typescript
// src/store/upload.test.ts

import { describe, it, expect, beforeEach } from 'vitest'
import { useUploadStore } from './upload'

describe('useUploadStore', () => {
  beforeEach(() => {
    // Reset the store between tests
    useUploadStore.setState({
      files: [],
      uploadProgress: {},
      isUploading: false,
    })
  })

  it('adds a file to the upload queue', () => {
    const file = new File(['test'], 'test.mp4', { type: 'video/mp4' })

    useUploadStore.getState().addFile(file)

    expect(useUploadStore.getState().files).toHaveLength(1)
    expect(useUploadStore.getState().files[0].name).toBe('test.mp4')
  })

  it('updates upload progress', () => {
    const fileId = 'file-123'

    useUploadStore.getState().updateProgress(fileId, 50)

    expect(useUploadStore.getState().uploadProgress[fileId]).toBe(50)
  })

  it('removes a completed file', () => {
    const file = new File(['test'], 'test.mp4', { type: 'video/mp4' })
    useUploadStore.getState().addFile(file)

    const fileId = useUploadStore.getState().files[0].id
    useUploadStore.getState().removeFile(fileId)

    expect(useUploadStore.getState().files).toHaveLength(0)
  })
})
```

#### 5.3.4 Component Tests

```typescript
// src/components/video/VideoUpload.test.tsx

import { render, screen, fireEvent, waitFor } from '@testing-library/react'
import { describe, it, expect, vi } from 'vitest'
import { VideoUpload } from './VideoUpload'

// The UI copy is Chinese, so text matchers below keep the Chinese strings.
describe('VideoUpload', () => {
  it('renders the upload area', () => {
    render(<VideoUpload onUpload={vi.fn()} />)

    // "Drag a video here" / "Supports MP4 and MOV"
    expect(screen.getByText(/拖拽视频到此处/)).toBeInTheDocument()
    expect(screen.getByText(/支持 MP4、MOV 格式/)).toBeInTheDocument()
  })

  it('triggers an upload when a file is dropped', async () => {
    const onUpload = vi.fn()
    render(<VideoUpload onUpload={onUpload} />)

    const dropzone = screen.getByTestId('dropzone')
    const file = new File(['video content'], 'test.mp4', { type: 'video/mp4' })

    fireEvent.drop(dropzone, {
      dataTransfer: { files: [file] }
    })

    await waitFor(() => {
      expect(onUpload).toHaveBeenCalledWith(file)
    })
  })

  it('rejects oversized files and shows an error', async () => {
    const onUpload = vi.fn()
    render(<VideoUpload onUpload={onUpload} maxSize={100 * 1024 * 1024} />)

    const file = new File([''], 'large.mp4', { type: 'video/mp4' })
    Object.defineProperty(file, 'size', { value: 150 * 1024 * 1024 })

    const dropzone = screen.getByTestId('dropzone')
    fireEvent.drop(dropzone, {
      dataTransfer: { files: [file] }
    })

    await waitFor(() => {
      // "File size must not exceed 100MB"
      expect(screen.getByText(/文件大小不能超过 100MB/)).toBeInTheDocument()
      expect(onUpload).not.toHaveBeenCalled()
    })
  })

  it('shows upload progress', async () => {
    render(<VideoUpload onUpload={vi.fn()} initialProgress={45} />)

    expect(screen.getByRole('progressbar')).toHaveAttribute('aria-valuenow', '45')
    expect(screen.getByText('45%')).toBeInTheDocument()
  })
})
```

### 5.4 MSW Mock Service Configuration

```typescript
// tests/mocks/handlers.ts
// Written against the MSW v1 API (`rest`); MSW v2 uses `http` and `HttpResponse`.

import { rest } from 'msw'

export const handlers = [
  // Auth API
  rest.post('/api/auth/login', async (req, res, ctx) => {
    const { email, password } = await req.json()

    if (password === 'password') {
      return res(ctx.json({
        access_token: 'mock-token',
        user: { id: '1', email, role: 'agency' }
      }))
    }

    // '密码错误' means "wrong password"
    return res(ctx.status(401), ctx.json({ error: '密码错误' }))
  }),

  // Brief API
  rest.get('/api/v1/briefs/:id', (req, res, ctx) => {
    return res(ctx.json({
      id: req.params.id,
      selling_points: ['24小时持妆', '天然成分'],
      prohibited_words: ['药用', '治疗'],
      status: 'parsed'
    }))
  }),

  // Video upload API
  rest.post('/api/v1/videos/upload', async (req, res, ctx) => {
    return res(ctx.json({
      task_id: 'mock-task-id',
      status: 'processing'
    }))
  }),

  // WebSocket mocking
  // Note: MSW v1 does not intercept WebSocket traffic, so it needs a separate mock
  // (newer MSW 2.x releases add a `ws` API for this).
]
```

```typescript
// tests/mocks/server.ts

import { setupServer } from 'msw/node'
import { handlers } from './handlers'

export const server = setupServer(...handlers)
```

```typescript
// tests/setup.ts

import { beforeAll, afterEach, afterAll } from 'vitest'
import { server } from './mocks/server'
import '@testing-library/jest-dom'

beforeAll(() => server.listen())
afterEach(() => server.resetHandlers())
afterAll(() => server.close())
```

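### 5.5 Summary of Frontend Test Automation

| Test Type | Tool | Scope | When Run |
| --- | --- | --- | --- |
| **Unit tests** | Vitest | Utility functions, Hooks, Store | Every commit |
| **Component tests** | Vitest + RTL | UI component behavior | Every commit |
| **Integration tests** | Vitest + MSW | Page-level interactions | Every PR |
| **E2E tests** | Playwright | Complete user flows | Daily / before release |
| **Visual regression** | Percy/Chromatic | Changes in UI appearance | Every PR |
| **Compatibility tests** | BrowserStack | Cross-browser/device | Before release |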

---

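## 6. AI Model Testing Strategy

### 6.1 What Makes AI Testing Different

AI model testing differs fundamentally from traditional unit testing:

| Dimension | Traditional Testing | AI Model Testing |
| --- | --- | --- |
| **Output determinism** | Deterministic output | Probabilistic output |
| **Verification method** | Exact match | Threshold checks |
| **Test data** | Small, hand-crafted | Large labeled datasets |
| **Regression detection** | Failed assertions | Metric degradation |
| **Maintenance cost** | Low | Requires continuous updates |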
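
Because regressions show up as metric degradation rather than failed assertions, a regression check compares a candidate run against a stored baseline. The sketch below is illustrative only: the function name, the metric dictionaries, and the `max_drop` of one percentage point are assumptions, and it treats every metric as "higher is better" (an error rate such as WER would need the comparison inverted).

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    max_drop: float = 0.01,
) -> List[str]:
    """List metrics that dropped by more than `max_drop` versus the baseline.

    Assumes higher values are better for every metric passed in.
    """
    regressions: List[str] = []
    for name, base_value in baseline.items():
        new_value = candidate.get(name)
        if new_value is None:
            regressions.append(f"{name}: missing from candidate run")
        elif base_value - new_value > max_drop:
            regressions.append(f"{name}: {base_value:.3f} -> {new_value:.3f}")
    return regressions
```

In a pytest suite this would typically end with `assert not find_regressions(baseline, candidate)`, so the failure message lists exactly which metrics degraded.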

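### 6.2 AI Test Layers

```
┌──────────────────────────────────────────────────────────────────┐
│                       AI Model Test Layers                       │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   [Layer 1: Interface contract tests]                            │
│   • Input format validation                                      │
│   • Output structure validation                                  │
│   • Error handling validation                                    │
│   → pytest + JSON Schema validation                              │
│                                                                  │
│   [Layer 2: Functional correctness tests]                        │
│   • Validation against the labeled test set                      │
│   • Edge-case coverage                                           │
│   • Threshold compliance checks                                  │
│   → labeled data + metric computation                            │
│                                                                  │
│   [Layer 3: Regression tests]                                    │
│   • Metric comparison after model updates                        │
│   • Behavior verification after prompt changes                   │
│   • Continuous coverage of newly added cases                     │
│   → MLflow + version comparison                                  │
│                                                                  │
│   [Layer 4: Adversarial tests]                                   │
│   • Edge inputs (long text, special characters, empty input)     │
│   • Adversarial samples (deliberate evasion of detection)        │
│   • Stress tests (high concurrency, large files)                 │
│   → fuzzing + hand-designed cases                                │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```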

### 6.3 Testing Strategy by AI Module

#### 6.3.1 ASR Speech Recognition Tests

```python
# tests/ai/test_asr.py

import pytest
from app.ai.asr import ASRService
# Assumed project helpers (not defined in this document); adjust the path as needed.
from tests.ai.helpers import calculate_cer, get_audio_duration, load_labeled_dataset

class TestASRService:
    """ASR speech recognition tests."""

    @pytest.fixture
    def asr(self):
        return ASRService()

    @pytest.fixture
    def test_audio_samples(self):
        """Load the labeled test set."""
        return load_labeled_dataset("tests/fixtures/asr_samples/")

    def test_output_format(self, asr):
        """Verify the output format contract."""
        result = asr.transcribe("tests/fixtures/sample.wav")

        # Required fields
        assert "text" in result
        assert "timestamps" in result
        assert isinstance(result["timestamps"], list)

        # Timestamp format
        for ts in result["timestamps"]:
            assert "start" in ts
            assert "end" in ts
            assert "text" in ts
            assert ts["end"] >= ts["start"]

    def test_character_error_rate(self, asr, test_audio_samples):
        """Verify that the character error rate is ≤ 10%."""
        total_errors = 0
        total_chars = 0

        for sample in test_audio_samples:
            result = asr.transcribe(sample["audio_path"])
            cer = calculate_cer(result["text"], sample["ground_truth"])
            # Chinese text has no whitespace word boundaries, so weight each
            # sample by its character count rather than by str.split().
            total_errors += cer * len(sample["ground_truth"])
            total_chars += len(sample["ground_truth"])

        overall_cer = total_errors / total_chars

        assert overall_cer <= 0.10, f"CER {overall_cer:.2%} exceeds the 10% threshold"

    def test_timestamp_accuracy(self, asr, test_audio_samples):
        """Verify timestamp accuracy."""
        for sample in test_audio_samples:
            result = asr.transcribe(sample["audio_path"])

            # The last segment should end where the audio ends
            audio_duration = get_audio_duration(sample["audio_path"])
            last_timestamp = result["timestamps"][-1]["end"]

            # Allow a 500 ms error
            assert abs(last_timestamp - audio_duration * 1000) <= 500
```
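The ASR test above relies on a `calculate_wer` helper that this document does not define. A minimal sketch of such a metric, based on Levenshtein edit distance, is shown below. The names `edit_distance`, `tokenize`, and `error_rate` are illustrative, and the character-level default is an assumption chosen because Chinese text has no whitespace word boundaries.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def tokenize(text: str, unit: str = "char") -> List[str]:
    """Split into whitespace-delimited words, or into non-space characters."""
    if unit == "word":
        return text.split()
    return [c for c in text if not c.isspace()]


def error_rate(hypothesis: str, reference: str, unit: str = "char") -> float:
    """Edit distance divided by the reference length, in the chosen unit."""
    ref = tokenize(reference, unit)
    hyp = tokenize(hypothesis, unit)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

With `unit="char"` this yields a character error rate, which matches the per-character weighting a Chinese test set needs. `unit="word"` gives the conventional WER for whitespace-delimited languages.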

#### 6.3.2 Prohibited-Word Detection Tests

```python
# tests/ai/test_prohibited_words.py

import pytest
from app.ai.nlp import ProhibitedWordDetector

# Assumption: shared test helper; the module path is a placeholder.
from tests.helpers import load_labeled_dataset


class TestProhibitedWordDetector:
    """Prohibited-word detection tests."""

    @pytest.fixture
    def detector(self):
        return ProhibitedWordDetector()

    @pytest.fixture
    def labeled_dataset(self):
        """
        Labeled dataset format:
        {
            "text": "这是全网销量第一的产品",
            "context": "advertisement",
            "expected_violations": ["全网第一"],
            "should_block": true
        }
        """
        return load_labeled_dataset("tests/fixtures/prohibited_words/")

    def test_recall_rate(self, detector, labeled_dataset):
        """Verify recall ≥ 95%."""
        true_positives = 0
        false_negatives = 0

        for sample in labeled_dataset:
            if not sample["expected_violations"]:
                continue

            result = detector.detect(sample["text"], sample["context"])
            detected = set(result.violations)
            expected = set(sample["expected_violations"])

            true_positives += len(detected & expected)
            false_negatives += len(expected - detected)

        # Guard against an empty positive set (would divide by zero)
        assert true_positives + false_negatives > 0, "no positive samples in dataset"
        recall = true_positives / (true_positives + false_negatives)

        assert recall >= 0.95, f"Recall {recall:.2%} is below the 95% threshold"

    def test_false_positive_rate(self, detector, labeled_dataset):
        """Verify false positive rate ≤ 5%."""
        false_positives = 0
        true_negatives = 0

        # Only samples that should contain no violations
        negative_samples = [
            s for s in labeled_dataset
            if not s["expected_violations"]
        ]
        assert negative_samples, "no negative samples in dataset"

        for sample in negative_samples:
            result = detector.detect(sample["text"], sample["context"])

            if result.violations:
                false_positives += 1
            else:
                true_negatives += 1

        fpr = false_positives / (false_positives + true_negatives)

        assert fpr <= 0.05, f"False positive rate {fpr:.2%} exceeds the 5% threshold"

    def test_context_awareness(self, detector):
        """Verify context awareness."""
        # Everyday phrase ("this is my happiest day") that contains a superlative
        text = "这是我最开心的一天"

        # Advertisement context: must not be flagged
        result_ad = detector.detect(text, context="advertisement")
        assert len(result_ad.violations) == 0, "everyday phrase flagged in ad context"

        # Everyday context: must not be flagged
        result_daily = detector.detect(text, context="daily_conversation")
        assert len(result_daily.violations) == 0, "everyday phrase flagged"
```

#### 6.3.3 Logo Vector Retrieval Tests

```python
# tests/ai/test_logo_detection.py

import pytest
from app.ai.cv import LogoDetector

# Assumption: shared test helpers; the module path is a placeholder.
from tests.helpers import calculate_f1, load_labeled_dataset


class TestLogoDetector:
    """Logo detection tests."""

    @pytest.fixture
    def detector(self):
        return LogoDetector()

    @pytest.fixture
    def logo_test_set(self):
        """
        The test set contains:
        - 200+ competitor logo images
        - Occluded, blurred, and rotated variants
        - Negative samples (images without a logo)
        """
        return load_labeled_dataset("tests/fixtures/logos/")

    def test_f1_score(self, detector, logo_test_set):
        """Verify F1 ≥ 0.85."""
        predictions = []
        ground_truths = []

        for sample in logo_test_set:
            result = detector.detect(sample["image_path"])
            predictions.append(result.detected_logos)
            ground_truths.append(sample["ground_truth_logos"])

        f1 = calculate_f1(predictions, ground_truths)

        assert f1 >= 0.85, f"F1 {f1:.2f} is below the 0.85 threshold"

    def test_partial_occlusion(self, detector, logo_test_set):
        """Verify detection when at least 30% of the logo is occluded."""
        occluded_samples = [
            s for s in logo_test_set
            if s.get("occlusion_rate", 0) >= 0.3
        ]
        assert occluded_samples, "no occluded samples in test set"

        correct = 0
        for sample in occluded_samples:
            result = detector.detect(sample["image_path"])
            # Compare as sets so that detection order does not matter
            if set(sample["ground_truth_logos"]) == set(result.detected_logos):
                correct += 1

        accuracy = correct / len(occluded_samples)

        # A slightly lower accuracy is acceptable under occlusion
        assert accuracy >= 0.75, f"Occlusion accuracy {accuracy:.2%} is too low"

    def test_new_logo_instant_detection(self, detector):
        """Verify that a newly uploaded logo is detectable immediately."""
        # Upload the new logo
        new_logo_path = "tests/fixtures/new_competitor_logo.png"
        detector.add_logo(new_logo_path, brand="New Competitor")

        # Run detection right away
        test_frame = "tests/fixtures/frame_with_new_logo.jpg"
        result = detector.detect(test_frame)

        assert "New Competitor" in result.detected_logos
```
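The F1 test above calls a `calculate_f1` helper without specifying how per-image label lists are aggregated. One plausible choice is micro-averaging: true positives, false positives, and false negatives are summed over all images before precision and recall are computed. The sketch below assumes that choice, and the name `micro_prf` is illustrative.

```python
from typing import Iterable, List, Tuple


def micro_prf(
    predictions: List[Iterable[str]],
    ground_truths: List[Iterable[str]],
) -> Tuple[float, float, float]:
    """Micro-averaged (precision, recall, F1) over per-sample label sets."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")

    tp = fp = fn = 0
    for pred, truth in zip(predictions, ground_truths):
        p, t = set(pred), set(truth)
        tp += len(p & t)  # labels detected and expected
        fp += len(p - t)  # labels detected but not expected
        fn += len(t - p)  # labels expected but missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Micro-averaging lets images with many logos carry more weight. If every image should count equally, a macro average (mean of per-image F1) would be the alternative.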

### 6.4 LLM Prompt Tests

```python
# tests/ai/test_llm.py

import pytest
from app.ai.llm import LLMService

# Assumption: shared test helper; the module path is a placeholder.
from tests.helpers import load_few_shot_examples


class TestLLMPrompts:
    """LLM prompt tests."""

    @pytest.fixture
    def llm(self):
        return LLMService()

    @pytest.fixture
    def few_shot_examples(self):
        """Few-shot example set."""
        return load_few_shot_examples("tests/fixtures/llm_examples/")

    def test_brief_parsing_output_format(self, llm):
        """Verify the Brief parsing output format."""
        # Sample Brief: one selling point (24-hour wear) and two banned terms
        brief_content = """
        产品卖点：24小时持妆
        禁止使用：药用、治疗
        """

        result = llm.parse_brief(brief_content)

        # Verify the output structure
        assert "selling_points" in result
        assert "prohibited_words" in result
        assert isinstance(result["selling_points"], list)
        assert isinstance(result["prohibited_words"], list)

    def test_context_understanding(self, llm, few_shot_examples):
        """Verify context understanding."""
        context_examples = [
            e for e in few_shot_examples
            if e["type"] == "context_understanding"
        ]
        assert context_examples, "no context_understanding examples found"

        correct = 0
        for example in context_examples:
            result = llm.classify_context(example["text"])
            if result["context"] == example["expected_context"]:
                correct += 1

        accuracy = correct / len(context_examples)

        assert accuracy >= 0.90, f"Context accuracy {accuracy:.2%} is too low"

    def test_sentiment_analysis(self, llm):
        """Verify public-opinion risk detection."""
        test_cases = [
            {"text": "这个产品太油腻了", "expected_risk": "greasy"},
            {"text": "正常的产品介绍", "expected_risk": None},
            {"text": "男人就该这样", "expected_risk": "gender_bias"},
        ]

        for case in test_cases:
            result = llm.analyze_sentiment(case["text"])

            if case["expected_risk"]:
                assert result.risk_type == case["expected_risk"]
            else:
                assert result.risk_type is None
```

### 6.5 AI Test Dataset Management

```
tests/fixtures/
├── asr_samples/                    # ASR test set
│   ├── manifest.json               # Data manifest
│   ├── audio/
│   │   ├── sample_001.wav
│   │   └── ...
│   └── transcripts/
│       ├── sample_001.json         # Labeled transcript
│       └── ...
│
├── prohibited_words/               # Prohibited-word test set
│   ├── positive_samples.json       # Samples that should be flagged
│   ├── negative_samples.json       # Samples that should not be flagged
│   └── context_samples.json        # Context-awareness samples
│
├── logos/                          # Logo test set
│   ├── manifest.json
│   ├── images/
│   │   ├── logo_001.jpg
│   │   └── ...
│   └── annotations/
│       ├── logo_001.json
│       └── ...
│
├── llm_examples/                   # LLM test set
│   ├── brief_parsing.json
│   ├── context_understanding.json
│   └── sentiment_analysis.json
│
└── README.md                       # Test set documentation
```
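The earlier test examples call `load_labeled_dataset` on these directories without showing it. A minimal sketch for the ASR layout follows. The manifest schema (a top-level `samples` list whose entries carry `id`, `audio_path`, and `ground_truth`) is an assumption made for illustration; the actual format is not specified in this document.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Assumed manifest schema; not defined anywhere in this document.
REQUIRED_KEYS = {"id", "audio_path", "ground_truth"}


def load_labeled_dataset(root: str) -> List[Dict[str, Any]]:
    """Read <root>/manifest.json and return samples with resolved audio paths."""
    root_path = Path(root)
    manifest = json.loads((root_path / "manifest.json").read_text(encoding="utf-8"))

    samples = []
    for entry in manifest["samples"]:
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(
                f"sample {entry.get('id', '?')} is missing keys: {sorted(missing)}"
            )
        sample = dict(entry)
        # Paths in the manifest are taken to be relative to the dataset root.
        sample["audio_path"] = str(root_path / entry["audio_path"])
        samples.append(sample)
    return samples
```

Validating required keys at load time means that a malformed annotation fails fast with a readable message, rather than surfacing as a `KeyError` inside a metric test.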

---

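## 7. End-to-End Testing Strategy

### 7.1 E2E Test Framework

| Tool | Purpose |
| --- | --- |
| **Playwright** | Cross-browser E2E testing |
| **@playwright/test** | Test runner |
| **Playwright HTML reporter** | Test reports (written to `playwright-report/`) |
| **BrowserStack** | Cloud testing on real devices |

### 7.2 Core User Flow Tests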

```typescript
// tests/e2e/creator-flow.spec.ts

import { test, expect } from '@playwright/test'

test.describe('Creator: full flow', () => {
  test.beforeEach(async ({ page }) => {
    // Log in with the creator account
    await page.goto('/auth/login')
    await page.fill('[name="email"]', 'creator@test.com')
    await page.fill('[name="password"]', 'password')
    await page.click('button[type="submit"]')
    await expect(page).toHaveURL('/creator/tasks')
  })

  test('upload video → wait for audit → view result', async ({ page }) => {
    // The audit wait below can take up to 5 minutes, which is longer than the
    // 60 s per-test timeout in playwright.config.ts, so raise it for this test.
    test.setTimeout(6 * 60 * 1000)

    // 1. Open the upload page
    await page.click('text=上传')
    await expect(page).toHaveURL('/creator/upload')

    // 2. Upload the video
    const fileInput = page.locator('input[type="file"]')
    await fileInput.setInputFiles('tests/fixtures/test_video.mp4')

    // 3. Wait for the upload to finish
    await expect(page.locator('.upload-progress')).toHaveText(/100%/)

    // 4. Submit for audit
    await page.click('button:has-text("提交审核")')

    // 5. Verify the task enters the "auditing" state
    await expect(page.locator('.audit-status')).toHaveText(/审核中/)

    // 6. Wait for the audit to finish (up to 5 minutes)
    await expect(page.locator('.audit-status')).toHaveText(
      /已通过|需修改/,
      { timeout: 300000 }
    )

    // 7. Verify the result page
    await page.click('text=查看结果')
    await expect(page.locator('.result-banner')).toBeVisible()
  })

  test('appeal flow', async ({ page }) => {
    // Assumes a task in the "needs revision" state already exists
    await page.goto('/creator/tasks?status=needs_revision')
    await page.click('.task-card >> nth=0')

    // 1. Click the appeal button
    await page.click('button:has-text("申诉")')

    // 2. Fill in the appeal reason
    await page.fill('textarea[name="reason"]', '这不是广告用语，是日常表达')

    // 3. Submit the appeal
    await page.click('button:has-text("提交申诉")')

    // 4. Verify the appeal was submitted
    await expect(page.locator('.toast')).toHaveText(/申诉已提交/)
  })
})
```

```typescript
// tests/e2e/agency-flow.spec.ts

import { test, expect } from '@playwright/test'

test.describe('Agency: full flow', () => {
  test.beforeEach(async ({ page }) => {
    // Log in with the agency account
    await page.goto('/auth/login')
    await page.fill('[name="email"]', 'agency@test.com')
    await page.fill('[name="password"]', 'password')
    await page.click('button[type="submit"]')
    await expect(page).toHaveURL('/agency/dashboard')
  })

  test('configure Brief → review video → approve', async ({ page }) => {
    // 1. Upload the Brief
    await page.click('text=Brief 管理')
    await page.click('button:has-text("上传 Brief")')

    const fileInput = page.locator('input[type="file"]')
    await fileInput.setInputFiles('tests/fixtures/test_brief.pdf')

    // 2. Wait for parsing to finish
    await expect(page.locator('.parsing-status')).toHaveText(/解析完成/, {
      timeout: 60000
    })

    // 3. Confirm the extracted rules
    await page.click('button:has-text("确认规则")')

    // 4. Open the review console
    await page.click('text=审核台')
    await page.click('.pending-task >> nth=0')

    // 5. Check that the video and checklist are shown
    await expect(page.locator('.video-player')).toBeVisible()
    await expect(page.locator('.checklist')).toBeVisible()

    // 6. Approve
    await page.click('button:has-text("通过")')
    await page.click('button:has-text("确认")')

    // 7. Verify the status update
    await expect(page.locator('.task-status')).toHaveText(/已通过/)
  })

  test('reject video', async ({ page }) => {
    await page.goto('/agency/review')
    await page.click('.pending-task >> nth=0')

    // Tick the issues found
    await page.check('input[name="issue_0"]')
    await page.check('input[name="issue_1"]')

    // Reject
    await page.click('button:has-text("驳回")')
    await page.click('button:has-text("确认")')

    await expect(page.locator('.task-status')).toHaveText(/已驳回/)
  })
})
```

### 7.3 Mobile E2E Tests

```typescript
// tests/e2e/mobile-creator.spec.ts

import { test, expect, devices } from '@playwright/test'

test.use({
  ...devices['iPhone 13'],
})

test.describe('Creator H5 mobile tests', () => {
  test('upload video on mobile', async ({ page }) => {
    // Note: this test does not log in. It assumes an authenticated session
    // is already in place for the creator account.
    await page.goto('/creator/upload')

    // Verify the mobile layout
    await expect(page.locator('.bottom-nav')).toBeVisible()

    // Verify the keep-screen-awake hint
    await expect(page.locator('.wakelock-hint')).toBeVisible()

    // Simulate the upload
    const fileInput = page.locator('input[type="file"]')
    await fileInput.setInputFiles('tests/fixtures/test_video.mp4')

    // Verify that progress is displayed
    await expect(page.locator('.circular-progress')).toBeVisible()
  })
})
```

### 7.4 Playwright Configuration

```typescript
// playwright.config.ts

import { defineConfig, devices } from '@playwright/test'

export default defineConfig({
  testDir: './tests/e2e',
  timeout: 60000,
  retries: 2,
  workers: 4,

  reporter: [
    ['html', { outputFolder: 'playwright-report' }],
    ['junit', { outputFile: 'test-results/junit.xml' }],
  ],

  use: {
    baseURL: process.env.TEST_BASE_URL || 'http://localhost:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'on-first-retry',
  },

  projects: [
    // Desktop browsers
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },

    // Mobile
    { name: 'mobile-chrome', use: { ...devices['Pixel 5'] } },
    { name: 'mobile-safari', use: { ...devices['iPhone 13'] } },
  ],

  // Local development server
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
})
```

---

## 8. Implementation Roadmap

### 8.1 Phased Implementation Plan

```
┌──────────────────────────────────────────────────────────────────────┐
│                TDD Implementation Roadmap (11 weeks)                 │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Phase 0: Foundations (Week 0, in parallel)                          │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ • Set up the pytest/Vitest test frameworks                     │  │
│  │ • Configure the CI/CD test pipeline                            │  │
│  │ • Establish code coverage gates                                │  │
│  │ • Prepare AI test datasets (initially 100+ samples)            │  │
│  │ • Team TDD training                                            │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              ↓                                       │
│  Phase 1: Infrastructure + Brief engine (Week 1-2)                   │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ Backend TDD (100% coverage):                                   │  │
│  │ • Data model tests → data model implementation                 │  │
│  │ • Validator tests → validator implementation                   │  │
│  │ • Rule engine tests → rule engine implementation               │  │
│  │ • Brief parsing tests → Brief parsing implementation           │  │
│  │                                                                │  │
│  │ Frontend TDD (base components):                                │  │
│  │ • Utility function tests → utility function implementation     │  │
│  │ • Base component tests → base component implementation         │  │
│  │                                                                │  │
│  │ API mock service setup                                         │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              ↓                                       │
│  Phase 2: Core AI pipeline (Week 3-6)                                │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ AI model tests (validated against labeled sets):               │  │
│  │ • Build ASR/OCR/CV test sets (≥500 samples)                    │  │
│  │ • Interface contract tests → AI service wrappers               │  │
│  │ • Threshold validation tests → model tuning                    │  │
│  │                                                                │  │
│  │ Core algorithm TDD:                                            │  │
│  │ • Timestamp alignment tests → alignment algorithm              │  │
│  │ • Multimodal fusion tests → fusion logic                       │  │
│  │                                                                │  │
│  │ Integration tests:                                             │  │
│  │ • Celery task tests                                            │  │
│  │ • WebSocket push tests                                         │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              ↓                                       │
│  Phase 3: UI development (Week 7-9)                                  │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ Frontend component TDD:                                        │  │
│  │ • Component tests → component implementation                   │  │
│  │ • Hook tests → Hook implementation                             │  │
│  │ • Store tests → Store implementation                           │  │
│  │                                                                │  │
│  │ Page integration tests:                                        │  │
│  │ • MSW mocks + page interaction tests                           │  │
│  │                                                                │  │
│  │ E2E test skeleton:                                             │  │
│  │ • Core user-flow E2E tests (Playwright)                        │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              ↓                                       │
│  Phase 4: Integration and acceptance (Week 10-11)                    │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ Test completion:                                               │  │
│  │ • Complete E2E coverage                                        │  │
│  │ • Performance tests (Locust)                                   │  │
│  │ • Compatibility tests (BrowserStack)                           │  │
│  │                                                                │  │
│  │ AI model acceptance:                                           │  │
│  │ • Run the full test set                                        │  │
│  │ • Verify that metrics meet targets                             │  │
│  │                                                                │  │
│  │ Regression testing:                                            │  │
│  │ • Full regression run                                          │  │
│  │ • Automated smoke tests                                        │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘
```

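### 8.2 Weekly Test Deliverables

| Week | Test deliverables | Coverage target |
| --- | --- | --- |
| Week 1 | Backend framework tests, data model tests | Backend 80% |
| Week 2 | Brief parsing tests, rule engine tests | Backend 80% |
| Week 3 | ASR/OCR interface tests | AI modules 60% |
| Week 4 | CV detection tests, vector retrieval tests | AI modules 70% |
| Week 5 | Timestamp alignment tests, multimodal fusion tests | AI modules 80% |
| Week 6 | Celery task tests, WebSocket tests | Backend 85% |
| Week 7 | Frontend utility function tests, Hook tests | Frontend 60% |
| Week 8 | Frontend component tests, Store tests | Frontend 70% |
| Week 9 | Page integration tests, E2E skeleton | Frontend 75% |
| Week 10 | Complete E2E coverage, performance tests | E2E core paths 100% |
| Week 11 | Compatibility tests, full regression | Overall 75% |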

---

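## 9. Test Coverage Targets

### 9.1 Coverage Gates

| Layer | Target coverage | Gate policy |
| --- | --- | --- |
| **Backend unit tests** | ≥ 80% | Blocks PR |
| **Frontend unit tests** | ≥ 70% | Blocks PR |
| **AI module tests** | ≥ 70% | Blocks PR |
| **Integration tests** | ≥ 60% | PR warning |
| **E2E tests** | 100% of core paths | Blocks release |

### 9.2 Coverage Exemptions

The following code may be exempted from coverage requirements:

| Code type | Reason |
| --- | --- |
| Third-party SDK wrappers | Upstream is trusted |
| Environment configuration code | Verified at runtime |
| Logging/monitoring code | Not core logic |
| Migration scripts | Run once |

### 9.3 Coverage Reporting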

```yaml
# .github/workflows/test.yml (coverage reporting excerpt)

- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v3
  with:
    # pytest --cov-report=xml writes coverage.xml into the working directory
    files: backend/coverage.xml
    fail_ci_if_error: true

- name: Coverage Gate
  working-directory: backend
  run: |
    # Reads the .coverage data file left by the test run and exits non-zero
    # when total coverage is below 75%
    coverage report --fail-under=75
```
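Where the gate needs logic beyond a single threshold (for example, different limits per layer as in 9.1), the Cobertura-style `coverage.xml` written by `--cov-report=xml` can be read directly. The sketch below relies on the root `<coverage>` element carrying a `line-rate` attribute between 0 and 1. The function names are illustrative.

```python
import xml.etree.ElementTree as ET


def read_line_coverage(xml_text: str) -> float:
    """Return total line coverage in percent from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    return float(root.attrib["line-rate"]) * 100


def check_gate(xml_text: str, threshold: float) -> bool:
    """True when total line coverage meets or exceeds the threshold (percent)."""
    return read_line_coverage(xml_text) >= threshold
```

A CI step could read `backend/coverage.xml`, call `check_gate(text, 80.0)`, and exit with a non-zero status when it returns `False`.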

---

## 10. Toolchain Configuration

### 10.1 Backend Toolchain

```toml
# pyproject.toml

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
addopts = [
    "-v",
    "--tb=short",
    "--strict-markers",
    "-ra",
    "--cov=app",
    "--cov-report=xml",
    "--cov-report=html",
    "--cov-fail-under=75",
]
asyncio_mode = "auto"
markers = [
    "slow: slow tests",
    "integration: integration tests",
    "ai: AI model tests",
]

[tool.coverage.run]
branch = true
source = ["app"]
omit = [
    "*/migrations/*",
    "*/tests/*",
    "*/__init__.py",
]

[tool.coverage.report]
exclude_lines = [
    "pragma: no cover",
    "def __repr__",
    "raise NotImplementedError",
    "if TYPE_CHECKING:",
]
```

### 10.2 Frontend Toolchain

```typescript
// vitest.config.ts

import { defineConfig } from 'vitest/config'
import react from '@vitejs/plugin-react'
import path from 'path'

export default defineConfig({
  plugins: [react()],
  test: {
    globals: true,
    environment: 'jsdom',
    setupFiles: ['./tests/setup.ts'],
    include: ['**/*.test.{ts,tsx}'],
    coverage: {
      provider: 'v8',
      reporter: ['text', 'json', 'html'],
      exclude: [
        'node_modules/',
        'tests/',
        '**/*.d.ts',
        '**/*.config.*',
      ],
      thresholds: {
        lines: 70,
        functions: 70,
        branches: 70,
        statements: 70,
      },
    },
  },
  resolve: {
    alias: {
      '@': path.resolve(__dirname, './src'),
    },
  },
})
```

### 10.3 CI/CD Configuration

```yaml
# .github/workflows/test.yml

name: Test Suite

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main, develop]

jobs:
  backend-test:
    runs-on: ubuntu-latest
    services:
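      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
          # DATABASE_URL below points at a database named "test"
          POSTGRES_DB: test
        # Wait until PostgreSQL accepts connections before running steps
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432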
      redis:
        image: redis:7
        ports:
          - 6379:6379

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dependencies
        run: |
          cd backend
          pip install -r requirements.txt
          pip install -r requirements-dev.txt

      - name: Run linting
        run: |
          cd backend
          ruff check .
          mypy app

      - name: Run tests
        run: |
          cd backend
          pytest --cov --cov-report=xml
        env:
          DATABASE_URL: postgresql://postgres:test@localhost/test
          REDIS_URL: redis://localhost:6379

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: backend/coverage.xml

  frontend-test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: frontend/package-lock.json

      - name: Install dependencies
        run: |
          cd frontend
          npm ci

      - name: Run linting
        run: |
          cd frontend
          npm run lint

      - name: Run tests
        run: |
          cd frontend
          npm run test:coverage

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: frontend/coverage/coverage-final.json

  e2e-test:
    runs-on: ubuntu-latest
    needs: [backend-test, frontend-test]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install Playwright
        run: |
          cd frontend
          npm ci
          npx playwright install --with-deps

      - name: Run E2E tests
        run: |
          cd frontend
          npm run test:e2e

      - name: Upload Playwright report
        # upload-artifact v3 is deprecated on GitHub-hosted runners; use v4
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-report
          path: frontend/playwright-report/
```

---

## 11. Team Conventions and Training

### 11.1 TDD Workflow

```
┌────────────────────────────────────────────────────────────────────┐
│                    TDD Red-Green-Refactor Cycle                    │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│   ┌─────────────┐                                                  │
│   │   🔴 RED    │  1. Write a failing test                         │
│   │  (failing)  │     • The test must run                          │
│   │             │     • The test must fail                         │
│   └──────┬──────┘     • It fails because the feature is missing    │
│          │                                                         │
│          ▼                                                         │
│   ┌─────────────┐                                                  │
│   │  🟢 GREEN   │  2. Write the minimum code to pass the test      │
│   │  (passing)  │     • Do not over-engineer                       │
│   │             │     • Write only enough code to pass             │
│   └──────┬──────┘     • "Cheating" (hard-coding) is allowed        │
│          │                                                         │
│          ▼                                                         │
│   ┌─────────────┐                                                  │
│   │ 🔄 REFACTOR │  3. Refactor the code                            │
│   │  (improve)  │     • Remove duplication                         │
│   │             │     • Improve the design                         │
│   └──────┬──────┘     • Tests still pass                           │
│          │                                                         │
│          └──────────────────────────────────────────────────────►  │
│                            loop                                    │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘
```
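One pass through the cycle can be sketched with a toy rule. The function `find_violations` and its signature are hypothetical, chosen only to illustrate the three steps. In practice the test and the implementation would live in separate files.

```python
from typing import List


# Step 1 (RED): the test is written first. Run at this point it fails,
# because find_violations does not exist yet.
def test_find_violations_reports_each_prohibited_word() -> None:
    text = "这款面霜有药用效果，可以治疗痘痘"
    assert find_violations(text, ["药用", "治疗"]) == ["药用", "治疗"]
    assert find_violations("24小时持妆", ["药用", "治疗"]) == []


# Step 2 (GREEN): the first passing version may be crude, for example
#     return ["药用", "治疗"] if "药用" in text else []
# which is enough to satisfy the two assertions above.

# Step 3 (REFACTOR): replace the hard-coded version with a general one
# while the test stays green.
def find_violations(text: str, prohibited_words: List[str]) -> List[str]:
    """Return the prohibited words that occur in the text, in list order."""
    return [word for word in prohibited_words if word in text]
```

The hard-coded step looks wasteful, but it confirms that the test can both fail and pass before any real logic is written.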

### 11.2 Test Naming Conventions

```python
# Python (pytest)

class TestCalculator:
    def test_add_two_positive_numbers_returns_sum(self):
        """Adding two positive numbers returns their sum."""
        pass

    def test_divide_by_zero_raises_error(self):
        """Dividing by zero raises an error."""
        pass
```

```typescript
// TypeScript (Vitest)

describe('Calculator', () => {
  it('should return sum when adding two positive numbers', () => {
    // ...
  })

  it('should throw error when dividing by zero', () => {
    // ...
  })
})
```

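### 11.3 Test File Organization

| Rule | Notes |
| --- | --- |
| Frontend test files sit next to the source file | `utils.ts` → `utils.test.ts` |
| `__tests__` directory | Allowed for complex modules |
| Naming pattern `*.test.ts` / `test_*.py` | Easy to recognize and auto-discovered (matches `python_files = ["test_*.py"]`) |
| One module per test file | Single responsibility |

### 11.4 Team Training Plan

| Stage | Duration | Content | Deliverable |
| --- | --- | --- | --- |
| **TDD basics** | 2h | TDD concepts, red-green-refactor cycle | Training slides |
| **pytest in practice** | 2h | pytest usage, fixtures, parametrization | Sample code |
| **Vitest in practice** | 2h | Vitest usage, RTL, MSW | Sample code |
| **AI testing** | 2h | Labeled set management, threshold validation | Test templates |
| **Code review** | Ongoing | Check test quality in PRs | Review checklist |

### 11.5 Code Review Checklist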

```markdown
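## PR Test Review Checklist

### Required
- [ ] New features have matching unit tests
- [ ] Tests cover both normal and error paths
- [ ] Test names are clear and describe the expected behavior
- [ ] Tests run independently and do not depend on execution order
- [ ] Coverage is not below the gate threshold

### Recommended
- [ ] Parametrized tests cover multiple inputs
- [ ] External dependencies are mocked to avoid flaky tests
- [ ] Each unit test runs in under 1 second
- [ ] No hard-coded test data (use fixtures)

### AI module specifics
- [ ] A matching labeled test set exists
- [ ] The output format is verified
- [ ] The threshold metrics are verified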
```

---

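## 12. Risks and Challenges

### 12.1 Risk Matrix

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| **AI model hallucination** | 🔴 High | 🔴 High | Complete labeled sets + manual spot checks + continuous monitoring |
| **Insufficient test data** | 🟡 Medium | 🔴 High | Keep collecting real data + data augmentation |
| **Flaky E2E tests** | 🟡 Medium | 🟡 Medium | Retries + better wait strategies |
| **Limited team TDD experience** | 🟡 Medium | 🟡 Medium | Training + pair programming + code review |
| **High test maintenance cost** | 🟡 Medium | 🟡 Medium | Test refactoring + shared fixtures |
| **Slow CI/CD runs** | 🟢 Low | 🟡 Medium | Parallel execution + incremental testing |

### 12.2 Challenges Specific to AI Testing

| Challenge | Strategy |
| --- | --- |
| **Non-deterministic LLM output** | Validate structure rather than exact content + sample several times and take the consensus |
| **Prompt changes have wide impact** | Version prompts + regression tests |
| **High labeling cost** | Cover high-risk scenarios first + active-learning sampling |
| **Regressions after model updates** | Establish a baseline + automated metric comparison |
| **Edge cases cannot be enumerated** | Adversarial sample generation + continuous bad-case collection |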
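The "sample several times and take the consensus" strategy for non-deterministic LLM output can be sketched as a majority vote. `consensus_label` and its parameters are illustrative. `classify` stands for any callable that returns a label, such as a thin wrapper around the LLM service.

```python
from collections import Counter
from typing import Callable, Optional, Tuple


def consensus_label(
    classify: Callable[[str], Optional[str]],
    text: str,
    samples: int = 5,
) -> Tuple[Optional[str], float]:
    """Call the classifier several times; return (majority label, agreement ratio)."""
    if samples < 1:
        raise ValueError("samples must be at least 1")
    votes = Counter(classify(text) for _ in range(samples))
    label, count = votes.most_common(1)[0]
    return label, count / samples
```

A test can then assert on the majority label and require a minimum agreement ratio, for example `agreement >= 0.6`, so that a single divergent sample does not fail the build while real instability still does.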

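### 12.3 Common TDD Pitfalls

| Pitfall | Better practice |
| --- | --- |
| Writing code first and adding tests later | Follow red-green-refactor strictly |
| Chasing 100% coverage | Focus on meaningful tests |
| Testing implementation details | Test behavior and results |
| Over-mocking | Mock only true external dependencies |
| Leaving test code unmaintained | Test code needs refactoring too |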

---

## 13. Appendix

### 13.1 Related Documents

| Document | Description |
| --- | --- |
| tasks.md | Development task list |
| DevelopmentPlan.md | Technical architecture and development plan |
| FeatureSummary.md | Feature list and acceptance criteria |
| User_Role_Interfaces.md | User roles and interface specification |

### 13.2 References

| Resource | Link |
| --- | --- |
| pytest documentation | https://docs.pytest.org/ |
| Vitest documentation | https://vitest.dev/ |
| Testing Library | https://testing-library.com/ |
| Playwright documentation | https://playwright.dev/ |
| MSW documentation | https://mswjs.io/ |
| Test-Driven Development: By Example (book) | Kent Beck |

### 13.3 Glossary

| Term | Definition |
| --- | --- |
| **TDD** | Test-Driven Development |
| **BDD** | Behavior-Driven Development |
| **SUT** | System Under Test |
| **Fixture** | Fixed setup used to prepare the test environment |
| **Mock** | Simulated object that replaces a real dependency |
| **Stub** | Simplified implementation that returns preset values |
| **Coverage** | Code coverage |
| **Regression** | Regression testing |

---

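## 14. Summary

### 14.1 Key Conclusions

1. **SmartAudit is well suited to TDD**
   - Starting from zero code is the best entry point
   - Requirements are clear and acceptance criteria are quantified
   - The tech stack has a mature testing ecosystem

2. **Adopt a layered, hybrid TDD strategy**
   - Business logic: strict TDD
   - AI models: validation against labeled sets
   - E2E: BDD + automation

3. **Frontend test automation**
   - Unit tests: Vitest + Testing Library
   - Component tests: RTL + MSW
   - E2E tests: Playwright
   - Visual regression: Percy/Chromatic

4. **Critical success factors**
   - Test frameworks are in place by the end of Week 0
   - CI/CD gates apply from the first line of code
   - AI test sets keep growing
   - Team training and consistent enforcement of the conventions

### 14.2 Next Steps

| Priority | Action | Owner | When |
| --- | --- | --- | --- |
| P0 | Create the backend/tests/ directory structure | Backend Lead | Week 0 |
| P0 | Configure pytest + CI/CD | Backend Lead | Week 0 |
| P0 | Create the frontend/tests/ directory structure | Frontend Lead | Week 0 |
| P0 | Configure Vitest + Playwright | Frontend Lead | Week 0 |
| P0 | Team TDD training | Tech Lead | Week 0 |
| P1 | Set up the AI test dataset framework | AI Engineer | Week 1 |
| P1 | Write the testing conventions for core modules | Tech Lead | Week 1 |

---

**Document status**: ✅ Complete
**Next review**: 2 weeks after development starts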