chore: add project docs baseline

This commit is contained in:
admin123 2026-04-20 19:45:00 +08:00
commit b04d018840
3 changed files with 1268 additions and 0 deletions

.gitignore vendored Normal file

@@ -0,0 +1,4 @@
.worktrees/
.old-reference/
dist/
node_modules/


@@ -0,0 +1,800 @@
# Star Chart Market After-Search Rate Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Build a new Chrome MV3 extension that adds two after-search-rate columns to the Xingtu creator market page, then supports plugin-owned filtering, sorting, full-scan export, and CSV download.
**Architecture:** Start from a minimal MV3 content-script extension and keep the implementation split into small modules: value normalization, API mapping, result storage, DOM sync, full-scan orchestration, filter/sort control, and CSV export. Follow TDD for every production behavior module; only project bootstrap and config files may be created directly as setup.
**Tech Stack:** TypeScript, Chrome Manifest V3, Vitest, jsdom, tsup, Node.js
---
## File Structure
Planned initial structure:
```text
package.json
package-lock.json
tsconfig.json
vitest.config.ts
scripts/
  build.mjs
src/
  manifest.json
  content/
    index.ts
    market/
      index.ts
      types.ts
      dom-sync.ts
      api-client.ts
      result-store.ts
      filter-sort-controller.ts
      full-scan-controller.ts
      csv-exporter.ts
      plugin-toolbar.ts
  shared/
    rate-normalizer.ts
    csv.ts
tests/
  rate-normalizer.test.ts
  market-api-client.test.ts
  result-store.test.ts
  filter-sort-controller.test.ts
  market-dom-sync.test.ts
  full-scan-controller.test.ts
  csv-exporter.test.ts
  market-content-entry.test.ts
```
Responsibilities:
- `src/shared/rate-normalizer.ts`: parse and normalize display values and comparable lower-bound values
- `src/content/market/api-client.ts`: fetch and map `/gw/api/aggregator/get_author_ase_info`
- `src/content/market/result-store.ts`: hold per-author status and merged market records
- `src/content/market/filter-sort-controller.ts`: apply threshold filters and lower-bound sorting
- `src/content/market/dom-sync.ts`: locate market rows, inject two columns, and update row visibility/order
- `src/content/market/full-scan-controller.ts`: paginate through filtered result pages and hydrate the result store
- `src/content/market/csv-exporter.ts`: generate CSV rows and blob/download metadata
- `src/content/market/plugin-toolbar.ts`: render the plugin controls for threshold filter, sorting, and export
- `src/content/market/index.ts`: compose all market modules for the live page
- `src/content/index.ts`: route-match and boot the market controller
## Task 1: Initialize the Empty Project
**Files:**
- Create: `package.json`
- Create: `tsconfig.json`
- Create: `vitest.config.ts`
- Create: `scripts/build.mjs`
- Create: `src/manifest.json`
- Create: `src/content/index.ts`
- [ ] **Step 1: Initialize the repository and npm package**
Run:
```bash
git init
npm init -y
```
Expected: a new `.git/` directory and baseline `package.json`.
- [ ] **Step 2: Add build and test dependencies**
Run:
```bash
npm install -D typescript vitest jsdom tsup
```
Expected: `package-lock.json` created and dev dependencies added.
- [ ] **Step 3: Write the minimal config and manifest files**
Create direct setup files:
- `package.json` scripts: `build`, `test`, `test:watch`
- `tsconfig.json`
- `vitest.config.ts`
- `scripts/build.mjs`
- `src/manifest.json`
- `src/content/index.ts`
The initial manifest should only match the Xingtu market page and load one content entry bundle.
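A minimal sketch of what such a manifest could contain, written as a TypeScript object so it can be type-checked and then serialized to JSON. The extension name, version, bundle path `content/index.js`, and the exact match pattern are assumptions for illustration; only the market URL itself comes from the spec.

```typescript
// Hypothetical manifest contents: name, version, and bundle path are placeholders.
const manifest = {
  manifest_version: 3,
  name: "Xingtu After-Search Rate Columns",
  version: "0.1.0",
  content_scripts: [
    {
      // Match only the creator market list page, not other Xingtu routes.
      matches: ["https://*.xingtu.cn/ad/creator/market*"],
      // A single content entry bundle produced by the build step.
      js: ["content/index.js"],
      run_at: "document_idle",
    },
  ],
};

// Serialized form that a build script could write next to the bundle.
const manifestJson = JSON.stringify(manifest, null, 2);
```

Keeping the manifest to one `content_scripts` entry makes it easy to confirm in Step 5 that the build emits exactly one bundle.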
- [ ] **Step 4: Run the test command to verify the toolchain is wired**
Run:
```bash
npm test
```
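Expected: Vitest starts without config errors. With no test files yet, Vitest exits non-zero by default; pass `--passWithNoTests` (or set `passWithNoTests` in `vitest.config.ts`) if this step needs a clean exit.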
- [ ] **Step 5: Run the build command to verify the extension bundle layout**
Run:
```bash
npm run build
```
Expected: build succeeds and writes a minimal `dist/` layout containing the manifest and content bundle.
- [ ] **Step 6: Commit the bootstrap**
Run:
```bash
git add package.json package-lock.json tsconfig.json vitest.config.ts scripts/build.mjs src/manifest.json src/content/index.ts
git commit -m "chore: bootstrap mv3 extension project"
```
## Task 2: Implement Rate Normalization
**Files:**
- Create: `src/shared/rate-normalizer.ts`
- Test: `tests/rate-normalizer.test.ts`
- [ ] **Step 1: Write the failing tests for normalization and comparable lower bounds**
Add tests covering:
```ts
import { describe, expect, test } from "vitest";
import {
  compareRateValues,
  normalizeRateDisplay,
  parseRateLowerBound
} from "../src/shared/rate-normalizer";

describe("rate-normalizer", () => {
  test("normalizes compact ranges", () => {
    expect(normalizeRateDisplay("0.5%-1%")).toBe("0.5% - 1%");
  });

  test("parses the lower bound of a range", () => {
    expect(parseRateLowerBound("0.02% - 0.1%")).toBe(0.02);
  });

  test("treats less-than values as smaller than the boundary range", () => {
    expect(compareRateValues("<0.02%", "0.02% - 0.1%")).toBeLessThan(0);
  });
});
```
- [ ] **Step 2: Run the rate normalizer test file and verify RED**
Run:
```bash
npm test -- tests/rate-normalizer.test.ts
```
Expected: FAIL because the module does not exist yet.
- [ ] **Step 3: Write the minimal implementation**
Implement only:
- display normalization for compact range strings
- lower-bound parsing
- comparison helper that orders invalid or missing values last
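One possible shape for these three helpers, sketched under assumptions: the function names come from the test file above, but the internal `parseRate` helper, the `null` return for unparseable input, and the choice to compare `<X%` values as a lower bound of `0` are illustrative decisions that the plan does not fix.

```typescript
// Internal parsed form shared by display, filtering, sorting, and export.
type ParsedRate =
  | { kind: "range"; low: string; high: string }
  | { kind: "lessThan"; bound: string };

const NUM = "(\\d+(?:\\.\\d+)?)";
const RANGE_RE = new RegExp(`^${NUM}%\\s*-\\s*${NUM}%$`);
const LESS_THAN_RE = new RegExp(`^<\\s*${NUM}%$`);

// Single parsing path; returns null for empty, missing, or unrecognized input.
function parseRate(raw: string | null | undefined): ParsedRate | null {
  const text = (raw ?? "").trim();
  const range = RANGE_RE.exec(text);
  if (range) {
    const [, low = "", high = ""] = range;
    return { kind: "range", low, high };
  }
  const lessThan = LESS_THAN_RE.exec(text);
  if (lessThan) {
    const [, bound = ""] = lessThan;
    return { kind: "lessThan", bound };
  }
  return null;
}

function normalizeRateDisplay(raw: string | null | undefined): string | null {
  const parsed = parseRate(raw);
  if (!parsed) return null;
  return parsed.kind === "range"
    ? `${parsed.low}% - ${parsed.high}%`
    : `<${parsed.bound}%`;
}

function parseRateLowerBound(raw: string | null | undefined): number | null {
  const parsed = parseRate(raw);
  if (!parsed) return null;
  // Assumption: "<X%" states no lower bound, so it is compared as 0.
  return parsed.kind === "range" ? Number(parsed.low) : 0;
}

// Ascending comparison; invalid or missing values always sort last.
function compareRateValues(
  a: string | null | undefined,
  b: string | null | undefined,
): number {
  const left = parseRateLowerBound(a);
  const right = parseRateLowerBound(b);
  if (left === null && right === null) return 0;
  if (left === null) return 1;
  if (right === null) return -1;
  return left - right;
}
```

Because every public helper goes through `parseRate`, the refactor in Step 5 would mostly be a matter of keeping it that way as new input forms appear.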
- [ ] **Step 4: Run the test file and verify GREEN**
Run:
```bash
npm test -- tests/rate-normalizer.test.ts
```
Expected: PASS.
- [ ] **Step 5: Refactor for one parsing pipeline**
Keep parsing logic in one shared path so filtering, sorting, and export do not interpret values differently.
- [ ] **Step 6: Re-run the test file after refactor**
Run:
```bash
npm test -- tests/rate-normalizer.test.ts
```
Expected: PASS.
- [ ] **Step 7: Commit**
Run:
```bash
git add src/shared/rate-normalizer.ts tests/rate-normalizer.test.ts
git commit -m "feat: add rate normalization helpers"
```
## Task 3: Implement the ASE API Client
**Files:**
- Create: `src/content/market/api-client.ts`
- Create: `src/content/market/types.ts`
- Test: `tests/market-api-client.test.ts`
- [ ] **Step 1: Write the failing tests for response mapping and failure states**
Cover:
- successful mapping from `avg_search_after_view_rate`
- successful mapping from `personal_avg_search_after_view_rate`
- `missing` result when the response is OK but either rate field is absent
- `failed` result on timeout or a non-OK response
Example test seed:
```ts
import { expect, test } from "vitest";
import { createMarketApiClient } from "../src/content/market/api-client";

test("maps a valid ASE payload into normalized rates", async () => {
  const client = createMarketApiClient({
    // Stubbed fetch: returns an OK response with both rate fields present.
    fetchImpl: async () => ({
      ok: true,
      json: async () => ({
        data: {
          avg_search_after_view_rate: "<0.02%",
          personal_avg_search_after_view_rate: "0.02% - 0.1%"
        }
      })
    })
  });

  await expect(client.loadAuthorAseInfo("123")).resolves.toMatchObject({
    success: true,
    rates: {
      singleVideoAfterSearchRate: "<0.02%",
      personalVideoAfterSearchRate: "0.02% - 0.1%"
    }
  });
});
```
- [ ] **Step 2: Run the API client test file and verify RED**
Run:
```bash
npm test -- tests/market-api-client.test.ts
```
Expected: FAIL because the client module does not exist yet.
- [ ] **Step 3: Write the minimal implementation**
Implement:
- request builder for `/gw/api/aggregator/get_author_ase_info`
- response mapping
- timeout handling
- normalized success result and stable failure reasons
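The response-mapping part can be kept as a pure function, separate from the request and timeout code. The sketch below assumes the payload shape used in the test seed (`data.avg_search_after_view_rate` and `data.personal_avg_search_after_view_rate`); the failure reason strings, the `mapAseResponse` name, and the inline `normalizeRange` stand-in for the shared normalizer are hypothetical.

```typescript
type AseResult =
  | {
      success: true;
      rates: {
        singleVideoAfterSearchRate: string;
        personalVideoAfterSearchRate: string;
      };
    }
  // Hypothetical stable failure reasons; the plan does not name them.
  | { success: false; reason: "http_error" | "invalid_payload" | "missing" };

// Stand-in for the shared normalizer: only spaces out the range separator.
function normalizeRange(value: string): string {
  return value.trim().replace(/\s*-\s*/, " - ");
}

function readRate(fields: Record<string, unknown>, key: string): string | null {
  const value = fields[key];
  return typeof value === "string" && value.trim() !== ""
    ? normalizeRange(value)
    : null;
}

function mapAseResponse(httpOk: boolean, payload: unknown): AseResult {
  if (!httpOk) return { success: false, reason: "http_error" };
  const data = (payload as { data?: unknown } | null)?.data;
  if (typeof data !== "object" || data === null) {
    return { success: false, reason: "invalid_payload" };
  }
  const fields = data as Record<string, unknown>;
  const single = readRate(fields, "avg_search_after_view_rate");
  const personal = readRate(fields, "personal_avg_search_after_view_rate");
  // An OK response with an absent field is "missing", not a transport failure.
  if (single === null || personal === null) {
    return { success: false, reason: "missing" };
  }
  return {
    success: true,
    rates: {
      singleVideoAfterSearchRate: single,
      personalVideoAfterSearchRate: personal,
    },
  };
}
```

The client would then call this mapper after `fetchImpl` resolves, and produce its own timeout failure before the mapper is ever reached.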
- [ ] **Step 4: Run the API client tests and verify GREEN**
Run:
```bash
npm test -- tests/market-api-client.test.ts
```
Expected: PASS.
- [ ] **Step 5: Refactor shared types**
Move reusable market result types into `src/content/market/types.ts`.
- [ ] **Step 6: Re-run the API client tests**
Run:
```bash
npm test -- tests/market-api-client.test.ts
```
Expected: PASS.
- [ ] **Step 7: Commit**
Run:
```bash
git add src/content/market/api-client.ts src/content/market/types.ts tests/market-api-client.test.ts
git commit -m "feat: add ase api client"
```
## Task 4: Implement the Result Store
**Files:**
- Create: `src/content/market/result-store.ts`
- Modify: `src/content/market/types.ts`
- Test: `tests/result-store.test.ts`
- [ ] **Step 1: Write the failing tests for merged record lifecycle**
Cover:
- create loading records from current-page rows
- update one author to success
- preserve failed authors instead of dropping them
- dedupe the same author across pages
- keep stable major fields after repeated writes
- [ ] **Step 2: Run the result store test file and verify RED**
Run:
```bash
npm test -- tests/result-store.test.ts
```
Expected: FAIL because the store module does not exist yet.
- [ ] **Step 3: Write the minimal implementation**
Implement:
- `upsertMarketRow`
- `setAuthorLoading`
- `setAuthorSuccess`
- `setAuthorFailed`
- `listRecords`
- `getRecord`
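A minimal in-memory sketch of a store exposing those six methods. The record shape, the reduced `MarketRow` with only two fields, and the "first write wins" rule for repeated row upserts are assumptions made to keep the example short; the real row snapshot would carry the list's major fields.

```typescript
type AuthorStatus = "loading" | "success" | "failed" | "missing";

interface MarketRow {
  authorId: string;
  name: string; // stand-in for the full set of major list fields
}

interface Rates {
  single: string;
  personal: string;
}

interface MarketRecord {
  row: MarketRow;
  status: AuthorStatus;
  rates: Rates | null;
  failureReason: string | null;
}

function createResultStore() {
  const records = new Map<string, MarketRecord>();

  // Single reducer-style path: every status change goes through here.
  function transition(
    authorId: string,
    patch: Pick<MarketRecord, "status" | "rates" | "failureReason">,
  ): void {
    const existing = records.get(authorId);
    if (!existing) return; // ignore authors whose row was never upserted
    records.set(authorId, { ...existing, ...patch });
  }

  return {
    upsertMarketRow(row: MarketRow): void {
      // Dedupe by author ID; assumption: the first snapshot stays stable.
      if (records.has(row.authorId)) return;
      records.set(row.authorId, {
        row,
        status: "loading",
        rates: null,
        failureReason: null,
      });
    },
    setAuthorLoading(authorId: string): void {
      transition(authorId, { status: "loading", rates: null, failureReason: null });
    },
    setAuthorSuccess(authorId: string, rates: Rates): void {
      transition(authorId, { status: "success", rates, failureReason: null });
    },
    setAuthorFailed(authorId: string, reason: string): void {
      // Failed authors are kept so export can still report them.
      transition(authorId, { status: "failed", rates: null, failureReason: reason });
    },
    getRecord(authorId: string): MarketRecord | undefined {
      return records.get(authorId);
    },
    listRecords(): MarketRecord[] {
      return Array.from(records.values());
    },
  };
}
```

A `missing` transition is omitted here for brevity; it would follow the same `transition` path as the other setters.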
- [ ] **Step 4: Run the result store tests and verify GREEN**
Run:
```bash
npm test -- tests/result-store.test.ts
```
Expected: PASS.
- [ ] **Step 5: Refactor state transitions into one reducer-style path**
Keep state transitions explicit so later full-scan orchestration does not spread status logic across files.
- [ ] **Step 6: Re-run the result store tests**
Run:
```bash
npm test -- tests/result-store.test.ts
```
Expected: PASS.
- [ ] **Step 7: Commit**
Run:
```bash
git add src/content/market/result-store.ts src/content/market/types.ts tests/result-store.test.ts
git commit -m "feat: add market result store"
```
## Task 5: Implement Filter and Sort Control
**Files:**
- Create: `src/content/market/filter-sort-controller.ts`
- Modify: `src/content/market/types.ts`
- Test: `tests/filter-sort-controller.test.ts`
- [ ] **Step 1: Write the failing tests for threshold filtering and lower-bound sorting**
Cover:
- pass only when lower bound is greater than or equal to the threshold
- sort single-rate descending
- sort personal-rate ascending
- keep failed or missing rows at the end
- [ ] **Step 2: Run the filter/sort tests and verify RED**
Run:
```bash
npm test -- tests/filter-sort-controller.test.ts
```
Expected: FAIL because the controller module does not exist yet.
- [ ] **Step 3: Write the minimal implementation**
Implement:
- filter state type
- sort state type
- record predicate
- comparator
- application helper returning ordered visible record IDs
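The pieces listed above could fit together roughly as follows. This sketch assumes records already carry numeric lower bounds (with `null` for failed, missing, or unparseable values); the type and function names are illustrative, not taken from the plan.

```typescript
type Metric = "single" | "personal";

interface RateRecord {
  authorId: string;
  // Comparable lower bounds; null means failed, missing, or unparseable.
  singleLowerBound: number | null;
  personalLowerBound: number | null;
}

interface FilterState {
  singleMin: number | null; // null means "no threshold"
  personalMin: number | null;
}

interface SortState {
  metric: Metric;
  direction: "asc" | "desc";
}

function lowerBound(record: RateRecord, metric: Metric): number | null {
  return metric === "single" ? record.singleLowerBound : record.personalLowerBound;
}

// A record passes only if every active threshold is met by its lower bound.
function passesFilter(record: RateRecord, filter: FilterState): boolean {
  const checks: Array<[Metric, number | null]> = [
    ["single", filter.singleMin],
    ["personal", filter.personalMin],
  ];
  return checks.every(([metric, min]) => {
    if (min === null) return true;
    const value = lowerBound(record, metric);
    return value !== null && value >= min;
  });
}

function compareRecords(a: RateRecord, b: RateRecord, sort: SortState): number {
  const left = lowerBound(a, sort.metric);
  const right = lowerBound(b, sort.metric);
  if (left === null || right === null) {
    // Records without a comparable value go last in both directions.
    return left === right ? 0 : left === null ? 1 : -1;
  }
  return sort.direction === "asc" ? left - right : right - left;
}

// Returns the IDs of visible records in display order.
function applyFilterSort(
  records: RateRecord[],
  filter: FilterState,
  sort: SortState | null,
): string[] {
  const visible = records.filter((record) => passesFilter(record, filter));
  const ordered = sort
    ? visible.slice().sort((a, b) => compareRecords(a, b, sort))
    : visible;
  return ordered.map((record) => record.authorId);
}
```

Handling the `null` case before applying the direction is what keeps failed rows at the end for both ascending and descending sorts.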
- [ ] **Step 4: Run the filter/sort tests and verify GREEN**
Run:
```bash
npm test -- tests/filter-sort-controller.test.ts
```
Expected: PASS.
- [ ] **Step 5: Refactor duplicated comparison branches**
Keep one comparison path per metric and one final fallback ordering rule.
- [ ] **Step 6: Re-run the filter/sort tests**
Run:
```bash
npm test -- tests/filter-sort-controller.test.ts
```
Expected: PASS.
- [ ] **Step 7: Commit**
Run:
```bash
git add src/content/market/filter-sort-controller.ts src/content/market/types.ts tests/filter-sort-controller.test.ts
git commit -m "feat: add filter and sort controller"
```
## Task 6: Implement CSV Export
**Files:**
- Create: `src/shared/csv.ts`
- Create: `src/content/market/csv-exporter.ts`
- Test: `tests/csv-exporter.test.ts`
- [ ] **Step 1: Write the failing tests for CSV serialization**
Cover:
- header order
- escaped commas and quotes
- failed rows emitting empty rate fields plus status
- export rows using normalized display values
- [ ] **Step 2: Run the CSV exporter tests and verify RED**
Run:
```bash
npm test -- tests/csv-exporter.test.ts
```
Expected: FAIL because the exporter module does not exist yet.
- [ ] **Step 3: Write the minimal implementation**
Implement:
- CSV escaping helper
- column definition list
- row-to-CSV conversion
- blob-ready text generation function
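A compact sketch of those four pieces. The column subset and English header labels are placeholders (the spec lists more export fields), and the quoting rule follows the common RFC 4180 convention of wrapping fields that contain commas, quotes, or line breaks and doubling inner quotes.

```typescript
interface ExportRecord {
  authorId: string;
  name: string;
  singleRate: string | null; // normalized display value, null when unavailable
  personalRate: string | null;
  status: "success" | "failed" | "missing";
}

// One declarative schema: header label and value accessor live together,
// so column order is defined in exactly one place.
const COLUMNS: Array<{ header: string; value: (r: ExportRecord) => string }> = [
  { header: "Creator ID", value: (r) => r.authorId },
  { header: "Creator Name", value: (r) => r.name },
  { header: "Single-Video After-Search Rate", value: (r) => r.singleRate ?? "" },
  { header: "Personal-Video After-Search Rate", value: (r) => r.personalRate ?? "" },
  { header: "Plugin Data Status", value: (r) => r.status },
];

// Quote a field only when it contains a comma, quote, CR, or LF.
function escapeCsvField(field: string): string {
  return /[",\r\n]/.test(field) ? `"${field.replace(/"/g, '""')}"` : field;
}

// Produces the full CSV text, ready to be wrapped in a Blob for download.
function toCsv(records: ExportRecord[]): string {
  const rows = [
    COLUMNS.map((column) => column.header),
    ...records.map((record) => COLUMNS.map((column) => column.value(record))),
  ];
  return rows.map((cells) => cells.map(escapeCsvField).join(",")).join("\r\n");
}
```

Failed rows fall out naturally: their rate accessors return empty strings while the status column still reports `failed`.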
- [ ] **Step 4: Run the CSV exporter tests and verify GREEN**
Run:
```bash
npm test -- tests/csv-exporter.test.ts
```
Expected: PASS.
- [ ] **Step 5: Refactor field mapping into one declarative schema**
Keep export field order and labels in one place.
- [ ] **Step 6: Re-run the CSV exporter tests**
Run:
```bash
npm test -- tests/csv-exporter.test.ts
```
Expected: PASS.
- [ ] **Step 7: Commit**
Run:
```bash
git add src/shared/csv.ts src/content/market/csv-exporter.ts tests/csv-exporter.test.ts
git commit -m "feat: add csv exporter"
```
## Task 7: Implement Current-Page DOM Sync
**Files:**
- Create: `src/content/market/dom-sync.ts`
- Test: `tests/market-dom-sync.test.ts`
- [ ] **Step 1: Write the failing DOM tests for current-page enhancement**
Use jsdom fixtures that mimic the Xingtu market grid and cover:
- injecting the two header cells
- injecting one pair of per-row cells
- rendering loading, success, and failed states
- hiding filtered rows
- reordering rows based on an ordered ID list
- [ ] **Step 2: Run the DOM sync tests and verify RED**
Run:
```bash
npm test -- tests/market-dom-sync.test.ts
```
Expected: FAIL because the module does not exist yet.
- [ ] **Step 3: Write the minimal implementation**
Implement:
- page structure discovery
- row extraction with major field snapshots
- inserted-cell rendering
- row hide/show
- row order application
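The hide/show and reordering steps can be expressed against minimal structural types, so the same function works with real elements and with lightweight test doubles. This is a sketch under the assumption that rows share one parent container; if the live grid is assembled column by column, the same plan would need to be applied once per column. The names `RowLike`, `ContainerLike`, and `applyRowPlan` are hypothetical.

```typescript
// Structural types: an HTMLElement satisfies both, and so does a plain object.
interface RowLike {
  hidden: boolean;
}

interface ContainerLike<R> {
  append(...rows: R[]): void;
}

function applyRowPlan<R extends RowLike>(
  container: ContainerLike<R>,
  rowsById: Map<string, R>,
  orderedVisibleIds: string[],
): void {
  const visible = new Set(orderedVisibleIds);

  // Hide every known row that is not in the visible list, show the rest.
  rowsById.forEach((row, id) => {
    row.hidden = !visible.has(id);
  });

  // In the DOM, appending an already attached node moves it,
  // so appending in order reorders the visible rows in place.
  for (const id of orderedVisibleIds) {
    const row = rowsById.get(id);
    if (row) container.append(row);
  }
}
```

Taking the ordered ID list as input keeps this module free of filter and sort rules; it only applies the result.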
- [ ] **Step 4: Run the DOM sync tests and verify GREEN**
Run:
```bash
npm test -- tests/market-dom-sync.test.ts
```
Expected: PASS.
- [ ] **Step 5: Refactor DOM selectors into one locator object**
This keeps future layout changes isolated.
- [ ] **Step 6: Re-run the DOM sync tests**
Run:
```bash
npm test -- tests/market-dom-sync.test.ts
```
Expected: PASS.
- [ ] **Step 7: Commit**
Run:
```bash
git add src/content/market/dom-sync.ts tests/market-dom-sync.test.ts
git commit -m "feat: add market dom sync"
```
## Task 8: Implement Full-Scan Pagination Control
**Files:**
- Create: `src/content/market/full-scan-controller.ts`
- Modify: `src/content/market/types.ts`
- Test: `tests/full-scan-controller.test.ts`
- [ ] **Step 1: Write the failing tests for on-demand full scans**
Cover:
- initial page load does not start full scan
- filter action starts full scan
- sort action starts full scan
- export action starts full scan
- repeated actions do not restart a completed scan unnecessarily
- failed author fetches are recorded but do not abort the whole scan
- [ ] **Step 2: Run the full-scan tests and verify RED**
Run:
```bash
npm test -- tests/full-scan-controller.test.ts
```
Expected: FAIL because the controller module does not exist yet.
- [ ] **Step 3: Write the minimal implementation**
Implement the controller around injected dependencies:
- current-page row reader
- paginator
- author metrics loader
- result store writer
The controller should expose one explicit method per trigger source, such as:
- `ensureScanForFilter()`
- `ensureScanForSort()`
- `ensureScanForExport()`
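A sketch of how the controller might be built around those injected dependencies. The three `ensureScanFor…` method names come from the list above; the dependency interface, the `LoadResult` shape, and the cached-promise approach to idempotency are assumptions.

```typescript
interface ScanRow {
  authorId: string;
}

type LoadResult =
  | { ok: true; single: string; personal: string }
  | { ok: false; reason: string };

interface ScanDeps {
  readCurrentPageRows(): ScanRow[];
  paginator: {
    goToFirstPage(): Promise<void>;
    goToNextPage(): Promise<boolean>; // resolves to false on the last page
  };
  loadAuthorRates(authorId: string): Promise<LoadResult>;
  store: {
    upsertRow(row: ScanRow): void;
    setResult(authorId: string, result: LoadResult): void;
  };
}

function createFullScanController(deps: ScanDeps) {
  let scan: Promise<void> | null = null;

  async function runScan(): Promise<void> {
    const seen = new Set<string>();
    await deps.paginator.goToFirstPage();
    do {
      for (const row of deps.readCurrentPageRows()) {
        if (seen.has(row.authorId)) continue; // same author on several pages
        seen.add(row.authorId);
        deps.store.upsertRow(row);
        // A rejected fetch is recorded as a failure, never rethrown.
        const result = await deps
          .loadAuthorRates(row.authorId)
          .catch((error: unknown): LoadResult => ({ ok: false, reason: String(error) }));
        deps.store.setResult(row.authorId, result);
      }
    } while (await deps.paginator.goToNextPage());
  }

  // All triggers share one cached promise, so a finished or running scan
  // is reused; the cache is cleared only if pagination itself fails.
  function ensureScan(): Promise<void> {
    if (!scan) {
      scan = runScan().catch((error: unknown) => {
        scan = null;
        throw error;
      });
    }
    return scan;
  }

  return {
    ensureScanForFilter: ensureScan,
    ensureScanForSort: ensureScan,
    ensureScanForExport: ensureScan,
  };
}
```

This sketch does not invalidate the cached scan when the page's own search conditions change; a real controller would need some reset hook for that case.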
- [ ] **Step 4: Run the full-scan tests and verify GREEN**
Run:
```bash
npm test -- tests/full-scan-controller.test.ts
```
Expected: PASS.
- [ ] **Step 5: Refactor to a single idempotent scan path**
All trigger entry points should delegate to the same internal scan routine.
- [ ] **Step 6: Re-run the full-scan tests**
Run:
```bash
npm test -- tests/full-scan-controller.test.ts
```
Expected: PASS.
- [ ] **Step 7: Commit**
Run:
```bash
git add src/content/market/full-scan-controller.ts src/content/market/types.ts tests/full-scan-controller.test.ts
git commit -m "feat: add full scan controller"
```
## Task 9: Implement the Plugin Toolbar and Market Composition
**Files:**
- Create: `src/content/market/plugin-toolbar.ts`
- Create: `src/content/market/index.ts`
- Modify: `src/content/index.ts`
- Test: `tests/market-content-entry.test.ts`
- [ ] **Step 1: Write the failing integration tests for the page entry flow**
Cover:
- market controller boots on the Xingtu market URL
- current page rows are hydrated on start
- applying plugin filters triggers full scan and hides non-matching rows
- applying plugin sorting triggers full scan and reorders rows
- export triggers full scan and hands the ordered visible records to the CSV exporter
- [ ] **Step 2: Run the market entry tests and verify RED**
Run:
```bash
npm test -- tests/market-content-entry.test.ts
```
Expected: FAIL because the market controller composition does not exist yet.
- [ ] **Step 3: Write the minimal implementation**
Implement:
- toolbar controls for threshold inputs, sort selector, and export button
- event wiring from toolbar to full scan, filter/sort controller, and exporter
- market page bootstrap in `src/content/index.ts`
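The event wiring can be sketched as a small composition function that receives every collaborator as a dependency. All names here (`MarketDeps`, `createMarketController`, and the `on…` handlers) are hypothetical; the toolbar would call these handlers from its click and change listeners.

```typescript
interface MarketDeps {
  scan: {
    ensureScanForFilter(): Promise<void>;
    ensureScanForSort(): Promise<void>;
    ensureScanForExport(): Promise<void>;
  };
  // Filter/sort controller applied to the result store.
  computeOrderedVisibleIds(): string[];
  // DOM sync: hide non-matching rows and reorder the rest.
  applyToDom(orderedVisibleIds: string[]): void;
  // CSV exporter plus download trigger.
  exportCsv(orderedVisibleIds: string[]): void;
}

function createMarketController(deps: MarketDeps) {
  // Shared path for filter and sort: scan first, then update the page.
  async function applyView(ensureScan: () => Promise<void>): Promise<void> {
    await ensureScan();
    deps.applyToDom(deps.computeOrderedVisibleIds());
  }

  return {
    onApplyFilter: (): Promise<void> =>
      applyView(() => deps.scan.ensureScanForFilter()),
    onApplySort: (): Promise<void> =>
      applyView(() => deps.scan.ensureScanForSort()),
    onExport: async (): Promise<void> => {
      await deps.scan.ensureScanForExport();
      // Export uses the same ordered, visible records the page shows.
      deps.exportCsv(deps.computeOrderedVisibleIds());
    },
  };
}
```

With this shape, the integration tests in Step 1 can pass recording stubs for every dependency and assert only on call order.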
- [ ] **Step 4: Run the market entry tests and verify GREEN**
Run:
```bash
npm test -- tests/market-content-entry.test.ts
```
Expected: PASS.
- [ ] **Step 5: Run the whole test suite**
Run:
```bash
npm test
```
Expected: all test files PASS.
- [ ] **Step 6: Run the production build**
Run:
```bash
npm run build
```
Expected: extension build succeeds with all planned bundles and manifest output.
- [ ] **Step 7: Commit**
Run:
```bash
git add src/content/index.ts src/content/market/index.ts src/content/market/plugin-toolbar.ts tests/market-content-entry.test.ts
git commit -m "feat: wire market plugin controls"
```
## Task 10: Manual Verification and Documentation
**Files:**
- Modify: `externaldocs/2026-04-20-star-chart-market-after-search-rate-plugin-spec.md` (only if actual implementation forces scope adjustments)
- Create: `README.md`
- [ ] **Step 1: Write a minimal README**
Document:
- install
- test
- build
- load unpacked extension
- manual verification checklist
- [ ] **Step 2: Run manual verification on the live page**
Verify:
- current page gets two new columns
- loading, success, failed states render correctly
- filter triggers scan and hides unmatched rows
- sort triggers scan and reorders rows
- export produces a CSV with plugin status fields
- [ ] **Step 3: Update the spec if implementation realities changed any promised behavior**
Only adjust documented scope if the live page proves a requirement impossible or unstable.
- [ ] **Step 4: Run the full test suite and the build again**
Run:
```bash
npm test
npm run build
```
Expected: both commands succeed.
- [ ] **Step 5: Commit**
Run:
```bash
git add README.md externaldocs/2026-04-20-star-chart-market-after-search-rate-plugin-spec.md
git commit -m "docs: add usage and verification notes"
```
## Notes for Execution
- Do not write production behavior before the corresponding failing test exists.
- Keep DOM selectors isolated; the Xingtu page is likely to shift.
- Reuse the old project's verified API field names, but do not copy large modules blindly; re-derive them under tests.
- For full-scan logic, dependency-inject pagination and row reading so the orchestration stays testable.
- If a task reveals a missing boundary in the spec, pause and update the spec before continuing.


@@ -0,0 +1,464 @@
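# Xingtu Creator Market After-Search Rate Enhancement Plugin: Requirements and TDD Spec
## 1. Purpose
This document defines the specification for a new browser extension project. The goal is to show, on the Juliang Xingtu (巨量星图) creator market page, the two after-search rate metrics from the creator detail page, and to have the plugin itself provide filtering, sorting, and export.
This document is the requirements baseline and the TDD baseline, not an implementation record.
## 2. Background
The Juliang Xingtu creator detail page already shows two key metrics:
- Single-video after-search rate
- Personal-video after-search rate
The business goal is to bring these two metrics forward to the creator market page, reducing the cost of opening each creator's detail page to make a judgment.
The old project has already verified two capabilities:
- The two metrics can be obtained from the detail page side
- The two metrics can be fetched per creator through the `/gw/api/aggregator/get_author_ase_info` endpoint and displayed per row on the list page
The new project therefore focuses not on whether the data can be obtained, but on how to integrate the two metrics reliably into the list-page workflow.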
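## 3. Confirmed Decisions
The following points were confirmed during requirements clarification and are the default basis for implementation:
- The plugin does not depend on Xingtu's native custom metrics, sorting, or export
- Display, filtering, sorting, and export are all provided by the plugin itself
- Filter and sort results are applied directly to the current Xingtu list DOM; there is no separate side panel
- The export scope is "all pages of results the plugin actually fetched under the current filter conditions"
- Export is implemented by the plugin paging through the list automatically, one page at a time, not by pulling everything through a bulk API
- Export fields are "the list's existing major fields + the plugin's new fields"
- Filtering is threshold-based
- Threshold comparison uses the lower bound of the range
- Sorting uses the lower bound of the range
- On entering the list page, only the current page is hydrated by default
- The plugin starts the page-by-page full scan only when the user triggers plugin filtering, sorting, or export
- When fetching fails for some creators, those creators are kept, their fields are marked as empty or failed, and the full-scan task is not interrupted
- The default export format is `CSV`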
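## 4. Product Goals
The first version of the plugin needs to achieve the following:
1. Add two after-search rate columns to the Juliang Xingtu creator market page
2. After the page opens, hydrate the creators visible on the current page first
3. When the user triggers plugin filtering, sorting, or export, automatically fetch every page of results under the current filter conditions, page by page
4. Filter and reorder the current page list based on the plugin's aggregated results
5. Export the aggregated results as `CSV`
## 5. Non-Goals
The first version explicitly does not:
- Integrate with Xingtu's native filter controls
- Integrate with Xingtu's native sort controls
- Integrate with Xingtu's native export capability
- Provide a standalone backend service or database
- Provide cloud sync
- Optimize for cross-browser compatibility
- Provide complex report views in the plugin popup
- Start a full scan immediately on entering the list page
- Promise to export every field in the site's original list; only "major fields + new fields" are guaranteed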
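## 6. Page Scope
The first version covers only the Juliang Xingtu creator market page:
- `https://xingtu.cn/ad/creator/market`
- or an equivalent actual list route
The plugin takes effect only in this list scenario and adds no new interactions to the creator detail page. The detail page serves only as the business reference for where the data fields come from, not as a primary view in the first version.
## 7. User Stories
### 7.1 Browsing the Current Page
As a campaign or creator-selection user, when I open the creator market page I want to see the two after-search rates directly in the list, rather than opening each detail page one by one.
### 7.2 Filtering Creators by After-Search Rate
As a user, I want to set two thresholds, for example:
- Single-video after-search rate `>= 0.1%`
- Personal-video after-search rate `>= 0.05%`
The plugin filters creator rows according to my thresholds.
### 7.3 Sorting by After-Search Rate
As a user, I want to reorder the list by one of the after-search rates, from high to low or from low to high.
### 7.4 Exporting Results Under the Current Filter Conditions
As a user, I want to export creator data from all pages under the current filter conditions, not just the currently visible page.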
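## 8. Functional Requirements
### 8.1 List Page Column Enhancement
The plugin needs to add two columns to the creator market page:
- Single-video after-search rate (`单视频看后搜率`)
- Personal-video after-search rate (`个人视频看后搜率`)
Column enhancement requirements:
- On entering the page, trigger data fetching for every creator on the current page by default
- While data is loading, show a clear intermediate state
- On success, show the formatted value
- On failure, show a failed state that is distinguishable from the empty-value state
- When paging, filtering, searching, sorting, or similar actions change the list, the plugin needs to re-identify the creators on the current page and re-add the columns
### 8.2 Plugin Filtering
The plugin needs to provide two threshold inputs:
- `Single-video after-search rate >= X`
- `Personal-video after-search rate >= Y`
Behavior requirements:
- When the user clicks apply filter and the full result set has not been fetched yet, the plugin starts the full paginated scan first
- After the full scan completes, the filter result is computed from the aggregated results
- Creator rows on the page that do not meet the conditions are hidden from the plugin's view of the list
- Creators whose fetch failed do not pass threshold filtering by default, but their failed state is still kept in the internal result set
### 8.3 Plugin Sorting
The plugin needs to support sorting by the following fields:
- Single-video after-search rate
- Personal-video after-search rate
Each field supports at least:
- Ascending
- Descending
Behavior requirements:
- When the user triggers sorting and the full result set has not been fetched yet, the plugin starts the full paginated scan first
- After the full scan completes, the plugin reorders the current list based on the aggregated results
- The sort result should stay consistent with the visible results after filtering
- Failed or empty values are placed at the end by default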
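### 8.4 Plugin Export
The plugin needs to provide a `CSV` export entry point.
Export requirements:
- If the full scan has not completed before export, the plugin first completes the page-by-page scan under the current filter conditions
- The export contains results from all pages under the current filter conditions, not only the current page
- Failed creators are kept in the export, with their after-search rate fields empty and an additional status field marking success or failure
Suggested export fields, at minimum:
- Creator ID
- Creator name
- Region
- Creator type
- Content topic
- Follower count
- Expected CPM
- Expected views
- Engagement rate
- Completion rate
- 21-60s quote
- Single-video after-search rate
- Personal-video after-search rate
- Plugin data status
Notes:
- "The list's existing major fields" means the columns that can be read reliably from the page at the time
- If some page fields cannot be parsed reliably, the export fields may be reduced during implementation, but this document must be updated or an implementation note added before delivery
### 8.5 Full Paginated Scan
The plugin triggers a full paginated scan in the following cases:
- The user applies a plugin filter
- The user applies a plugin sort
- The user clicks export
Full scan behavior requirements:
- Traverse page by page, starting from the first page under the current filter conditions
- Extract the basic list fields for the creators on each page
- Fetch the two after-search rates for each creator
- Aggregate into the plugin's internal result set
- Handle page advancement automatically
- Deduplicate repeated creators automatically
- Record current progress and the failure count
- After everything completes, be able to restore the user's page to a well-defined state
The first version does not require:
- Pause and resume
- Silent background fetching
- Persisting and resuming scan tasks across pages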
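## 9. Value Rules
The raw values of the two after-search rates may appear in forms such as:
- `0.02% - 0.1%`
- `0.5%-1%`
- `<0.02%`
The first version converts all of them into a comparable "lower-bound value".
The rules are:
- The comparison value of `0.02% - 0.1%` is `0.02`
- The comparison value of `0.5%-1%` is `0.5`
- The comparison value of `<0.02%` is treated as smaller than that of `0.02% - 0.1%`
For filtering:
- A record passes only when its lower-bound value is `>=` the user's threshold
For sorting:
- Sort by lower-bound value
- Failed and empty values are placed at the end
This rule must stay consistent across display, filtering, sorting, and export logic; no module may interpret the strings on its own.
## 10. Result Status Definitions
Each creator in the plugin's result set needs to have at least one of the following statuses:
- `success`
- `loading`
- `failed`
- `missing`
Suggested semantics:
- `success`: both after-search rates were fetched successfully
- `loading`: the result has not returned yet, during initial hydration of the current page or during a full scan
- `failed`: the API request failed, timed out, or returned an abnormal response
- `missing`: the API returned successfully but the fields are incomplete
Page display, filtering, sorting, and export must all be based on one unified status model, rather than each inventing its own.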
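## 11. Recommended Architecture
The first version follows a "minimal increment" approach.
### 11.1 Module Breakdown
The suggested minimum set of modules:
- `market-dom-sync`
  - Identifies the list structure, inserts the two columns, reads the current page's major fields, and reorders or hides creator rows
- `ase-api-client`
  - Fetches `/gw/api/aggregator/get_author_ase_info` by creator ID
  - Handles only the network request and response mapping
- `rate-normalizer`
  - Converts display values into a unified string and a comparison value
- `result-store`
  - Maintains the plugin's internal creator result set
  - Handles deduplication, status updates, and failure retention
- `full-scan-controller`
  - Drives the paginated scan when the user triggers filtering, sorting, or export
  - Handles progress, termination conditions, and error statistics
- `filter-sort-controller`
  - Handles threshold filtering and sort computation
- `csv-exporter`
  - Handles export field mapping and `CSV` content generation
### 11.2 Data Flow
On entering the list page by default:
1. The content script identifies the creator rows on the current page
2. It adds the columns for the current page's creators
3. It calls the creator ASE endpoint to fetch the two metrics
4. It writes to the result set and updates the row display
When the user triggers filtering, sorting, or export:
1. Check whether the full result set already covers all pages under the current filter conditions
2. If not, start the full paginated scan
3. Write each page into the result set as soon as it is fetched
4. Run filtering or sorting after the full scan finishes
5. For export, generate the `CSV` from the aggregated results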
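## 12. Acceptance Criteria
### 12.1 Display Acceptance
- After opening the creator market page, every creator on the current page shows the two new column cells
- The loading, success, and failed states are distinguishable
- After paging or a change in list conditions, the new page still gets its columns automatically
### 12.2 Filtering Acceptance
- After thresholds are entered, the plugin automatically fetches all pages of results when necessary
- After the scan completes, the page shows only creators that meet the threshold conditions
- Range values are judged by the lower-bound rule
### 12.3 Sorting Acceptance
- After an after-search rate sort is selected, the plugin automatically fetches all pages of results when necessary
- After the scan completes, the creator order on the page changes correctly according to the lower-bound rule
- Failed and empty values are placed at the end
### 12.4 Export Acceptance
- After export is clicked, the plugin automatically fetches all pages of results when necessary
- The exported file format is `CSV`
- The export includes the plugin's new fields
- Partially failed creators do not interrupt the export
## 13. Risks and Constraints
- The Xingtu list page is not a traditional `table`; it is assembled from columns or a grid, so DOM reordering must be handled carefully
- The automatic paginated scan may be affected by page throttling, lazy loading, virtual lists, and asynchronous behavior of the pagination component
- How reliably "the list's existing major fields" can be read depends on the page's real structure
- If the site's API fields change, the creator ASE data mapping may break
- Full scan duration grows linearly with the number of result pages under the current filter conditions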
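## 14. TDD Strategy
This project must use TDD. Start with the pure logic modules and move to the DOM and flow-control layers last.
### 14.1 Test Layers
Layer 1: pure function tests
- After-search rate string normalization
- Range lower-bound parsing
- Parsing of special values such as `<0.02%`
- Threshold comparison
- Sort comparator
- API response mapping
- Export field serialization
Layer 2: state and result set tests
- Creator result creation
- Deduplication when the same creator is written repeatedly
- Success overwriting loading
- Failed status retention
- Missing fields mapped to `missing`
Layer 3: flow tests
- Entering the list page fetches only the current page
- The full paginated scan starts only when filtering is triggered
- The full paginated scan starts only when sorting is triggered
- The full paginated scan starts only when export is triggered
- Page advancement is correct during the scan
- Failures on some pages do not abort the overall task
Layer 4: DOM integration tests
- The two columns are inserted correctly
- The success state renders correctly
- The failed state renders correctly
- Non-matching rows are hidden after filtering
- Row order changes correctly after sorting
### 14.2 Recommended Order of First Tests
Proceed strictly in the following order:
1. `rate-normalizer` tests
2. `ase-api-client` response mapping tests
3. `filter-sort-controller` tests
4. `result-store` tests
5. `csv-exporter` tests
6. `full-scan-controller` tests
7. `market-dom-sync` tests
### 14.3 Required First Batch of Test Cases
Cover at least the following cases:
- `0.5%-1%` is normalized to `0.5% - 1%`
- The lower bound of `0.02% - 0.1%` is parsed as `0.02`
- `<0.02%` sorts before `0.02% - 0.1%`
- A range whose lower bound is below the threshold does not pass the filter
- A range whose lower bound is greater than or equal to the threshold passes the filter
- Successful values sort ahead of failed and empty values
- An API success with missing fields is mapped to `missing`
- The same creator appearing on multiple pages keeps only one aggregated record
- Viewing only the current page does not page through the full result set
- Clicking export starts the full paginated scan
- Export still succeeds when some creator requests fail
- The status field for failed creators is output correctly in the `CSV`
### 14.4 Minimal Manual Verification Checklist
In addition to automated tests, keep a minimal manual verification pass:
1. Load the plugin
2. Open the creator market page
3. Confirm the current page has the two new columns
4. Confirm filtering triggers the full scan
5. Confirm the sort result matches the lower-bound rule
6. Confirm export produces a `CSV`
7. Spot-check that the two exported after-search rates for several creators match the detail page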
## 15. Suggested Initial Directory Structure
```text
src/
  manifest.json
  content/
    index.ts
    market/
      index.ts
      dom-sync.ts
      api-client.ts
      full-scan-controller.ts
      filter-sort-controller.ts
      result-store.ts
      csv-exporter.ts
  shared/
    rate-normalizer.ts
    result-types.ts
    csv-types.ts
tests/
  rate-normalizer.test.ts
  market-api-client.test.ts
  filter-sort-controller.test.ts
  result-store.test.ts
  full-scan-controller.test.ts
  market-dom-sync.test.ts
  csv-exporter.test.ts
```
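## 16. Version Boundary
The first version is considered delivered when:
- Current-page column enhancement works
- Plugin filtering works
- Plugin sorting works
- The plugin's full paginated scan works
- `CSV` export works
- Key logic is protected by automated tests
If a second version is pursued later, the priority directions should be:
- Scan progress UI
- Ability to cancel a full scan task
- More stable export field mapping
- Local caching and avoidance of repeated fetching
- Richer plugin toolbar interactions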