feat(api): add jd live session previews

renzhiye 2026-04-02 20:09:15 +08:00
parent 145f958663
commit 4800a375ee
9 changed files with 1594 additions and 3 deletions

TODO.md Normal file

@@ -0,0 +1,101 @@
# TODO
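- Last updated: 2026-04-02
- Progress baseline: one round of MVP convergence was completed on 2026-04-02; `npm run test` and `npm run typecheck` pass; the Web/API flow has been verified in a live environment; JD live session import and the `search/detail/reviews` previews have been verified in a live environment
- Related documents:
  - `docs/tasks.md`
  - `docs/DevelopmentPlan.md`
  - `docs/tdd.md`
## Maintenance conventions
- Completed tasks always use `- [x]`
- Incomplete tasks always use `- [ ]`
- In-progress tasks use `- [ ] ... (in progress)`
- Blocked tasks use `- [ ] ... (blocked: reason)`
- Task IDs must match `docs/tasks.md`; if a task is split, merged, or renumbered, both files must be updated together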
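
The checkbox conventions above are regular enough to check mechanically. The following is a minimal sketch and not part of this commit: `parseTodoLine`, `TodoTask`, and `TaskStatus` are hypothetical names, and the accepted marker spellings (English or the Chinese `进行中` / `阻塞` / `未开始`, with ASCII or full-width parentheses) are assumptions about how lines are written.

```typescript
// Hypothetical helper: classify one TODO.md task line by the conventions above.
type TaskStatus = "done" | "in_progress" | "blocked" | "not_started" | "todo";

interface TodoTask {
  id: string;
  status: TaskStatus;
  title: string;
  note: string | null;
}

// Assumed marker spellings; both English and Chinese forms are accepted.
const STATUS_MARKERS: Array<[RegExp, TaskStatus]> = [
  [/^(in progress|进行中)/i, "in_progress"],
  [/^(blocked|阻塞)/i, "blocked"],
  [/^(not started|未开始)/i, "not_started"]
];

function parseTodoLine(line: string): TodoTask | null {
  // Expected shape: "- [x] `S1-06` title (optional trailing note)"
  const match = /^- \[( |x)\] `([^`]+)` (.+)$/.exec(line.trim());
  if (!match) {
    return null;
  }
  const checked = match[1] === "x";
  const id = match[2] ?? "";
  const rest = match[3] ?? "";

  // Split off one trailing parenthetical (no nesting), ASCII or full-width.
  const trailing = /^(.*?)\s*[((]([^()()]*)[))]$/.exec(rest);
  const title = (trailing?.[1] ?? rest).trim();
  const note = trailing?.[2]?.trim() ?? null;

  // A checked box always wins; otherwise the note's leading marker decides.
  let status: TaskStatus = checked ? "done" : "todo";
  if (!checked && note) {
    for (const [pattern, candidate] of STATUS_MARKERS) {
      if (pattern.test(note)) {
        status = candidate;
        break;
      }
    }
  }
  return { id, status, title, note };
}
```

A checker built on this could load both `TODO.md` and `docs/tasks.md`, collect the `id` values from each, and fail when the two sets differ, which is the synchronization rule the last convention asks for.
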
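## Current mainline
- [x] `S1-06` Implement session center v1 and the global session-preparation backend entry point (MVP mock version; supports 24h sessions, cleanup, and return redirects)
- [x] `S1-07` Implement the new-task page and the global session-preparation entry point
- [x] `S2-01` Implement the first platform's precheck and search adapter (MVP mock version)
- [x] `S2-05` Implement normalization v1 and the minimal report snapshot (rule-based version)
- [ ] `S2-06` Implement the single-platform execution page loop and regression pack (in progress: the loop can be demoed; the regression pack and real asynchronous execution are pending)
- [ ] `S3-01` Implement second-platform `precheck/search/detail/reviews` adapters (in progress: both platforms still rely mainly on mock adapters)
- [ ] `S3-03` Implement blocking recovery and `L3 Browser Recovery` (in progress: the recovery page and retry path work; real remote browser takeover is pending)
- [ ] `S4-02` Implement AI structured report generation and versioning rules (in progress: versioning rules have landed; real AI generation is not yet integrated)
- [ ] `S4-05` Implement retention, the deletion API, and the linked cleanup chain (in progress: the deletion API and the 30/90-day local cleanup job have landed; object storage linkage is pending)
- [ ] `S4-06` Implement full observability and audit logging (in progress: overview / audit exist; the full metrics system is pending)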
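## Phase snapshot
- [ ] `S0` The dual-platform capability matrix, fixture/HAR, PoC validation, and the `strategy_attempts` definition are still not frozen (in progress)
- [ ] `S1` Local JSON persistence, API/BFF, session preparation, the new-task page, and the state machine skeleton are usable, but the database, queue, and real `SSE` are still incomplete (in progress)
- [ ] `S2` The single-platform minimal loop and minimal report can be demoed, and the JD `search/detail/reviews` live previews are verified, but task execution and the normalization main chain still rely mainly on mock data (in progress)
- [ ] `S3` Dual-platform candidate confirmation, the execution console, the recovery page, and platform-level retry are usable, but the real second-platform adapter, `L2` template refresh, and real `L3` recovery are incomplete (in progress)
- [ ] `S4` Report versioning rules, the report page, the task history page, version switching, the deletion entry point, and the observability overview have landed, but full aggregation, real AI, object storage linkage, and full auditing are still incomplete (in progress)
- [ ] `S5` Stability, performance, UAT, deployment, and release preparation have not yet entered implementation (not started)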
## `S0`
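- [ ] `S0-01` Freeze the dual-platform capability matrix (not started)
- [ ] `S0-02` Produce the first batch of dual-platform fixtures and HAR samples (not started)
- [ ] `S0-03` Validate the server-side controlled browser and session snapshot PoC (in progress)
- [ ] `S0-04` Validate a non-browser primary-path PoC for at least one platform (in progress: JD has completed live API preview validation of `search/detail/reviews` under an authorized session; template refresh and quantitative criteria are pending)
- [x] `S0-05` Set up the monorepo and base engineering skeleton
- [ ] `S0-06` Freeze the Phase 0 quantitative scorecard, the `strategy_attempts` record format, and the entry criteria for development (not started)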
## `S1`
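- [x] `S1-01` Implement the shared domain model and enum package
- [ ] `S1-02` Implement the database, event log, and object storage models (in progress: the MVP uses local JSON persistence first; the production database and object storage are pending)
- [ ] `S1-03` Implement task orchestration, event persistence, and the state machine skeleton (in progress: the state machine, event log, and platform-level retry are usable; queue-based execution is pending)
- [ ] `S1-04` Implement API / BFF, the platform readiness summary, and the base `SSE` endpoints (in progress: REST/BFF is usable; `SSE` is still a minimal snapshot form)
- [x] `S1-05` Implement the Web workbench base shell and core routes
- [x] `S1-06` Implement session center v1 and the global session-preparation backend entry point (MVP mock version)
- [x] `S1-07` Implement the new-task page and the global session-preparation entry point
- [ ] `S1-08` Implement the TDD and CI base pipeline (in progress)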
## `S2`
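- [x] `S2-01` Implement the first platform's precheck and search adapter (MVP mock version)
- [x] `S2-02` Implement the candidate confirmation page and confirmation API
- [ ] `S2-03` Implement the single-platform product detail fetch chain (in progress: the JD live detail preview now calls the real `pc_detailpage_wareBusiness` request; session import and parsing are verified; integration into the task execution and normalization main chain is pending)
- [ ] `S2-04` Implement the single-platform review collection and sampling chain (in progress: the JD live reviews preview now calls the real `getLegoWareDetailComment` request; pagination parameter rewriting and parsing are verified; integration into the task execution and sampling main chain is pending)
- [x] `S2-05` Implement normalization v1 and the minimal report snapshot (rule-based version)
- [ ] `S2-06` Implement the single-platform execution page loop and regression pack (in progress: new task -> confirm -> execute -> report is wired end to end)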
## `S3`
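- [ ] `S3-01` Implement second-platform `precheck/search/detail/reviews` adapters (in progress)
- [ ] `S3-02` Implement template refresh and the `L2` path (not started)
- [ ] `S3-03` Implement blocking recovery and `L3 Browser Recovery` (in progress)
- [ ] `S3-04` Implement dual-platform candidate confirmation and the execution console (in progress: pages and status display exist; real concurrent execution is pending)
- [x] `S3-05` Implement the `PartialCompleted`, `Blocked`, and `Failed` aggregation rules
- [ ] `S3-06` Implement the dual-platform main regression pack (not started)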
## `S4`
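- [ ] `S4-01` Implement full normalization and three-level aggregation (in progress)
- [ ] `S4-02` Implement AI structured report generation and versioning rules (in progress)
- [ ] `S4-03` Implement the report page, evidence drawer, and quality flags (in progress: the report page, quality flags, and evidence index have landed; the evidence drawer is pending)
- [x] `S4-04` Implement the task history page, version switching, and the deletion entry point
- [ ] `S4-05` Implement retention, the deletion API, and the linked cleanup chain (in progress: the deletion API and the 30/90-day local cleanup job have landed; object storage linkage is pending)
- [ ] `S4-06` Implement full observability and audit logging (in progress)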
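
`S4-05` mentions a 30/90-day local cleanup job without saying which records each window covers. The sketch below is only an illustration of a two-window retention sweep: the split (30 days for raw capture artifacts, 90 days for reports) is an assumption, and `StoredRecord`, `RETENTION_DAYS`, and `selectExpired` are hypothetical names that do not appear in this commit.

```typescript
type RecordKind = "artifact" | "report";

interface StoredRecord {
  id: string;
  kind: RecordKind;
  createdAt: string; // ISO 8601 timestamp
}

// Assumption: the source gives only the numbers 30 and 90, not this mapping.
const RETENTION_DAYS: Record<RecordKind, number> = {
  artifact: 30,
  report: 90
};

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the records whose age exceeds the window for their kind.
// Records with an unparseable timestamp yield NaN and are kept, not deleted.
function selectExpired(records: StoredRecord[], now: Date): StoredRecord[] {
  return records.filter((record) => {
    const ageMs = now.getTime() - Date.parse(record.createdAt);
    return ageMs > RETENTION_DAYS[record.kind] * DAY_MS;
  });
}
```

Keeping selection separate from deletion makes the job easy to dry-run: the caller can log the selected ids first, then hand the same list to the local store and, once that linkage exists, to object storage.
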
## `S5`
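- [ ] `S5-01` Stabilize platform-level targeted retry (in progress)
- [ ] `S5-02` Optimize performance and cost (not started)
- [ ] `S5-03` Run UAT and the trial-run task set (not started)
- [ ] `S5-04` Deliver the deployment, on-call, troubleshooting, and hotfix runbooks (not started)
- [ ] `S5-05` Final acceptance and documentation sync wrap-up (not started)
## Cross-cutting tasks
- [ ] `X-01` Sync upstream and downstream documentation changes (in progress)
- [ ] `X-02` Security and compliance review (not started)
- [ ] `X-03` Test asset maintenance (in progress)
- [ ] `X-04` Design consistency and accessibility review (in progress)
- [ ] `X-05` Observability metrics retrospective (not started)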


@@ -0,0 +1,372 @@
import {
parseJdDetailApiResponse,
parseJdReviewsApiResponse,
parseJdSearchApiResponse,
parseJdSearchHtml
} from "./parsers";
import type {
JdDetailPreviewResult,
JdLiveService,
JdLiveSessionInput,
JdLiveSessionSummary,
JdReviewsPreviewResult,
JdSearchMode,
JdSearchPreviewResult,
JdTemplateSummary
} from "./types";
import { firstString, readQueryBody, withUpdatedQueryBody } from "./utils";
const DEFAULT_JD_USER_AGENT =
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
"(KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36";
type StoredJdLiveSession = {
cookieHeader: string;
importedAt: string;
userAgent: string;
searchApiTemplateUrl?: string | undefined;
detailTemplateUrl?: string | undefined;
reviewsTemplateUrl?: string | undefined;
searchReferer?: string | undefined;
detailReferer?: string | undefined;
};
class JdLiveError extends Error {
constructor(
message: string,
readonly statusCode: number = 400
) {
super(message);
this.name = "JdLiveError";
}
}
function nowIso(): string {
return new Date().toISOString();
}
function readEnvSession(): StoredJdLiveSession | null {
const cookieHeader = process.env.JD_COOKIE_HEADER?.trim();
if (!cookieHeader) {
return null;
}
const searchApiTemplateUrl = process.env.JD_SEARCH_API_TEMPLATE_URL?.trim();
const detailTemplateUrl = process.env.JD_DETAIL_TEMPLATE_URL?.trim();
const reviewsTemplateUrl = process.env.JD_REVIEWS_TEMPLATE_URL?.trim();
const searchReferer = process.env.JD_SEARCH_REFERER?.trim();
const detailReferer = process.env.JD_DETAIL_REFERER?.trim();
return {
cookieHeader,
importedAt: nowIso(),
userAgent: process.env.JD_USER_AGENT?.trim() || DEFAULT_JD_USER_AGENT,
...(searchApiTemplateUrl ? { searchApiTemplateUrl } : {}),
...(detailTemplateUrl ? { detailTemplateUrl } : {}),
...(reviewsTemplateUrl ? { reviewsTemplateUrl } : {}),
...(searchReferer ? { searchReferer } : {}),
...(detailReferer ? { detailReferer } : {})
};
}
function requireNonEmptyCookie(cookieHeader: string): string {
const normalized = cookieHeader.trim();
if (!normalized) {
throw new JdLiveError("cookieHeader is required for JD live requests.");
}
return normalized;
}
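function extractTemplateSkuId(templateUrl: string | undefined): string | undefined {
  if (!templateUrl) {
    return undefined;
  }
  try {
    // readQueryBody parses the JSON carried in the template's `body` query parameter.
    const body = readQueryBody(new URL(templateUrl));
    return firstString(body?.skuId, body?.sku) ?? undefined;
  } catch {
    // `new URL` throws on a malformed template; treat it as not bound to any sku
    // so that getSessionSummary() cannot fail with a raw TypeError.
    return undefined;
  }
}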
function buildTemplateSummary(templateUrl: string | undefined): JdTemplateSummary {
const skuId = extractTemplateSkuId(templateUrl);
return {
available: Boolean(templateUrl),
...(skuId ? { skuId } : {})
};
}
function templateMatchesQuery(
  templateUrl: string | undefined,
  query: string
): boolean {
  if (!templateUrl) {
    return false;
  }
  try {
    const templateKeyword = new URL(templateUrl).searchParams.get("keyword");
    return Boolean(templateKeyword && templateKeyword === query);
  } catch {
    // A malformed template URL cannot match any query, so the caller falls back to HTML mode.
    return false;
  }
}
async function fetchTextOrThrow(
url: string,
init: RequestInit,
sessionExpiredMessage: string
): Promise<{ finalUrl: string; text: string }> {
let response: Response;
try {
response = await fetch(url, {
...init,
redirect: "follow"
});
} catch (error) {
throw new JdLiveError(
`JD live request failed before receiving a response: ${
error instanceof Error ? error.message : "unknown error"
}`,
502
);
}
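const text = await response.text();
// Heuristic: a redirect to passport.jd.com, or a body that mentions it, is treated as an
// expired session. This also matches any page that merely links to that host.
if (response.url.includes("passport.jd.com") || text.includes("passport.jd.com")) {
throw new JdLiveError(sessionExpiredMessage, 409);
}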
if (!response.ok) {
throw new JdLiveError(
`JD live request failed with status ${response.status}.`,
502
);
}
return {
finalUrl: response.url,
text
};
}
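export function isJdLiveError(error: unknown): error is Error & { statusCode: number } {
  // Check the field's type as well as its presence so the guard matches its declared return type.
  return error instanceof Error && typeof (error as { statusCode?: unknown }).statusCode === "number";
}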
export class JdLiveSessionService implements JdLiveService {
private session: StoredJdLiveSession | null = readEnvSession();
getSessionSummary(): JdLiveSessionSummary {
return {
configured: Boolean(this.session),
hasCookie: Boolean(this.session?.cookieHeader),
...(this.session?.importedAt ? { importedAt: this.session.importedAt } : {}),
...(this.session?.userAgent ? { userAgent: this.session.userAgent } : {}),
searchApiTemplate: buildTemplateSummary(this.session?.searchApiTemplateUrl),
detailTemplate: buildTemplateSummary(this.session?.detailTemplateUrl),
reviewsTemplate: buildTemplateSummary(this.session?.reviewsTemplateUrl)
};
}
importSession(input: JdLiveSessionInput): JdLiveSessionSummary {
const searchApiTemplateUrl = input.searchApiTemplateUrl?.trim();
const detailTemplateUrl = input.detailTemplateUrl?.trim();
const reviewsTemplateUrl = input.reviewsTemplateUrl?.trim();
const searchReferer = input.searchReferer?.trim();
const detailReferer = input.detailReferer?.trim();
this.session = {
cookieHeader: requireNonEmptyCookie(input.cookieHeader),
importedAt: nowIso(),
userAgent: input.userAgent?.trim() || DEFAULT_JD_USER_AGENT,
...(searchApiTemplateUrl ? { searchApiTemplateUrl } : {}),
...(detailTemplateUrl ? { detailTemplateUrl } : {}),
...(reviewsTemplateUrl ? { reviewsTemplateUrl } : {}),
...(searchReferer ? { searchReferer } : {}),
...(detailReferer ? { detailReferer } : {})
};
return this.getSessionSummary();
}
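  clearSession(): void {
    // Drops the imported session and falls back to the environment-provided one, if any;
    // with JD_COOKIE_HEADER set, the service therefore stays configured after a clear.
    this.session = readEnvSession();
  }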
async previewSearch(
query: string,
mode?: JdSearchMode
): Promise<JdSearchPreviewResult> {
const session = this.requireSession();
const normalizedQuery = query.trim();
if (!normalizedQuery) {
throw new JdLiveError("query is required for JD live search preview.");
}
const resolvedMode =
mode ??
(templateMatchesQuery(session.searchApiTemplateUrl, normalizedQuery) ? "api" : "html");
if (resolvedMode === "api") {
if (!session.searchApiTemplateUrl) {
throw new JdLiveError(
"JD search API template is missing. Import a fresh search request URL or use mode=html."
);
}
const templateUrl = new URL(session.searchApiTemplateUrl);
const templateKeyword = templateUrl.searchParams.get("keyword");
if (templateKeyword && templateKeyword !== normalizedQuery) {
throw new JdLiveError(
`Imported search API template is locked to query "${templateKeyword}". ` +
"Capture a fresh request for the target query or use mode=html."
);
}
const response = await fetchTextOrThrow(
session.searchApiTemplateUrl,
{
headers: {
Accept: "application/json, text/plain, */*",
Cookie: session.cookieHeader,
Referer:
session.searchReferer ??
`https://search.jd.com/Search?keyword=${encodeURIComponent(normalizedQuery)}`,
"User-Agent": session.userAgent
}
},
"JD search session appears invalid. Re-login in the browser and re-import the cookie/header."
);
const candidates = parseJdSearchApiResponse(normalizedQuery, { text: response.text });
return {
query: normalizedQuery,
source: "api",
candidateCount: candidates.length,
candidates
};
}
const searchUrl = `https://search.jd.com/Search?keyword=${encodeURIComponent(normalizedQuery)}`;
const response = await fetchTextOrThrow(
searchUrl,
{
headers: {
Accept:
"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language": "zh-CN,zh;q=0.9",
Cookie: session.cookieHeader,
Referer: session.searchReferer ?? "https://www.jd.com/",
"User-Agent": session.userAgent
}
},
"JD search session appears invalid. Re-login in the browser and re-import the cookie/header."
);
const candidates = parseJdSearchHtml(normalizedQuery, response.text);
return {
query: normalizedQuery,
source: "html",
candidateCount: candidates.length,
candidates
};
}
async previewDetail(skuId: string): Promise<JdDetailPreviewResult> {
const session = this.requireSession();
const normalizedSkuId = skuId.trim();
if (!normalizedSkuId) {
throw new JdLiveError("skuId is required for JD detail preview.");
}
if (!session.detailTemplateUrl) {
throw new JdLiveError(
"JD detail template is missing. Capture a fresh pc_detailpage_wareBusiness request and import it first."
);
}
const templateSkuId = extractTemplateSkuId(session.detailTemplateUrl);
if (templateSkuId && templateSkuId !== normalizedSkuId) {
throw new JdLiveError(
`Imported detail template is bound to sku ${templateSkuId}. Open the matching JD item page and capture a fresh request for sku ${normalizedSkuId}.`
);
}
const response = await fetchTextOrThrow(
session.detailTemplateUrl,
{
headers: {
Accept: "application/json, text/plain, */*",
Cookie: session.cookieHeader,
Referer: session.detailReferer ?? `https://item.jd.com/${normalizedSkuId}.html`,
"User-Agent": session.userAgent
}
},
"JD detail session appears invalid. Re-login in the browser and re-import the cookie/header."
);
return {
skuId: normalizedSkuId,
source: "api",
detail: parseJdDetailApiResponse(normalizedSkuId, { text: response.text })
};
}
async previewReviews(
skuId: string,
commentCount = 5
): Promise<JdReviewsPreviewResult> {
const session = this.requireSession();
const normalizedSkuId = skuId.trim();
if (!normalizedSkuId) {
throw new JdLiveError("skuId is required for JD reviews preview.");
}
if (!session.reviewsTemplateUrl) {
throw new JdLiveError(
"JD reviews template is missing. Capture a fresh getLegoWareDetailComment request and import it first."
);
}
const templateSkuId = extractTemplateSkuId(session.reviewsTemplateUrl);
if (templateSkuId && templateSkuId !== normalizedSkuId) {
throw new JdLiveError(
`Imported reviews template is bound to sku ${templateSkuId}. Open the matching JD item page and capture a fresh request for sku ${normalizedSkuId}.`
);
}
const templateUrl = new URL(session.reviewsTemplateUrl);
const requestUrl = withUpdatedQueryBody(templateUrl, (body) => ({
...body,
commentNum: commentCount
}));
const response = await fetchTextOrThrow(
requestUrl,
{
headers: {
Accept: "application/json, text/plain, */*",
Cookie: session.cookieHeader,
Referer: session.detailReferer ?? `https://item.jd.com/${normalizedSkuId}.html`,
"User-Agent": session.userAgent
}
},
"JD reviews session appears invalid. Re-login in the browser and re-import the cookie/header."
);
return {
skuId: normalizedSkuId,
source: "api",
reviews: parseJdReviewsApiResponse(normalizedSkuId, { text: response.text })
};
}
private requireSession(): StoredJdLiveSession {
if (!this.session?.cookieHeader) {
throw new JdLiveError(
"JD live session is not configured. Import a browser cookie/header first."
);
}
return this.session;
}
}
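
The mode selection at the top of `previewSearch` can be read in isolation: an explicit `mode` wins; otherwise `api` is chosen only when the imported template's `keyword` parameter equals the query, and `html` is the fallback. The following is a minimal self-contained sketch of that rule. `resolveSearchMode` is a hypothetical name, and `api.example.test` is a placeholder host, not a real JD endpoint.

```typescript
type SearchMode = "html" | "api";

function resolveSearchMode(
  templateUrl: string | undefined,
  query: string,
  explicitMode?: SearchMode
): SearchMode {
  // An explicit mode from the caller always wins.
  if (explicitMode) {
    return explicitMode;
  }
  if (!templateUrl) {
    return "html";
  }
  let keyword: string | null;
  try {
    // searchParams.get() returns the percent-decoded value, or null when absent.
    keyword = new URL(templateUrl).searchParams.get("keyword");
  } catch {
    // In this sketch a malformed template URL falls back to HTML mode.
    return "html";
  }
  // The captured request is only reusable for the exact query it was captured with.
  return keyword && keyword === query ? "api" : "html";
}

const capturedTemplate = "https://api.example.test/search?keyword=iPhone%2015&page=1";
const sameQueryMode = resolveSearchMode(capturedTemplate, "iPhone 15");
const otherQueryMode = resolveSearchMode(capturedTemplate, "iPad");
```

Locking the template to its captured keyword is a conservative choice: the sketch never rewrites the captured URL, it only decides whether replaying it unchanged is valid for the requested query.
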


@@ -0,0 +1,151 @@
import { readFileSync } from "node:fs";
import { describe, expect, it } from "vitest";
import {
parseJdDetailApiResponse,
parseJdReviewsApiResponse,
parseJdSearchApiResponse,
parseJdSearchHtml
} from "./parsers";
function readFixture(path: string): unknown {
return JSON.parse(readFileSync(new URL(path, import.meta.url), "utf8")) as unknown;
}
describe("JD parsers", () => {
it("parses real JD search API fixtures into candidate records", () => {
const fixture = readFixture("../../../../../jd-search-json-shape.json") as {
firstWare: Record<string, unknown>;
};
const candidates = parseJdSearchApiResponse("iPhone 15", {
data: {
wareList: [fixture.firstWare]
}
});
expect(candidates).toHaveLength(1);
expect(candidates[0]).toMatchObject({
platform: "jd",
title: expect.stringContaining("iPhone"),
priceLabel: expect.stringMatching(/^¥/),
storeName: expect.any(String)
});
expect(candidates[0]?.title).not.toContain("<font");
});
it("parses captured JD search card blocks from HTML markup", () => {
const fixture = readFixture("../../../../../jd-search-card-blocks.json") as Array<{
sku: string;
html: string;
}>;
const html = fixture.map((item) => item.html).join("");
const candidates = parseJdSearchHtml("iPhone 15", html);
expect(candidates).toHaveLength(3);
expect(candidates).toEqual(
expect.arrayContaining([
expect.objectContaining({
candidateId: "jd-100068388533",
storeName: "Apple产品京东自营旗舰店",
productUrl: "https://item.jd.com/100068388533.html",
salesHint: expect.stringContaining("已售500万+")
})
])
);
});
it("parses JD detail payloads from raw API objects", () => {
const detail = parseJdDetailApiResponse("100068388533", {
wareInfo: {
wareInfoMap: {
sku_status: 1
}
},
skuHeadVO: {
skuTitle: "Apple/苹果 iPhone 15 (A3092) 128GB 绿色 支持移动联通电信5G 双卡双待手机"
},
price: {
p: "4398.00",
op: "4599.00"
},
itemShopInfo: {
shopName: "Apple产品京东自营旗舰店"
},
crumbInfoVO: {
crumbs: [
{ text: "手机通讯" },
{ text: "手机" },
{ text: "Apple" }
]
},
stockInfo: {
stockStateDesc: "<strong>有货</strong>仅剩318件"
},
mainImageVO: {
mainImageArea: {
imageUrl: "jfs/t1/example.jpg"
}
}
});
expect(detail).toMatchObject({
skuId: "100068388533",
title: "Apple/苹果 iPhone 15 (A3092) 128GB 绿色 支持移动联通电信5G 双卡双待手机",
price: "4398.00",
originalPrice: "4599.00",
shopName: "Apple产品京东自营旗舰店",
categoryPath: ["手机通讯", "手机", "Apple"],
// stripTags replaces each tag with a space, so one space remains at the </strong> boundary.
stockState: "有货 仅剩318件",
mainImage: "https://img14.360buyimg.com/n2/jfs/t1/example.jpg"
});
});
it("parses JD reviews payloads from raw API objects", () => {
const reviews = parseJdReviewsApiResponse("100068388533", {
allCnt: "10000",
goodRate: "95%",
pictureCnt: "500",
tagStatisticsinfoList: [
{
tagId: "tag-1",
name: "拍照效果超清晰",
count: "9313"
},
{
tagId: "tag-2",
name: "手感很舒服",
count: "8628"
}
],
commentInfoList: [
{
commentId: "103893190162198263",
commentData: "蓝色 iPhone 15 颜值很高。",
commentScore: 5,
commentDate: "2026-04-02 19:23:16",
userLevelName: "PLUS会员"
}
]
});
expect(reviews).toMatchObject({
skuId: "100068388533",
total: "10000",
goodRate: "95%",
pictureCount: "500"
});
expect(reviews.tags[0]).toMatchObject({
tagId: "tag-1",
name: "拍照效果超清晰",
count: "9313"
});
expect(reviews.comments[0]).toMatchObject({
id: "103893190162198263",
content: "蓝色 iPhone 15 颜值很高。",
score: "5",
userLevelName: "PLUS会员"
});
});
});


@@ -0,0 +1,361 @@
import type { CandidateRecord } from "@cross-ai/domain";
import type {
JdProductDetailSnapshot,
JdProductReviewsSnapshot,
JdReviewCommentSnapshot,
JdReviewTagSnapshot
} from "./types";
import {
absolutizeUrl,
asArray,
asRecord,
firstString,
normalizeWhitespace,
stringFrom,
uniqueStrings
} from "./utils";
function unwrapCapturedPayload(input: unknown): unknown {
const record = asRecord(input);
const text = stringFrom(record?.text);
if (!text) {
return input;
}
try {
return JSON.parse(text) as unknown;
} catch {
return input;
}
}
function extractSpecLabel(title: string): string {
const storageMatch = title.match(/\b\d+(?:GB|TB)\b/i);
if (storageMatch) {
return storageMatch[0].toUpperCase();
}
const colorMatch = title.match(/(黑色|白色|蓝色|粉色|绿色|黄色|紫色|原色|钛金属)/);
if (colorMatch) {
return colorMatch[0];
}
return "标准版";
}
function normalizePriceText(value: string | null): { value: number; label: string } | null {
if (!value) {
return null;
}
const stripped = value.replace(/[^\d.]/g, "");
if (!stripped) {
return null;
}
const parsed = Number.parseFloat(stripped);
if (Number.isNaN(parsed)) {
return null;
}
return {
value: parsed,
label: `¥${parsed.toString()}`
};
}
function normalizeInlineText(value: string | null): string | null {
if (!value) {
return null;
}
const normalized = normalizeWhitespace(value).replace(/\s+([,。;:、])/g, "$1");
return normalized || null;
}
function matchFirst(value: string, pattern: RegExp): string | null {
const match = pattern.exec(value);
return match?.[1] ? normalizeWhitespace(match[1]) : null;
}
function extractSearchCardBlocks(html: string): string[] {
const matches = html.matchAll(
/<div[^>]*plugin_goodsCardWrapper[^>]*data-sku="[^"]+"[\s\S]*?(?=<div[^>]*plugin_goodsCardWrapper[^>]*data-sku="|$)/g
);
return Array.from(matches, (match) => match[0]);
}
function parseSearchCardBlock(block: string): CandidateRecord | null {
const sku = matchFirst(block, /data-sku="([^"]+)"/);
const title =
matchFirst(
block,
/_goods_title_container_[^"]*"[\s\S]*?<span[^>]*title="([^"]+)"/
) ?? matchFirst(block, /title="([^"]+)"/);
const priceText = matchFirst(block, /<span class="[^"]*_price_[^"]*"[^>]*>([\s\S]*?)<\/span>/);
const price = normalizePriceText(priceText);
const storeName = matchFirst(
block,
/<span class="[^"]*_name_[^"]*"[\s\S]*?<span>([\s\S]*?)<\/span>/
);
const imageUrl = absolutizeUrl(
matchFirst(block, /<img[^>]+src="([^"]+)"[^>]*alt="">/) ??
matchFirst(block, /<img[^>]+data-src="([^"]+)"/)
);
const soldHint = matchFirst(block, /title="(已售[^"]+)"/);
const trendHint = matchFirst(block, /title="(30天[^"]+)"/);
const featureMatches = Array.from(
block.matchAll(/<span>([^<]+)<\/span>/g),
(match) => normalizeWhitespace(match[1] ?? "")
).filter(Boolean);
const highlights = uniqueStrings(featureMatches).slice(0, 4);
if (!sku || !title || !price || !storeName) {
return null;
}
return {
candidateId: `jd-${sku}`,
platform: "jd",
title,
price: price.value,
priceLabel: price.label,
storeName,
productUrl: `https://item.jd.com/${sku}.html`,
imageUrl: imageUrl ?? "https://placehold.co/640x480?text=JD",
salesHint: uniqueStrings([soldHint, trendHint]).join(" · ") || "暂无销量信息",
specLabel: extractSpecLabel(title),
highlights: highlights.length > 0 ? highlights : ["京东商品卡片已命中"]
};
}
export function parseJdSearchHtml(query: string, html: string): CandidateRecord[] {
const blocks = extractSearchCardBlocks(html);
const seen = new Set<string>();
const candidates: CandidateRecord[] = [];
for (const block of blocks) {
const candidate = parseSearchCardBlock(block);
if (!candidate || seen.has(candidate.candidateId)) {
continue;
}
seen.add(candidate.candidateId);
candidates.push(candidate);
}
if (candidates.length > 0) {
return candidates;
}
if (html.includes("暂无") || html.includes("很抱歉没有找到")) {
return [];
}
return [
{
candidateId: `jd-fallback-${encodeURIComponent(query)}`,
platform: "jd",
title: query,
price: 0,
priceLabel: "¥0",
storeName: "京东",
productUrl: `https://search.jd.com/Search?keyword=${encodeURIComponent(query)}`,
imageUrl: "https://placehold.co/640x480?text=JD",
salesHint: "页面已返回,但未解析出稳定商品卡片",
specLabel: "待确认",
highlights: ["需要刷新搜索模板或调整解析器"]
}
];
}
export function parseJdSearchApiResponse(query: string, input: unknown): CandidateRecord[] {
const payload = asRecord(unwrapCapturedPayload(input));
const data = asRecord(payload?.data);
const wareList = asArray(data?.wareList);
const seen = new Set<string>();
const candidates: CandidateRecord[] = [];
for (const item of wareList) {
const ware = asRecord(item);
const sku = stringFrom(ware?.skuId);
const title = normalizeInlineText(firstString(ware?.wareName, ware?.wname));
const priceText = firstString(
ware?.jdPrice,
asRecord(ware?.finalPrice)?.estimatedPrice,
ware?.price
);
const price = normalizePriceText(priceText);
if (!sku || !title || !price || seen.has(sku)) {
continue;
}
const storeName =
normalizeInlineText(firstString(ware?.shopName, ware?.storeName, "京东店铺")) ??
"京东店铺";
const totalSales = stringFrom(ware?.totalSales);
const commentFuzzy = firstString(ware?.commentFuzzy, ware?.comment);
const highlights = uniqueStrings([
stringFrom(ware?.selfSupport) === "1" ? "京东自营" : null,
stringFrom(ware?.good),
stringFrom(ware?.averageScore) ? `评分 ${stringFrom(ware?.averageScore)}` : null
]);
seen.add(sku);
candidates.push({
candidateId: `jd-${sku}`,
platform: "jd",
title,
price: price.value,
priceLabel: price.label,
storeName,
productUrl: `https://item.jd.com/${sku}.html`,
imageUrl:
absolutizeUrl(stringFrom(ware?.imageurl)) ??
"https://placehold.co/640x480?text=JD",
salesHint:
uniqueStrings([
totalSales ? `已售${totalSales}` : null,
commentFuzzy ? `累计评价 ${commentFuzzy}` : null
]).join(" · ") || "暂无销量信息",
specLabel: extractSpecLabel(title),
highlights: highlights.length > 0 ? highlights : ["京东 API 返回候选"]
});
}
return candidates;
}
export function parseJdDetailApiResponse(
skuId: string,
input: unknown
): JdProductDetailSnapshot {
const payload = asRecord(unwrapCapturedPayload(input));
const wareInfo = asRecord(payload?.wareInfo);
const wareInfoMap = asRecord(wareInfo?.wareInfoMap);
const price = asRecord(payload?.price);
const finalPrice = asRecord(price?.finalPrice);
const itemShopInfo = asRecord(payload?.itemShopInfo);
const crumbInfo = asRecord(payload?.crumbInfoVO);
const stockInfo = asRecord(payload?.stockInfo) ?? asRecord(payload?.stockVO);
const mainImage = asRecord(payload?.mainImageVO);
const mainImageArea = asRecord(mainImage?.mainImageArea);
const skuHead = asRecord(payload?.skuHeadVO);
const categoryPath = asArray(crumbInfo?.crumbs)
.map((item) => {
const crumb = asRecord(item);
return firstString(crumb?.text, crumb?.name);
})
.filter((item): item is string => Boolean(item));
return {
skuId: firstString(wareInfo?.skuId, wareInfoMap?.skuId, skuId) ?? skuId,
title: normalizeInlineText(
firstString(
skuHead?.skuTitle,
wareInfo?.wname,
wareInfo?.name,
wareInfoMap?.wname,
wareInfoMap?.name,
skuHead?.seoTitle
)
),
price: firstString(price?.p, price?.price),
originalPrice: firstString(price?.op, price?.originalPrice),
estimatedPrice: firstString(finalPrice?.estimatedPrice, finalPrice?.price),
shopName: normalizeInlineText(
firstString(itemShopInfo?.shopName, itemShopInfo?.venderName)
),
vendorId: firstString(itemShopInfo?.venderId, itemShopInfo?.shopId),
categoryPath,
stockState: normalizeInlineText(
firstString(
stockInfo?.stockStateDesc,
stockInfo?.stockDesc,
stockInfo?.stockStateName,
stockInfo?.stockState
)
),
mainImage: absolutizeUrl(
firstString(
mainImageArea?.imageUrl,
mainImage?.mainImageUrl,
mainImage?.mainImage,
asRecord(asArray(mainImage?.carouselArea)[0])?.imageUrl,
wareInfo?.imageUrl,
wareInfoMap?.imageUrl
)
),
averageScore: firstString(
wareInfo?.averageScore,
wareInfo?.score,
wareInfoMap?.averageScore,
wareInfoMap?.score
)
};
}
function parseReviewTag(input: unknown): JdReviewTagSnapshot | null {
const tag = asRecord(input);
const name = firstString(tag?.name, tag?.tagName);
if (!name) {
return null;
}
return {
tagId: firstString(tag?.tagId, tag?.id),
name,
count: firstString(tag?.count, tag?.num)
};
}
function parseReviewComment(input: unknown): JdReviewCommentSnapshot | null {
const comment = asRecord(input);
const content = normalizeInlineText(
firstString(comment?.content, comment?.commentData, comment?.tagCommentContent)
);
const id = firstString(comment?.id, comment?.commentId);
if (!content || !id) {
return null;
}
return {
id,
content,
score: firstString(comment?.score, comment?.commentScore),
creationTime: firstString(
comment?.creationTime,
comment?.creationDate,
comment?.commentDate
),
userLevelName: normalizeInlineText(
firstString(comment?.userLevelName, comment?.userClientShow)
)
};
}
export function parseJdReviewsApiResponse(
skuId: string,
input: unknown
): JdProductReviewsSnapshot {
const payload = asRecord(unwrapCapturedPayload(input));
const tags = asArray(payload?.tagStatisticsinfoList)
.map((tag) => parseReviewTag(tag))
.filter((tag): tag is JdReviewTagSnapshot => Boolean(tag));
const comments = asArray(payload?.commentInfoList)
.map((comment) => parseReviewComment(comment))
.filter((comment): comment is JdReviewCommentSnapshot => Boolean(comment));
return {
skuId,
total: firstString(payload?.allCnt, payload?.allCntStr, payload?.goodCnt),
goodRate: firstString(payload?.goodRate, payload?.goodRateShow),
pictureCount: firstString(payload?.pictureCnt, payload?.showPicCnt),
tags,
comments
};
}
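
`extractSearchCardBlocks` splits the page with a lazy match that stops at a lookahead for the next card opener (or at end of input), so each block runs from one `data-sku` wrapper to just before the next. The sketch below replays that pattern on made-up markup; the sample HTML is hypothetical and far simpler than a real JD results page, and `splitCards` is an illustrative name.

```typescript
// Same pattern as extractSearchCardBlocks: lazy body, lookahead for the next opener or end.
const CARD_PATTERN =
  /<div[^>]*plugin_goodsCardWrapper[^>]*data-sku="[^"]+"[\s\S]*?(?=<div[^>]*plugin_goodsCardWrapper[^>]*data-sku="|$)/g;

interface CardBlock {
  sku: string;
  block: string;
}

function splitCards(html: string): CardBlock[] {
  const cards: CardBlock[] = [];
  const blocks = Array.from(html.matchAll(CARD_PATTERN), (match) => match[0]);
  for (const block of blocks) {
    // The first data-sku in a block belongs to the wrapper that opened it.
    const sku = /data-sku="([^"]+)"/.exec(block)?.[1];
    if (sku) {
      cards.push({ sku, block });
    }
  }
  return cards;
}

// Hypothetical two-card page.
const sampleHtml =
  '<div class="plugin_goodsCardWrapper" data-sku="1001"><span>Phone A</span></div>' +
  '<div class="plugin_goodsCardWrapper" data-sku="1002"><span>Phone B</span></div>';

const cards = splitCards(sampleHtml);
```

Because the lookahead does not consume the next opener, consecutive matches tile the input without overlap, which is what lets the per-block field regexes in `parseSearchCardBlock` stay local to a single card.
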


@@ -0,0 +1,93 @@
import type { CandidateRecord } from "@cross-ai/domain";
export type JdSearchMode = "html" | "api";
export interface JdTemplateSummary {
available: boolean;
skuId?: string | undefined;
}
export interface JdLiveSessionInput {
cookieHeader: string;
userAgent?: string | undefined;
searchApiTemplateUrl?: string | undefined;
detailTemplateUrl?: string | undefined;
reviewsTemplateUrl?: string | undefined;
searchReferer?: string | undefined;
detailReferer?: string | undefined;
}
export interface JdLiveSessionSummary {
configured: boolean;
importedAt?: string | undefined;
hasCookie: boolean;
userAgent?: string | undefined;
searchApiTemplate: JdTemplateSummary;
detailTemplate: JdTemplateSummary;
reviewsTemplate: JdTemplateSummary;
}
export interface JdSearchPreviewResult {
query: string;
source: JdSearchMode;
candidateCount: number;
candidates: CandidateRecord[];
}
export interface JdProductDetailSnapshot {
skuId: string;
title: string | null;
price: string | null;
originalPrice: string | null;
estimatedPrice: string | null;
shopName: string | null;
vendorId: string | null;
categoryPath: string[];
stockState: string | null;
mainImage: string | null;
averageScore: string | null;
}
export interface JdReviewTagSnapshot {
tagId: string | null;
name: string;
count: string | null;
}
export interface JdReviewCommentSnapshot {
id: string;
content: string;
score: string | null;
creationTime: string | null;
userLevelName: string | null;
}
export interface JdProductReviewsSnapshot {
skuId: string;
total: string | null;
goodRate: string | null;
pictureCount: string | null;
tags: JdReviewTagSnapshot[];
comments: JdReviewCommentSnapshot[];
}
export interface JdDetailPreviewResult {
skuId: string;
source: "api";
detail: JdProductDetailSnapshot;
}
export interface JdReviewsPreviewResult {
skuId: string;
source: "api";
reviews: JdProductReviewsSnapshot;
}
export interface JdLiveService {
getSessionSummary(): JdLiveSessionSummary;
importSession(input: JdLiveSessionInput): JdLiveSessionSummary;
clearSession(): void;
previewSearch(query: string, mode?: JdSearchMode): Promise<JdSearchPreviewResult>;
previewDetail(skuId: string): Promise<JdDetailPreviewResult>;
previewReviews(skuId: string, commentCount?: number): Promise<JdReviewsPreviewResult>;
}


@@ -0,0 +1,115 @@
export function asRecord(value: unknown): Record<string, unknown> | null {
if (!value || typeof value !== "object" || Array.isArray(value)) {
return null;
}
return value as Record<string, unknown>;
}
export function asArray(value: unknown): unknown[] {
return Array.isArray(value) ? value : [];
}
export function stringFrom(value: unknown): string | null {
if (typeof value === "string") {
const trimmed = value.trim();
return trimmed.length > 0 ? trimmed : null;
}
if (typeof value === "number" || typeof value === "boolean") {
return String(value);
}
return null;
}
export function firstString(...values: unknown[]): string | null {
for (const value of values) {
const resolved = stringFrom(value);
if (resolved) {
return resolved;
}
}
return null;
}
export function absolutizeUrl(value: string | null | undefined): string | null {
if (!value) {
return null;
}
if (value.startsWith("http://") || value.startsWith("https://")) {
return value;
}
if (value.startsWith("//")) {
return `https:${value}`;
}
if (value.startsWith("/")) {
return `https://www.jd.com${value}`;
}
if (value.startsWith("jfs/") || value.startsWith("t1/") || value.startsWith("t202")) {
return `https://img14.360buyimg.com/n2/${value}`;
}
return `https://${value}`;
}
export function decodeHtmlEntities(value: string): string {
  // Decode &amp; last: decoding it first would turn "&amp;lt;" into "&lt;" and then into "<".
  return value
    .replace(/&nbsp;/g, " ")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&");
}
export function stripTags(value: string): string {
return value.replace(/<[^>]+>/g, " ");
}
export function normalizeWhitespace(value: string): string {
return decodeHtmlEntities(stripTags(value)).replace(/\s+/g, " ").trim();
}
export function uniqueStrings(values: Array<string | null | undefined>): string[] {
return Array.from(
new Set(
values
.map((value) => value?.trim())
.filter((value): value is string => Boolean(value))
)
);
}
export function parseEmbeddedJson(value: string | null): Record<string, unknown> | null {
if (!value) {
return null;
}
try {
const parsed = JSON.parse(value) as unknown;
return asRecord(parsed);
} catch {
return null;
}
}
export function readQueryBody(url: URL): Record<string, unknown> | null {
return parseEmbeddedJson(url.searchParams.get("body"));
}
export function withUpdatedQueryBody(
url: URL,
updater: (body: Record<string, unknown>) => Record<string, unknown>
): string {
const body = readQueryBody(url) ?? {};
const next = updater(body);
const nextUrl = new URL(url.toString());
nextUrl.searchParams.set("body", JSON.stringify(next));
return nextUrl.toString();
}
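
`readQueryBody` and `withUpdatedQueryBody` assume the captured request carries its parameters as a JSON string in a `body` query parameter. The sketch below restates that round trip on a placeholder URL to show that the other query parameters survive while one body field is overridden. `rewriteBody` is an illustrative name, and `api.example.test` and the field values are made up for the example.

```typescript
function rewriteBody(templateUrl: string, patch: Record<string, unknown>): string {
  const url = new URL(templateUrl);
  let body: Record<string, unknown> = {};
  try {
    const parsed: unknown = JSON.parse(url.searchParams.get("body") ?? "{}");
    if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
      body = parsed as Record<string, unknown>;
    }
  } catch {
    // A malformed body is treated as empty, mirroring `readQueryBody(url) ?? {}`.
  }
  // searchParams.set() re-encodes the JSON, so no manual escaping is needed.
  url.searchParams.set("body", JSON.stringify({ ...body, ...patch }));
  return url.toString();
}

// Hypothetical captured template with a JSON body parameter.
const template =
  "https://api.example.test/?functionId=getLegoWareDetailComment&body=" +
  encodeURIComponent(JSON.stringify({ sku: "100068388533", commentNum: 5 }));

const rewritten = new URL(rewriteBody(template, { commentNum: 10 }));
const rewrittenBody = JSON.parse(rewritten.searchParams.get("body") ?? "{}");
```

This is the shape of the override `previewReviews` applies to `commentNum`. Fields the patch does not name, such as the sku, are carried over from the captured request unchanged.
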


@@ -0,0 +1,288 @@
import { describe, expect, it } from "vitest";
import type {
JdDetailPreviewResult,
JdLiveService,
JdLiveSessionSummary,
JdReviewsPreviewResult,
JdSearchPreviewResult
} from "./platforms/jd/types";
import { createServer } from "./server";
function createJdLiveServiceStub(
overrides: Partial<JdLiveService> = {}
): JdLiveService {
let summary: JdLiveSessionSummary = {
configured: false,
hasCookie: false,
searchApiTemplate: { available: false },
detailTemplate: { available: false },
reviewsTemplate: { available: false }
};
return {
getSessionSummary() {
return overrides.getSessionSummary?.() ?? summary;
},
importSession(input) {
if (overrides.importSession) {
return overrides.importSession(input);
}
summary = {
configured: true,
importedAt: "2026-04-02T12:00:00.000Z",
hasCookie: true,
userAgent: input.userAgent ?? "stub-user-agent",
searchApiTemplate: { available: Boolean(input.searchApiTemplateUrl) },
detailTemplate: { available: Boolean(input.detailTemplateUrl) },
reviewsTemplate: { available: Boolean(input.reviewsTemplateUrl) }
};
return summary;
},
clearSession() {
if (overrides.clearSession) {
overrides.clearSession();
return;
}
summary = {
configured: false,
hasCookie: false,
searchApiTemplate: { available: false },
detailTemplate: { available: false },
reviewsTemplate: { available: false }
};
},
async previewSearch(query, mode) {
if (overrides.previewSearch) {
return overrides.previewSearch(query, mode);
}
const preview: JdSearchPreviewResult = {
query,
source: "api",
candidateCount: 1,
candidates: [
{
candidateId: "jd-100068388533",
platform: "jd",
title: "Apple iPhone 15",
price: 3898,
priceLabel: "CNY 3898",
storeName: "JD Self Operated",
productUrl: "https://item.jd.com/100068388533.html",
imageUrl: "https://img14.360buyimg.com/n2/jfs/t1/example.jpg",
salesHint: "sold 500+",
specLabel: "128GB",
highlights: ["A16"]
}
]
};
return preview;
},
async previewDetail(skuId) {
if (overrides.previewDetail) {
return overrides.previewDetail(skuId);
}
const preview: JdDetailPreviewResult = {
skuId,
source: "api",
detail: {
skuId,
title: "Apple iPhone 15",
price: "4398.00",
originalPrice: "4599.00",
estimatedPrice: "3898",
shopName: "JD Self Operated",
vendorId: null,
categoryPath: ["phones", "smartphones", "apple"],
stockState: "in stock",
mainImage: "https://img14.360buyimg.com/n2/jfs/t1/example.jpg",
averageScore: null
}
};
return preview;
},
async previewReviews(skuId, commentCount) {
if (overrides.previewReviews) {
return overrides.previewReviews(skuId, commentCount);
}
const preview: JdReviewsPreviewResult = {
skuId,
source: "api",
reviews: {
skuId,
total: "10000",
goodRate: "95%",
pictureCount: "500",
tags: [
{
tagId: "tag-1",
name: "clear camera",
count: "9313"
}
],
comments: [
{
id: "comment-1",
content: "smooth system and sharp photos",
score: "5",
creationTime: "2026-04-02 19:23:16",
userLevelName: "PLUS"
}
]
}
};
return preview;
}
};
}
describe("JD live server endpoints", () => {
it("imports and clears a JD live session through dedicated endpoints", async () => {
const jdLiveService = createJdLiveServiceStub();
const app = createServer({ jdLiveService });
await app.ready();
const importResponse = await app.inject({
method: "POST",
url: "/api/platforms/jd/live-session",
payload: {
cookieHeader: "thor=masked; pin=masked;",
searchApiTemplateUrl:
"https://api.m.jd.com/?functionId=pc_search_searchWare&body=%7B%22keyword%22:%22iphone%2015%22%7D",
detailTemplateUrl:
"https://api.m.jd.com/?functionId=pc_detailpage_wareBusiness&body=%7B%22skuId%22:%22100068388533%22%7D",
reviewsTemplateUrl:
"https://api.m.jd.com/?functionId=getLegoWareDetailComment&body=%7B%22sku%22:100068388533%7D"
}
});
expect(importResponse.statusCode).toBe(200);
expect(importResponse.json().session).toMatchObject({
configured: true,
hasCookie: true,
searchApiTemplate: {
available: true
},
detailTemplate: {
available: true
},
reviewsTemplate: {
available: true
}
});
const summaryResponse = await app.inject({
method: "GET",
url: "/api/platforms/jd/live-session"
});
expect(summaryResponse.statusCode).toBe(200);
expect(summaryResponse.json().session).toMatchObject({
configured: true,
hasCookie: true
});
const readinessResponse = await app.inject({
method: "GET",
url: "/api/platforms/readiness"
});
expect(
readinessResponse
.json()
.platforms.find((platform: { platform: string }) => platform.platform === "jd")
).toMatchObject({
platform: "jd",
ready: true,
status: "ready"
});
const clearResponse = await app.inject({
method: "DELETE",
url: "/api/platforms/jd/live-session"
});
expect(clearResponse.statusCode).toBe(204);
const clearedSummaryResponse = await app.inject({
method: "GET",
url: "/api/platforms/jd/live-session"
});
expect(clearedSummaryResponse.json().session).toMatchObject({
configured: false,
hasCookie: false
});
await app.close();
});
it("exposes JD live preview endpoints through the injected service", async () => {
const jdLiveService = createJdLiveServiceStub();
const app = createServer({ jdLiveService });
await app.ready();
const searchResponse = await app.inject({
method: "GET",
url: "/api/platforms/jd/live-search-preview?query=iPhone%2015"
});
expect(searchResponse.statusCode).toBe(200);
expect(searchResponse.json().preview).toMatchObject({
query: "iPhone 15",
source: "api",
candidateCount: 1
});
const detailResponse = await app.inject({
method: "GET",
url: "/api/platforms/jd/live-detail-preview?skuId=100068388533"
});
expect(detailResponse.statusCode).toBe(200);
expect(detailResponse.json().preview.detail).toMatchObject({
skuId: "100068388533",
shopName: "JD Self Operated"
});
const reviewsResponse = await app.inject({
method: "GET",
url: "/api/platforms/jd/live-reviews-preview?skuId=100068388533&commentCount=3"
});
expect(reviewsResponse.statusCode).toBe(200);
expect(reviewsResponse.json().preview.reviews).toMatchObject({
skuId: "100068388533",
goodRate: "95%"
});
await app.close();
});
it("surfaces JD live preview failures with service-provided status codes", async () => {
const jdLiveService = createJdLiveServiceStub({
async previewDetail() {
const error = new Error("Imported detail template is bound to another sku.") as Error & {
statusCode: number;
};
error.statusCode = 409;
throw error;
}
});
const app = createServer({ jdLiveService });
await app.ready();
const response = await app.inject({
method: "GET",
url: "/api/platforms/jd/live-detail-preview?skuId=100068388533"
});
expect(response.statusCode).toBe(409);
expect(response.json()).toMatchObject({
message: "Imported detail template is bound to another sku."
});
await app.close();
});
});


@@ -6,11 +6,14 @@ import type {
import cors from "@fastify/cors";
import Fastify from "fastify";
import { JdLiveSessionService, isJdLiveError } from "./platforms/jd/live-session";
import type { JdLiveService, JdSearchMode } from "./platforms/jd/types";
import { InMemoryTaskStore } from "./store";
export function createServer() {
export function createServer(options: { jdLiveService?: JdLiveService } = {}) {
const app = Fastify({ logger: false });
const store = new InMemoryTaskStore();
const jdLiveService = options.jdLiveService ?? new JdLiveSessionService();
app.register(cors, { origin: true });
@@ -54,11 +57,49 @@ export function createServer() {
};
});
app.get("/api/platforms/jd/live-session", async () => ({
session: jdLiveService.getSessionSummary()
}));
app.post<{
Body: {
cookieHeader: string;
userAgent?: string;
searchApiTemplateUrl?: string;
detailTemplateUrl?: string;
reviewsTemplateUrl?: string;
searchReferer?: string;
detailReferer?: string;
};
}>("/api/platforms/jd/live-session", async (request, reply) => {
try {
const session = jdLiveService.importSession(request.body);
store.preparePlatform("jd");
reply.code(200);
return { session };
} catch (error) {
reply.code(isJdLiveError(error) ? error.statusCode : 400);
return {
message: error instanceof Error ? error.message : "Invalid JD live session payload."
};
}
});
app.delete("/api/platforms/jd/live-session", async (_request, reply) => {
jdLiveService.clearSession();
store.clearPlatformSession("jd");
reply.code(204);
return null;
});
app.delete<{
Params: { platform: PlatformId };
}>("/api/sessions/:platform", async (request, reply) => {
try {
store.clearPlatformSession(request.params.platform);
if (request.params.platform === "jd") {
jdLiveService.clearSession();
}
reply.code(204);
return null;
} catch {
@@ -172,6 +213,72 @@ export function createServer() {
return { report };
});
app.get<{
Querystring: { query?: string; mode?: JdSearchMode };
}>("/api/platforms/jd/live-search-preview", async (request, reply) => {
try {
const query = request.query.query?.trim();
if (!query) {
reply.code(400);
return { message: "query is required." };
}
const preview = await jdLiveService.previewSearch(query, request.query.mode);
return { preview };
} catch (error) {
reply.code(isJdLiveError(error) ? error.statusCode : 502);
return {
message:
error instanceof Error ? error.message : "JD live search preview failed."
};
}
});
app.get<{
Querystring: { skuId?: string };
}>("/api/platforms/jd/live-detail-preview", async (request, reply) => {
try {
const skuId = request.query.skuId?.trim();
if (!skuId) {
reply.code(400);
return { message: "skuId is required." };
}
const preview = await jdLiveService.previewDetail(skuId);
return { preview };
} catch (error) {
reply.code(isJdLiveError(error) ? error.statusCode : 502);
return {
message:
error instanceof Error ? error.message : "JD live detail preview failed."
};
}
});
app.get<{
Querystring: { skuId?: string; commentCount?: string };
}>("/api/platforms/jd/live-reviews-preview", async (request, reply) => {
try {
const skuId = request.query.skuId?.trim();
if (!skuId) {
reply.code(400);
return { message: "skuId is required." };
}
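const parsedCommentCount = request.query.commentCount
? Number.parseInt(request.query.commentCount, 10)
: undefined;
// Drop a non-numeric commentCount instead of forwarding NaN to the service.
const commentCount = Number.isNaN(parsedCommentCount) ? undefined : parsedCommentCount;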
const preview = await jdLiveService.previewReviews(skuId, commentCount);
return { preview };
} catch (error) {
reply.code(isJdLiveError(error) ? error.statusCode : 502);
return {
message:
error instanceof Error ? error.message : "JD live reviews preview failed."
};
}
});
app.get("/api/history", async () => ({
tasks: store.listHistory()
}));


@@ -73,12 +73,15 @@
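4. At the same time, risk-control and parameter-initialization requests such as `cactus.jd.com/request_algo`, `jra.jd.com/jsTk.do`, and `sgm-w.jd.com/h5` can be observed.
5. The public information in `https://item.jd.com/robots.txt` is very limited, so it cannot be used to infer that the search or detail paths can be scraped anonymously in a stable way.
Reference: <https://item.jd.com/robots.txt>
6. In a logged-in retest on 2026-04-02, `https://search.jd.com/Search?keyword=iPhone%2015` returned a Vite shell-page HTML whose response no longer reliably contains product cards; replaying the `pc_search_searchWare` API for the same query reliably returned 30 candidates.
7. In the same retest, `pc_detailpage_wareBusiness` and `getLegoWareDetailComment` reliably returned price, store, stock, main image, tags, and review text under an authorized session.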
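Assessment:
1. JD's PC site has a clear API layer; scraping the browser DOM is not the only option.
1. JD's PC site has a clear API layer; scraping the browser DOM is not the only option. The search page HTML has clearly degraded into a front-end shell page, so stable DOM parsing can no longer be treated as the default main path.
2. However, API calls clearly depend on session and dynamic-parameter context, so they cannot be treated as stateless public APIs.
3. JD's non-browser route is feasible, but it must be built on a "session first, requests second" system.
3. The more stable default route for now should converge on "replaying the search/detail/reviews APIs under an authorized session", with the browser responsible for login, template refresh, and blocking recovery.
4. Therefore JD's non-browser route is feasible, but it must be built on a "session first, requests second" system.
### 3.3 Taobao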
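
The "session first, requests second" idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: `ImportedSession`, `withBody`, and `buildDetailReplay` are hypothetical names, and the only convention taken from this commit is that a captured template URL carries its JSON payload in the `body` query parameter. The sketch only builds the replay request; it does not send it.

```typescript
// Hypothetical types and helpers for illustration only.
type JsonRecord = Record<string, unknown>;

interface ImportedSession {
  cookieHeader: string;
  userAgent: string;
  detailTemplateUrl: string;
}

/** Copy a captured template URL and swap in an updated JSON `body` parameter. */
function withBody(templateUrl: string, update: (body: JsonRecord) => JsonRecord): string {
  const url = new URL(templateUrl);
  let body: JsonRecord = {};
  try {
    const parsed: unknown = JSON.parse(url.searchParams.get("body") ?? "{}");
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      body = parsed as JsonRecord;
    }
  } catch {
    // A malformed template body falls back to an empty object.
  }
  url.searchParams.set("body", JSON.stringify(update(body)));
  return url.toString();
}

/** Session first, request second: the imported session supplies headers and the template. */
function buildDetailReplay(
  session: ImportedSession,
  skuId: string
): { url: string; headers: Record<string, string> } {
  return {
    url: withBody(session.detailTemplateUrl, (body) => ({ ...body, skuId })),
    headers: { cookie: session.cookieHeader, "user-agent": session.userAgent }
  };
}

// Example values are placeholders, not real credentials.
const session: ImportedSession = {
  cookieHeader: "thor=masked; pin=masked;",
  userAgent: "example-agent",
  detailTemplateUrl:
    "https://api.m.jd.com/?functionId=pc_detailpage_wareBusiness&body=%7B%22skuId%22:%22100068388533%22%7D"
};

const replay = buildDetailReplay(session, "100012345678");
console.log(new URL(replay.url).searchParams.get("body"));
```

Keeping every other query parameter of the captured template untouched and rewriting only `body` is what makes this a replay rather than a freshly constructed request; when the template goes stale, the browser side would be expected to capture a new one.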