3.1 KiB
Xiaohongshu Browser Feed Download Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax for tracking.
Goal: Build a first usable Xiaohongshu downloader that attaches to a manually logged-in Chrome session, listens for feed responses, extracts mp4 URLs, and downloads a limited number of videos.
Architecture: Add login_xhs.py for visible Chrome startup and XHS.py for browser attachment, feed listening, parsing, scrolling, and downloading. Keep browser/runtime imports lazy so unit tests run without Chrome automation dependencies.
Tech Stack: Python 3, unittest, requests, DrissionPage, macOS Chrome remote debugging.
File Structure
- Create
login_xhs.py: Chrome launch CLI, debug-port readiness wait, user-facing next command. - Create
XHS.py: parsing helpers, browser attachment helpers, downloader CLI, feed collection loop. - Create
test_login_xhs.py: command construction and CLI behavior tests. - Create
test_xhs.py: parser, filename, output path, browser address, and port-check tests. - Modify
README.md: install and usage instructions.
Task 1: Login Browser Entrypoint
Files:
-
Create:
test_login_xhs.py -
Create:
login_xhs.py -
Write failing tests for Chrome command construction, parser defaults, profile creation, and missing Chrome path handling.
-
Run
python3 -m unittest test_login_xhs.py -vand verify the tests fail becauselogin_xhsdoes not exist. -
Implement
login_xhs.pywith lazy, testable functions matching the Douyin project pattern. -
Run
python3 -m unittest test_login_xhs.py -vand verify it passes.
Task 2: Pure XHS Parsing Helpers
Files:
-
Create:
test_xhs.py -
Create:
XHS.py -
Write failing tests for filename sanitization, UTF-8 truncation, choosing video URLs, recursive extraction from a nested feed payload, output path generation, browser address construction, and debug-port validation.
-
Run
python3 -m unittest test_xhs.py -vand verify the tests fail becauseXHSdoes not exist. -
Implement pure helpers and lazy dependency imports in
XHS.py. -
Run
python3 -m unittest test_xhs.py -vand verify it passes.
Task 3: Browser Feed Collection CLI
Files:
-
Modify:
XHS.py -
Modify:
test_xhs.py -
Write failing tests for feed payload extraction from packet response objects and the argument parser defaults.
-
Run
python3 -m unittest test_xhs.py -vand verify the new tests fail. -
Implement DrissionPage attachment, feed listening, gentle scrolling, download loop, and CLI
main. -
Run
python3 -m unittest test_xhs.py test_login_xhs.py -vand verify it passes.
Task 4: README and Final Verification
Files:
-
Modify:
README.md -
Update README with setup, login, download, output, and compliance notes.
-
Run
python3 -m unittest test_xhs.py test_login_xhs.py -v. -
Run
git status --short --branch. -
Commit all implementation changes.
-
Push to
origin/main.