Douyin Zero-Argument Target Detection Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Make Douyin.py work with zero arguments by default, auto-detect the target from the current browser page, and keep a single manual fallback input that accepts a creator URL, a video URL, or an aweme_id.

Architecture: Add a target-resolution layer ahead of the existing crawl logic. Route resolved targets into either a visible-only creator flow or a single-video flow, keeping browser-attach checks and download primitives reusable.

Tech Stack: Python 3, argparse, re, socket, pathlib, unittest
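
The routing described under Architecture can be sketched as a small dispatch step. This is only an illustration: `ResolvedTarget`, `route_target`, and the `"creator"` / `"video"` kind labels are hypothetical names, not identifiers taken from Douyin.py.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ResolvedTarget:
    """Hypothetical output of the target-resolution layer."""
    kind: str   # assumed labels: "creator" or "video"
    value: str  # creator URL or aweme_id


def route_target(target: ResolvedTarget,
                 flows: Dict[str, Callable[[str], object]]) -> object:
    """Hand a resolved target to the flow registered for its kind."""
    try:
        flow = flows[target.kind]
    except KeyError:
        raise ValueError(
            f"No crawl flow registered for target kind {target.kind!r}"
        ) from None
    return flow(target.value)
```

Passing the flows in as a mapping keeps the resolution layer independent of the crawl code, so the browser-attach checks and download primitives can be reused by either flow.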


Task 1: Revise the requirements and freeze the contract

Files:

  • Modify: externaldocs/2026-04-17-douyin-targeted-crawling-requirements.md

  • Create: docs/superpowers/specs/2026-04-17-douyin-zero-arg-target-detection-design.md

  • Step 1: Align the requirements doc with the approved UX

Document that ./.venv/bin/python Douyin.py is the primary command and that manual input is fallback-only.

  • Step 2: Save the approved design as a spec

Write the validated design into docs/superpowers/specs/2026-04-17-douyin-zero-arg-target-detection-design.md.

  • Step 3: Review both docs locally

Read both files and confirm the language matches the agreed zero-argument flow and visible-only scope.

Task 2: Add failing tests for target parsing and target resolution

Files:

  • Modify: test_douyin.py

  • Modify: Douyin.py

  • Step 1: Write the failing tests

Add tests for:

  • is_creator_url() accepts supported creator URLs

  • is_video_url() accepts supported video URLs

  • is_aweme_id() accepts numeric IDs

  • parse_target_input() classifies creator URLs, video URLs, and aweme_id values

  • resolve_target() uses the active browser page when CLI input is absent

  • resolve_target() raises a readable error when neither the current page nor the manual input is supported

  • Step 2: Run the focused tests to verify RED

Run: python3 -m unittest test_douyin.py -q
Expected: FAIL because the new target-resolution helpers do not exist yet.

  • Step 3: Write the minimal implementation

Implement the smallest set of pure helper functions and a compact parsed-target structure in Douyin.py.

  • Step 4: Run the focused tests to verify GREEN

Run: python3 -m unittest test_douyin.py -q
Expected: PASS
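
A minimal sketch of the helpers Task 2 names, under stated assumptions. The plan does not list the supported URL shapes, so the `/user/...` and `/video/<digits>` patterns below are guesses. `ParsedTarget`, `TargetResolutionError`, and the `resolve_target(manual_input, current_page_url)` signature are also hypothetical; the real function would presumably read the page URL from the attached browser rather than take it as a parameter.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed URL shapes; the plan does not specify the supported patterns.
CREATOR_URL_RE = re.compile(r"^https?://(?:www\.)?douyin\.com/user/([A-Za-z0-9_\-]+)")
VIDEO_URL_RE = re.compile(r"^https?://(?:www\.)?douyin\.com/video/([0-9]+)")
AWEME_ID_RE = re.compile(r"[0-9]+")


@dataclass(frozen=True)
class ParsedTarget:
    kind: str    # "creator" or "video"
    value: str   # creator URL, or the numeric aweme_id
    source: str  # "manual" or "current_page"


class TargetResolutionError(ValueError):
    """Raised when no supported target can be determined."""


def is_creator_url(text: str) -> bool:
    return bool(CREATOR_URL_RE.match(text.strip()))


def is_video_url(text: str) -> bool:
    return bool(VIDEO_URL_RE.match(text.strip()))


def is_aweme_id(text: str) -> bool:
    return bool(AWEME_ID_RE.fullmatch(text.strip()))


def parse_target_input(text: str, source: str = "manual") -> Optional[ParsedTarget]:
    """Classify the text, or return None when it is not a supported target."""
    text = text.strip()
    video_match = VIDEO_URL_RE.match(text)
    if video_match:
        return ParsedTarget("video", video_match.group(1), source)
    if is_aweme_id(text):
        return ParsedTarget("video", text, source)
    if is_creator_url(text):
        return ParsedTarget("creator", text, source)
    return None


def resolve_target(manual_input: Optional[str],
                   current_page_url: Optional[str]) -> ParsedTarget:
    """Prefer manual input; otherwise fall back to the current page URL."""
    if manual_input:
        target = parse_target_input(manual_input, "manual")
        if target is None:
            raise TargetResolutionError(
                f"Unsupported target {manual_input!r}. "
                "Pass a creator URL, a video URL, or an aweme_id."
            )
        return target
    if current_page_url:
        target = parse_target_input(current_page_url, "current_page")
        if target is not None:
            return target
    raise TargetResolutionError(
        "The current browser page is not a supported creator or video page. "
        "Open one, or pass a creator URL, a video URL, or an aweme_id."
    )
```

In this sketch an unsupported manual input raises immediately instead of falling back to the current page. The plan does not settle that case, so treat it as one possible reading.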

Task 3: Add failing tests for current-page behavior and visible-only creator flow

Files:

  • Modify: test_douyin.py

  • Modify: Douyin.py

  • Step 1: Write the failing tests

Add tests for:

  • current-page creator mode does not auto-scroll by default

  • creator flow reports a clear error when no aweme items are available

  • Step 2: Run the focused tests to verify RED

Run: python3 -m unittest test_douyin.py -q
Expected: FAIL because the current creator flow still scrolls automatically.

  • Step 3: Write the minimal implementation

Split creator crawling so the default path only processes the currently loaded response set and does not call scroll helpers automatically.

  • Step 4: Run the focused tests to verify GREEN

Run: python3 -m unittest test_douyin.py -q
Expected: PASS
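
One way the visible-only split and its two tests might look. Everything here is an assumption for illustration: the `collect_creator_items` name, the injected `scroll` callable, and the `aweme_list` response key are not taken from Douyin.py.

```python
import unittest
from typing import Callable, Iterable, List, Optional
from unittest.mock import Mock


class NoAwemeItemsError(RuntimeError):
    """Raised when the loaded responses contain no aweme items."""


def collect_creator_items(
    loaded_responses: Iterable[dict],
    scroll: Optional[Callable[[], Iterable[dict]]] = None,
    auto_scroll: bool = False,
) -> List[dict]:
    """Gather aweme items from already-loaded responses only, unless
    auto_scroll is explicitly enabled."""
    items: List[dict] = []
    for response in loaded_responses:
        # "aweme_list" is an assumed key name for the item array.
        items.extend(response.get("aweme_list") or [])
    if auto_scroll and scroll is not None:
        for response in scroll():
            items.extend(response.get("aweme_list") or [])
    if not items:
        raise NoAwemeItemsError(
            "No aweme items found on the current creator page. "
            "Make sure the video list has loaded, then run again."
        )
    return items


class VisibleOnlyCreatorFlowTests(unittest.TestCase):
    def test_default_path_does_not_scroll(self):
        scroll = Mock(return_value=[])
        items = collect_creator_items(
            [{"aweme_list": [{"aweme_id": "1"}]}], scroll=scroll
        )
        scroll.assert_not_called()
        self.assertEqual(len(items), 1)

    def test_empty_page_reports_clear_error(self):
        with self.assertRaises(NoAwemeItemsError):
            collect_creator_items([{"aweme_list": []}])
```

Injecting the scroll helper as a parameter lets the test assert with a `Mock` that the default path never calls it, without attaching to a real browser.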

Task 4: Add failing tests for single-video flow

Files:

  • Modify: test_douyin.py

  • Modify: Douyin.py

  • Step 1: Write the failing tests

Add tests for:

  • resolving a video URL leads to a single-video target

  • resolving an aweme_id leads to a single-video target

  • single-video flow downloads exactly one file

  • Step 2: Run the focused tests to verify RED

Run: python3 -m unittest test_douyin.py -q
Expected: FAIL because the single-video execution path does not exist yet.

  • Step 3: Write the minimal implementation

Implement single-video resolution and a narrow download path that saves one mp4 file.

  • Step 4: Run the focused tests to verify GREEN

Run: python3 -m unittest test_douyin.py -q
Expected: PASS
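
A narrow single-video path could be as small as the sketch below. The function name, the `<aweme_id>.mp4` file naming, and the injected `fetch_video_bytes` callable are assumptions; in Douyin.py the bytes would presumably come from the existing download primitives.

```python
from pathlib import Path
from typing import Callable


def download_single_video(
    aweme_id: str,
    output_dir: Path,
    fetch_video_bytes: Callable[[str], bytes],
) -> Path:
    """Save exactly one mp4 for the given aweme_id and return its path."""
    if not (aweme_id.isascii() and aweme_id.isdigit()):
        raise ValueError(f"aweme_id must be numeric, got {aweme_id!r}")
    output_dir.mkdir(parents=True, exist_ok=True)
    destination = output_dir / f"{aweme_id}.mp4"
    destination.write_bytes(fetch_video_bytes(aweme_id))
    return destination
```

Because the fetcher is passed in, the "downloads exactly one file" test can supply fake bytes and then list a temporary output directory to confirm that only one file was written.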

Task 5: Update CLI entry behavior and verify end to end

Files:

  • Modify: Douyin.py

  • Modify: test_douyin.py

  • Modify: README.md

  • Modify: externaldocs/beginner-guide.md

  • Step 1: Write the failing tests

Add tests for:

  • default CLI invocation with no positional target chooses current-page resolution

  • unsupported current page produces a fallback hint

  • manual positional target overrides the current page

  • Step 2: Run the focused tests to verify RED

Run: python3 -m unittest test_douyin.py -q
Expected: FAIL because the CLI still assumes a hardcoded default creator URL.

  • Step 3: Write the minimal implementation

Update the parser and main() flow so zero-argument execution becomes the default, while keeping the manual positional target as fallback.

  • Step 4: Update user docs

Revise README.md and externaldocs/beginner-guide.md to show the new default flow:

./.venv/bin/python login_douyin.py
./.venv/bin/python Douyin.py

  • Step 5: Run full verification

Run: python3 -m unittest -q
Expected: PASS for the full test suite.
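
The parser change in Task 5 can be sketched with an optional positional argument. `build_parser`, `select_input`, and the hint text are hypothetical; only the zero-argument default and the manual-override behavior come from the plan.

```python
import argparse
from typing import List, Optional, Tuple

# Hypothetical hint text for an unsupported current page.
FALLBACK_HINT = (
    "Current page is not supported. "
    "Re-run with a creator URL, a video URL, or an aweme_id."
)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="Douyin.py",
        description="Crawl the target shown in the attached browser page.",
    )
    # nargs="?" makes the positional optional, so zero arguments is valid.
    parser.add_argument(
        "target",
        nargs="?",
        default=None,
        help="fallback: creator URL, video URL, or aweme_id",
    )
    return parser


def select_input(argv: List[str],
                 current_page_url: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (source, text); a manual target overrides the current page."""
    args = build_parser().parse_args(argv)
    if args.target:
        return ("manual", args.target)
    return ("current_page", current_page_url)
```

With `nargs="?"` and `default=None`, running the script with no arguments leaves `target` unset, so the current-page path is chosen without a hardcoded default URL.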