
Use yt-dlp to batch-download videos from YouTube channels and playlists to build a training corpus.
Collect hundreds of thousands to millions of video URLs by keyword, language, and region, then download the audio and video.
Crawl metadata only (title, description, tags, subtitles, comments) without downloading the full video.
Use multi-region IPs (US/EU/JP/BR…) to build multi-language, multi-region video datasets.
Dedicated bandwidth from 1Gbps to 200Gbps+ (customizable)
Fixed pricing by bandwidth rather than by traffic, so costs are predictable
123Proxy provides a high-bandwidth proxy pool service built specifically for AI training data collection: fixed bandwidth billing (1Gbps–100Gbps+), unlimited total traffic, and unlimited concurrent requests.
Automatic monitoring of target-site bot detection, to keep access to the target website from being blocked.
Highly cost-effective for large-scale data scraping.
# Example: Download YouTube video using 123Proxy High Bandwidth Proxy IP
yt-dlp \
--proxy "http://USERNAME_sessionId_time:PASSWORD@ytbproxy.123proxy.cn:35765" \
"https://www.youtube.com/watch?v=VIDEO_ID"
Assign a different session ID to each task for automatic IP rotation. Appending `_sessionId` to the username binds each download task to its own session, so each task uses a different IP.
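The per-task session idea can be sketched as a small shell helper. This is a minimal sketch, not 123Proxy's official tooling: the function names `proxy_url` and `plan_downloads`, the `taskN` session IDs, and the plain `USERNAME_<sessionId>` username layout are assumptions for illustration, so check the exact username format against your account. It also assumes URLs contain no double quotes, `$`, or backticks.

```shell
#!/bin/sh
# Sketch: one session ID per download task, so each task can use a different IP.
# Assumptions: USERNAME / PASSWORD are placeholders, and the username layout is
# USERNAME_<sessionId>; verify the exact format with your 123Proxy account.
PROXY_USER="${PROXY_USER:-USERNAME}"
PROXY_PASS="${PROXY_PASS:-PASSWORD}"
PROXY_HOST="ytbproxy.123proxy.cn:35765"

# Build the proxy URL for the session ID passed as $1.
proxy_url() {
    printf 'http://%s_%s:%s@%s\n' "$PROXY_USER" "$1" "$PROXY_PASS" "$PROXY_HOST"
}

# Read video URLs from stdin and print one yt-dlp command per URL,
# each with its own session ID (task1, task2, ...).
# Commands are printed rather than executed, so this is a dry run.
plan_downloads() {
    n=0
    while IFS= read -r url; do
        [ -n "$url" ] || continue          # skip blank lines
        n=$((n + 1))
        printf 'yt-dlp --proxy "%s" "%s"\n' "$(proxy_url "task$n")" "$url"
    done
}
```

For example, `plan_downloads < urls.txt` prints the planned commands for review, and `plan_downloads < urls.txt | sh` would run them one after another.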
Solution:
- Enable the 123Proxy high-bandwidth dedicated proxy, which can automatically avoid such bot errors.
For large video files, a Sticky Session is recommended: in 123Proxy, use a username with a Session ID (e.g. `USERNAME_sessA`) and keep it unchanged for the duration of the download. This tries to keep the same exit IP and reduces connection resets.
- 1Gbps: theoretically about 125MB/s in one direction, suitable for hundreds of concurrent downloads.
- 10Gbps: theoretically about 1.25GB/s, can support hundreds to thousands of tasks.
- 100Gbps: suitable for large teams continuously collecting PB-level data over long periods.
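The figures above follow from dividing the link speed in bits by 8. As a rough sketch of that arithmetic, the helpers below convert Gbps to theoretical MB/s and estimate how many downloads fit in a link. The function names and the 500KB/s average per-download rate are assumptions chosen for illustration; real throughput depends on protocol overhead and the target site.

```shell
#!/bin/sh
# Theoretical throughput in MB/s for a link of $1 Gbps
# (decimal units: 1 Gbps = 1000 Mbps, 8 bits per byte).
gbps_to_mbytes() {
    echo $(( $1 * 1000 / 8 ))
}

# Rough upper bound on concurrent downloads for a link of $1 Gbps,
# given an assumed average per-download rate of $2 KB/s.
max_concurrent() {
    echo $(( $1 * 1000 * 1000 / 8 / $2 ))
}

gbps_to_mbytes 1        # prints 125
gbps_to_mbytes 10       # prints 1250
max_concurrent 1 500    # prints 250  (1Gbps at an assumed 500KB/s each)
max_concurrent 10 500   # prints 2500 (10Gbps at an assumed 500KB/s each)
```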
- Enable yt-dlp automatic retries: `--retries 10 --fragment-retries 20`
- Reduce per-connection speed as appropriate and increase the number of concurrent tasks to improve overall throughput.
- If such problems persist, contact 123Proxy technical support to check line quality.
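The retry and throughput suggestions above can be combined in one command line. The sketch below only prints the command (a dry run). The helper name `ytdlp_cmd`, the `2M` rate cap, and the 8 parallel jobs are illustrative assumptions, not recommended values. `--limit-rate` is yt-dlp's per-download rate cap.

```shell
#!/bin/sh
# Sketch: retries plus a per-connection rate cap.
# Assumption: PROXY_URL is a placeholder for your proxy endpoint.
PROXY_URL="${PROXY_URL:-http://USERNAME:PASSWORD@ytbproxy.123proxy.cn:35765}"

# Print the yt-dlp command for video URL $1 with rate cap $2 (e.g. 2M).
# Nothing is downloaded here; the command is only printed.
ytdlp_cmd() {
    printf 'yt-dlp --proxy "%s" --retries 10 --fragment-retries 20 --limit-rate %s "%s"\n' \
        "$PROXY_URL" "$2" "$1"
}

# To actually run 8 downloads in parallel from urls.txt (one URL per line),
# xargs -P can be used; -P is a GNU/BSD extension, not POSIX:
#   xargs -P 8 -n 1 yt-dlp --proxy "$PROXY_URL" \
#       --retries 10 --fragment-retries 20 --limit-rate 2M < urls.txt
```

Lowering `--limit-rate` while raising the `-P` job count trades single-connection speed for more concurrent tasks, which is the tuning the list describes.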
Using a proxy is itself legal, but data scraping on platforms like YouTube should strictly comply with their terms of service and with local laws and regulations.