CLI Reference¶
All commands are available via the data-miner CLI.
Core Commands¶
init-db¶
Initialize database tables.
populate¶
Add videos to the database from config sources (search queries, URLs, files).
# Use config file
data-miner populate --config config.yaml
# Dry run (show what would be added)
data-miner populate --config config.yaml --dry-run
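The two invocations above suggest a preview-then-commit pattern. The following is a minimal sketch of it as a shell function. `preview_then_populate` is a hypothetical helper name, and the sketch assumes that a failed dry run exits with a non-zero status, which this reference does not state.

```shell
# Hypothetical helper: preview a populate run, then perform it.
# Assumption: a failed dry run exits non-zero, so the real run is skipped.
preview_then_populate() {
  config="${1:-config.yaml}"
  # Show what would be added without touching the database.
  data-miner populate --config "$config" --dry-run || return 1
  # Only reached when the preview succeeded.
  data-miner populate --config "$config"
}
```

Call it as `preview_then_populate config.yaml`. In practice you would probably inspect the dry-run output before letting the second command run.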
add-video¶
Add a single video by URL.
Options:
- --project - Project name (default: from config)
- --source-type - Source type: url, search, or file
- --source-info - Additional metadata
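As an illustration of how these options combine, here is a sketch that registers every URL listed in a text file. It assumes that `add-video` accepts the URL as a positional argument, which this reference does not confirm. `add_videos_from_list` and the file layout (one URL per line, `#` for comments) are hypothetical.

```shell
# Hypothetical helper: register every URL listed in a text file (one per line).
# Assumption: add-video takes the URL as a positional argument.
add_videos_from_list() {
  list_file="$1"
  project="$2"
  # The "|| [ -n ... ]" part handles a last line without a trailing newline.
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines and lines starting with '#'.
    case "$url" in
      ''|'#'*) continue ;;
    esac
    # Redirect stdin so the CLI cannot consume the rest of the list.
    data-miner add-video "$url" --project "$project" --source-type url < /dev/null
  done < "$list_file"
}
```

For example, `add_videos_from_list urls.txt my_project` would issue one `add-video` call per listed URL.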
status¶
Show pipeline status.
Worker Management¶
workers setup¶
Generate supervisor configuration.
This creates /etc/supervisor/conf.d/data_miner.conf with worker definitions.
workers start¶
Start all workers.
workers stop¶
Stop all workers.
workers restart¶
Restart all workers.
workers status¶
Show supervisor worker status.
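The worker subcommands above are typically run in sequence on a fresh machine. A minimal sketch of that sequence follows. `bring_up_workers` is a hypothetical helper name, and the sketch assumes each subcommand exits non-zero on failure so that the chain stops at the first error.

```shell
# Hypothetical helper: generate the supervisor config, start the workers,
# then report their status.
# Assumption: each subcommand exits non-zero on failure, so && stops early.
bring_up_workers() {
  data-miner workers setup &&
  data-miner workers start &&
  data-miner workers status
}
```

Because `workers setup` writes under /etc/supervisor/conf.d, it may need elevated privileges on many systems.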
Maintenance Commands¶
delete-project¶
Delete a project and optionally its files.
# Delete project (keep files)
data-miner delete-project my_project
# Delete project and files
data-miner delete-project my_project --files
# Also delete orphaned videos
data-miner delete-project my_project --files --orphans
# Skip confirmation
data-miner delete-project my_project --yes
delete-videos¶
Delete a project's video records (project-videos), optionally narrowed by filters such as status.
# Delete all FAILED videos
data-miner delete-videos --project my_project --pv-status FAILED
# Delete videos and files
data-miner delete-videos --project my_project --pv-status FAILED --files
cleanup-orphans¶
Remove orphaned videos not linked to any project.
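Deleting videos from a project can leave videos that no longer belong to any project, so the two commands pair naturally. The sketch below chains them. `purge_failed` is a hypothetical helper name. Either command may prompt for confirmation, since only `delete-project` is documented here as accepting `--yes`.

```shell
# Hypothetical helper: drop FAILED videos (and their files) from a project,
# then remove any videos left without a project.
# Assumption: no confirmation-skipping flag is passed, because --yes is only
# documented for delete-project.
purge_failed() {
  project="$1"
  data-miner delete-videos --project "$project" --pv-status FAILED --files &&
  data-miner cleanup-orphans
}
```

Usage: `purge_failed my_project`.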
force-dedup¶
Force a project back to the DEDUP_READY stage so that cross-dedup runs again.
force-detect¶
Force a project back to the DETECT_READY stage so that detection runs again.
Environment Variables¶
| Variable | Description |
|---|---|
| DATA_MINER_CONFIG | Path to config file |
| DATABASE_URL | PostgreSQL connection string |
| HF_TOKEN | HuggingFace token for private models |
| DATA_MINER_DEBUG | Set to 1 to disable heartbeat (dev only) |
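A small pre-flight check can catch a missing variable before any command runs. The sketch below is one way to do that. `check_env` is a hypothetical helper, the caller chooses which variables count as required, and the example values are placeholders, not defaults taken from the tool.

```shell
# Hypothetical pre-flight check: report every named variable that is unset
# or empty, and return 1 if any is missing.
check_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example values (placeholders, not real credentials):
# export DATABASE_URL="postgresql://user:password@localhost:5432/data_miner"
# export DATA_MINER_CONFIG="$PWD/config.yaml"
```

For example, `check_env DATABASE_URL DATA_MINER_CONFIG && data-miner status` would run `status` only when both variables are set.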
Next Steps¶
- Quickstart - End-to-end tutorial
- Configuration - Config options