Contributing

Guidelines for contributing to Data Miner.


Development Setup

# Clone and install in dev mode
git clone https://github.com/MVPavan/data_miner.git
cd data_miner
pip install -e ".[dev]"

# Start PostgreSQL
docker compose up -d

# Initialize database
data-miner init-db

Project Structure

data_miner/
├── cli.py              # Click CLI commands
├── config/             # Pydantic configs + OmegaConf
├── db/                 # SQLModel + PostgreSQL
├── workers/            # Long-running workers
├── modules/            # Core processing logic
├── models/             # ML model wrappers
└── utils/              # Utilities

Code Style

  • Python 3.12+ features allowed
  • Type hints required for public functions
  • Docstrings for classes and public methods
  • Black for formatting (line length 100)
  • isort for imports
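
As a rough illustration of these conventions, the sketch below shows a public function and a class with type hints and docstrings. The names `sample_indices` and `FrameSampler` are hypothetical and are not taken from the Data Miner codebase.

```python
"""Illustrative only: hypothetical names, not part of data_miner."""

from dataclasses import dataclass


def sample_indices(total_frames: int, step: int) -> list[int]:
    """Return every `step`-th frame index in the range [0, total_frames)."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(0, total_frames, step))


@dataclass
class FrameSampler:
    """Pick evenly spaced frame indices from a video of known length."""

    step: int = 10

    def sample(self, total_frames: int) -> list[int]:
        """Return the frame indices this sampler would keep."""
        return sample_indices(total_frames, self.step)
```

If the repository does not already configure the formatters, running `black --line-length 100 .` and `isort .` before committing should match the settings listed above.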

Adding a New Worker

  1. Create workers/my_worker.py
  2. Extend the appropriate base class:
       • BaseVideoWorker for per-video processing
       • BaseProjectVideosWorker for per-project-video processing
       • BaseProjectStageWorker for project-level operations
  3. Implement the process() method
  4. Add the worker to the supervisor config in cli.py
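
The steps above might look roughly like the following sketch. The real base-class interface is not shown in this guide, so a stand-in `BaseVideoWorker` is defined here purely so the example runs. The `process(video_id)` signature, its boolean return value, the `run()` loop, and the `MyWorker` class are all assumptions; check the classes in workers/ for the actual API.

```python
# workers/my_worker.py (sketch; the real BaseVideoWorker API may differ)
from abc import ABC, abstractmethod


class BaseVideoWorker(ABC):
    """Stand-in for data_miner's base class, defined here only so the sketch runs."""

    @abstractmethod
    def process(self, video_id: str) -> bool:
        """Process one video and report whether it succeeded."""

    def run(self, video_ids: list[str]) -> dict[str, bool]:
        """Call process() for each video and collect the results (assumed loop)."""
        return {video_id: self.process(video_id) for video_id in video_ids}


class MyWorker(BaseVideoWorker):
    """Hypothetical per-video worker that accepts only non-empty video IDs."""

    def process(self, video_id: str) -> bool:
        # A real worker would do its per-video work here.
        return bool(video_id.strip())
```

In the actual project, the base class would be imported from the workers package rather than redefined, and the new worker would still need to be registered in the supervisor config in cli.py.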

Testing

# Run tests
pytest

# With coverage
pytest --cov=data_miner
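
For the "make changes with tests" expectation, a minimal test module could look like the sketch below. The file name and `normalize_video_id` are hypothetical stand-ins for real project code; pytest collects files named `test_*.py` and functions prefixed with `test_`.

```python
# tests/test_example.py (hypothetical file and function names)
def normalize_video_id(raw: str) -> str:
    """Strip surrounding whitespace from a video ID (stand-in for project code)."""
    return raw.strip()


def test_normalize_video_id_strips_whitespace() -> None:
    """The helper should remove leading and trailing whitespace."""
    assert normalize_video_id("  abc123 \n") == "abc123"
```

The `--cov` option comes from the pytest-cov plugin, which is presumably installed with the dev extras.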

Database Migrations

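The schema is currently created with SQLModel's create_all(), which only creates missing tables and does not alter existing ones. For schema changes:

  1. Update models in db/models.py
  2. Run data-miner init-db --force (warning: this destroys existing data!)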

Submitting Changes

  1. Fork the repository
  2. Create a feature branch
  3. Make changes with tests
  4. Submit a pull request

Reporting Issues

Include:

  • Python version
  • PostgreSQL version
  • Config file (anonymized)
  • Error logs
  • Steps to reproduce