Contributing to DataFrameExpectations

Thank you for your interest in contributing to DataFrameExpectations! We welcome contributions from the community, whether it's adding new expectations, fixing bugs, improving documentation, or enhancing the testing framework.

Getting Started

Before you begin:

  1. Check existing issues and pull requests to avoid duplicates
  2. For major changes, open an issue first to discuss your proposal
  3. Ensure you agree with the Apache 2.0 License

Development Setup

  1. Fork the repository, then clone it (use your fork's URL in place of the one below if you plan to push changes):

    git clone https://github.com/getyourguide/dataframe-expectations.git
    cd dataframe-expectations
  2. Install the uv package manager:

    pip install uv
  3. Install development dependencies:

    # This will automatically create a virtual environment
    uv sync --group dev
  4. Activate the virtual environment:

    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  5. Verify your setup:

    uv run pytest tests/ -n auto --cov=dataframe_expectations
  6. (Optional) Install pre-commit hooks:

    pre-commit install

    This will automatically run checks before each commit.

How to Contribute

Reporting Bugs

Open an issue with a clear description, steps to reproduce, expected vs. actual behavior, and relevant environment details.

Documentation

Fix typos, clarify docs, add examples, or improve the README.

Features

Open an issue first to discuss new features, explain the use case, and consider backward compatibility.

Adding Expectations

See the Adding Expectations Guide for detailed instructions.

Running Tests

# Run all tests in parallel
uv run pytest tests/ -n auto

# Run all tests in parallel with a coverage report
uv run pytest tests/ -n auto --cov=dataframe_expectations

# Run a specific test file
uv run pytest tests/test_expectations_suite.py -n auto

# Run only the tests whose names match a pattern
uv run pytest tests/ -n auto -k "test_expect_min_rows"

Code Style Guidelines

Python Style

  • Follow PEP 8
  • Use type hints for all function parameters and return values
  • Maximum line length: 120 characters
  • Use meaningful variable and function names

Docstrings

  • Use Google-style docstrings
  • Include parameter descriptions and return types
  • Add usage examples for complex functions
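
As a rough illustration of the Python style and docstring guidelines above, here is a sketch of a fully type-hinted function with a Google-style docstring. The function `count_null_values` is hypothetical and exists only for this example; it is not part of the DataFrameExpectations API.

```python
from typing import Optional, Sequence


def count_null_values(values: Sequence[Optional[float]]) -> int:
    """Count the entries in a sequence that are None.

    Hypothetical helper used only to illustrate the style guidelines;
    it is not part of the DataFrameExpectations API.

    Args:
        values: Values to inspect. None marks a missing entry.

    Returns:
        The number of entries that are None.

    Example:
        >>> count_null_values([1.0, None, 3.5, None])
        2
    """
    # A generator expression keeps the logic on one readable line.
    return sum(1 for value in values if value is None)
```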

Code Quality

  • Write clear, self-documenting code
  • Add comments for complex logic
  • Keep functions focused and single-purpose
  • Avoid deep nesting (max 3-4 levels)
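
One common way to keep nesting shallow is to handle special cases first with early returns (guard clauses). The sketch below assumes a hypothetical helper, `describe_threshold_check`, invented for this example and not taken from the library.

```python
from typing import Optional


def describe_threshold_check(value: Optional[int], minimum: int) -> str:
    """Describe whether a value meets a minimum threshold.

    Hypothetical helper used only to illustrate guard clauses.

    Args:
        value: The observed value, or None if it is missing.
        minimum: The smallest acceptable value. Must be non-negative.

    Returns:
        A short human-readable message describing the outcome.
    """
    # Guard clauses deal with the special cases up front, so the main
    # logic stays at one indentation level instead of nested if/else.
    if value is None:
        return "missing value"
    if minimum < 0:
        return "invalid minimum"
    if value < minimum:
        return f"{value} is below the minimum of {minimum}"
    return f"{value} meets the minimum of {minimum}"
```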

Testing

  • Maintain or improve test coverage
  • Test expected behavior (happy paths), edge cases, and error conditions
  • Use descriptive test names
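
The testing guidelines above might look like this in practice. This is a minimal sketch with descriptive test names, covering a happy path, a boundary case, a failing case, and an error condition. The function under test, `require_min_rows`, is hypothetical and defined inline so the example is self-contained; real tests would import the code they exercise.

```python
import pytest


def require_min_rows(row_count: int, min_rows: int) -> bool:
    """Hypothetical function under test (not part of the library)."""
    if min_rows < 0:
        raise ValueError("min_rows must be non-negative")
    return row_count >= min_rows


def test_require_min_rows_passes_when_row_count_exceeds_minimum() -> None:
    # Happy path: clearly more rows than required.
    assert require_min_rows(row_count=10, min_rows=5) is True


def test_require_min_rows_passes_when_row_count_equals_minimum() -> None:
    # Edge case: the boundary value itself should be accepted.
    assert require_min_rows(row_count=5, min_rows=5) is True


def test_require_min_rows_fails_when_row_count_is_below_minimum() -> None:
    # Edge case: an empty input against a positive minimum.
    assert require_min_rows(row_count=0, min_rows=1) is False


def test_require_min_rows_raises_for_negative_minimum() -> None:
    # Error condition: invalid arguments should raise, not return a result.
    with pytest.raises(ValueError, match="non-negative"):
        require_min_rows(row_count=10, min_rows=-1)
```

Each test name states the scenario and the expected outcome, so a failure report is readable without opening the test body.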

Submitting a Pull Request

  1. Create a branch and make your changes:

    git checkout -b feature/your-feature-name
  2. Run tests:

    uv run pytest tests/ -n auto --cov=dataframe_expectations
  3. Commit using Conventional Commits (see Versioning and Commits):

    git commit -m "feat: your feature description"
  4. Push and open a PR with a clear description referencing any related issues

Versioning and Commits

This project follows Semantic Versioning and uses Conventional Commits.

Commit Message Format

<type>: <description>

[optional body]

[optional footer]

Commit Types

  • feat: - New feature → MINOR version bump (0.1.0 → 0.2.0)
  • fix: - Bug fix → PATCH version bump (0.1.0 → 0.1.1)
  • feat!: or a BREAKING CHANGE: footer - Breaking change → MAJOR version bump (0.1.0 → 1.0.0)
  • docs: - Documentation changes (no version bump)
  • test: - Test changes (no version bump)
  • chore: - Maintenance tasks (no version bump)
  • refactor: - Code refactoring (no version bump)
  • style: - Code style changes (no version bump)
  • ci: - CI/CD changes (no version bump)
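
To make the mapping above concrete, here is a small sketch that derives the version bump from a commit message and applies it to a version string. It only mirrors the rules listed in this section and is not how Release Please is implemented. The names `bump_for_commit` and `apply_bump` are hypothetical. As an added assumption, the header pattern also accepts an optional scope such as `feat(api):`, which Conventional Commits allows but the format above does not show.

```python
import re
from typing import Optional

# Header pattern: "<type>[(scope)][!]: <description>".
# The optional scope is an assumption beyond the format shown above.
_HEADER = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]+\))?(?P<breaking>!)?: .+")


def bump_for_commit(message: str) -> Optional[str]:
    """Return "major", "minor", "patch", or None for a commit message.

    Raises:
        ValueError: If the first line is not a Conventional Commits header.
    """
    lines = message.strip().splitlines()
    header = lines[0] if lines else ""
    match = _HEADER.match(header)
    if match is None:
        raise ValueError(f"not a Conventional Commits header: {header!r}")

    # A breaking change is marked by "!" in the header or by a footer line.
    has_breaking_footer = any(
        line.startswith("BREAKING CHANGE:") for line in lines[1:]
    )
    if match["breaking"] or has_breaking_footer:
        return "major"
    if match["type"] == "feat":
        return "minor"
    if match["type"] == "fix":
        return "patch"
    # docs, test, chore, refactor, style, ci, ...: no version bump.
    return None


def apply_bump(version: str, bump: Optional[str]) -> str:
    """Apply a bump to a "MAJOR.MINOR.PATCH" version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version
```

For example, `apply_bump("0.1.0", bump_for_commit("fix: correct validation logic"))` follows the `fix:` rule and yields a PATCH bump.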

Examples

# Adding a new feature
git commit -m "feat: add expect_column_sum_equals expectation"

# Fixing a bug
git commit -m "fix: correct validation logic in expect_value_greater_than"

# Breaking change
git commit -m "feat!: remove deprecated API methods"

# With body
git commit -m "feat: add tag filtering support

Allow expectations to be filtered by tags at runtime.
This enables selective execution of validation rules."

# Documentation update
git commit -m "docs: update README with new examples"

What Happens Next

When your PR is merged to main:

  1. Release Please automatically creates or updates a Release PR
  2. The Release PR includes the version bump and the changelog
  3. When the Release PR is merged, a GitHub Release is created
  4. The maintainer manually publishes the package to PyPI

Questions?

If you have questions or need help:

Thank you for contributing! 🎉