Enable fetching of "obs4ref" data from pmp#219
Conversation
* main: Bump version: 0.3.0 → 0.3.1 chore: Fix missing changelog entry docs: changelog fix: Disable installation check until #217 is resolved Bump version: 0.2.0 → 0.3.0 fix: Hotfix integration tests chore: Fix incorrect import docs: changelog docs: typo docs: update getting started docs: add docs for metric execution groups refactor: fix tests refactor: unconfuse definitions of MetricExecutions vs. MetricExecutionGroups fix: migration sets all existing executions to not be retracted refactor: rename MetricExecution to MetricExecutionGroup
There was a problem hiding this comment.
Pull Request Overview
This PR enables the fetching of "obs4ref" data from PMP by incorporating the dataset into the ingestion workflow and extending the CLI. Key changes include:
- Adding an ingestion call for obs4ref data in integration tests and updating the expected providers.
- Introducing a new CLI command to fetch obs4ref data and updating related logging and testing routines.
- Removing the legacy PMP registry module.
Reviewed Changes
Copilot reviewed 9 out of 11 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tests/integration/test_ar7_ft.py | Added ingestion for obs4ref data and updated expected metrics providers. |
| stubs/pooch/init.pyi | Added a stub for get_logger to support updated logging handling. |
| packages/ref/src/cmip_ref/testing.py | Updated the sample data fetch implementation using new registry helper. |
| packages/ref/src/cmip_ref/dataset_registry/init.py | Introduced the new reference data registry for non-published datasets. |
| packages/ref/src/cmip_ref/cli/datasets.py | Added a CLI command to fetch obs4ref data along with improved documentation. |
| packages/ref/src/cmip_ref/cli/_logging.py | Adjusted logging setup to clear interfering handlers from pooch logger. |
| packages/ref-metrics-pmp/src/cmip_ref_metrics_pmp/registry/init.py | Removed outdated PMP registry module. |
| docs/getting_started.md | Fixed a spelling typo in the documentation. |
| .github/workflows/ci-integration.yaml | Expanded CI to run the new fetch-obs4ref-data command. |
Files not reviewed (2)
- packages/ref/src/cmip_ref/dataset_registry/pmp_reference.txt: Language not supported
- tests/test-data/.gitignore: Language not supported
Comments suppressed due to low confidence (2)
tests/integration/test_ar7_ft.py:67
- [nitpick] The use of 'obs4mips' as the source-type for the obs4ref data may be ambiguous. Consider aligning the source-type with the dataset name (e.g. 'obs4ref') if the ingestion logic permits to improve clarity.
invoke_cli(["datasets", "ingest", "--source-type", "obs4mips", str(sample_data_dir / ".." / "obs4ref")])
packages/ref-metrics-pmp/src/cmip_ref_metrics_pmp/registry/init.py:1
- The complete removal of the PMP registry module appears intentional; please verify that all dependencies on this module are updated to avoid broken references.
Entire file removed
Codecov ReportAll modified and coverable lines are covered by tests ✅ see 2 files with indirect coverage changes 🚀 New features to boost your workflow:
|
|
I'm going to pull out the changes that add updates to the integration test as that is currently broken because the PMP output fails to parse. I'll make a follow PR to demonstrate that so @lee1043 can resolve, but that PR will need updated sample data (Climate-REF/ref-sample-data#24), which in turn depends on these changes ♻️ |
Description
PMP requires some datasets that have been preprocessed by PCMDI into an obs4mips.
Both datasets have been checked by @dhegedus99 so we now have the appropriate licences to use these data.
Checklist
Please confirm that this pull request has done the following:
changelog/