Skip to content

test: add fast unit tests for LFS pointer detection#1945

Open
zhangyinxina-ui wants to merge 4 commits intodimensionalOS:devfrom
zhangyinxina-ui:test/lfs-pointer-detection-tests
Open

test: add fast unit tests for LFS pointer detection#1945
zhangyinxina-ui wants to merge 4 commits intodimensionalOS:devfrom
zhangyinxina-ui:test/lfs-pointer-detection-tests

Conversation

@zhangyinxina-ui
Copy link
Copy Markdown

Description

  • Add dimos/utils/test_data_lfs_pointer.py with five fast pytest cases for _is_lfs_pointer_file.
  • Covers valid LFS pointer header, oversized non-pointer, plain text, missing path, and invalid UTF-8 small files.
  • No network or git lfs pull; keeps the default fast test suite meaningful for data-layer helpers.

How to Test

bash scripts/verify.sh

Optional single file:

uv run pytest dimos/utils/test_data_lfs_pointer.py -q

Contributor License Agreement

  • I have read and approved the CLA.

Closes DIM-XXX

zhangyinxina-ui and others added 4 commits April 28, 2026 20:01
…ipeline CI) (#4)

Strictly additive scaffolding that lets `/ship <one-liner>` in Cursor chat
drive: branch off origin/dev -> implement -> bash scripts/verify.sh -> PR
targeting dev -> Codex review -> squash auto-merge.

- scripts/verify.sh: fast+mypy single source of truth; auto-skips cuda
  and unitree-dds extras on macOS arm64 where wheels/build deps are
  unavailable. Slow tier and self-hosted CI remain authoritative.
- AGENTS.md: append-only Agent Pipeline section + P0/P1/P2/P3 Codex
  review guidelines tailored to dimos (secrets, robot motion bounds,
  mypy strict, @Skill invariants, transport renames, all_blueprints
  regen). Existing content untouched.
- .cursor/rules/00-workflow.mdc: alwaysApply rules forbidding direct
  push to dev/main, force push, --no-verify, neutering verify.sh, and
  encoding dimos-specific guardrails.
- .cursor/commands/ship.md: canonical 11-step /ship flow, base=dev,
  uses gh REST API to dodge GraphQL race after push.
- .github/workflows/agent-pipeline.yml: ubuntu-latest smoke job
  (verify-fast) -> ruff + pytest --collect-only + mypy. Does NOT
  replace ci.yml (Linux self-hosted) or macos.yml (macOS self-hosted).

Local preflight: mypy dimos/ -> Success: no issues found in 650 source
files. pytest left for CI (LFS data download too slow on first local
run on this machine).

Co-authored-by: update <zhang@local>
)

The lightweight `verify-fast` job on `ubuntu-latest` failed during
`uv sync` because `pyaudio==0.2.14` (transitively pulled in by
`unitree-webrtc-connect-leshy` from the `dimos[unitree]` extras) needs
`portaudio.h` to compile its native extension. ubuntu-latest does not
ship portaudio headers by default.

Mirror what `macos.yml` does (`brew install portaudio libjpeg-turbo`)
on the apt side: install `portaudio19-dev`, `libjpeg-turbo8-dev`, and
`libturbojpeg0-dev` before `uv sync`.

This is also the demo PR validating the agent pipeline end-to-end:
branch off origin/dev -> fix -> commit -> push -> open PR -> verify-fast
green -> squash merge.
…ages (#6)

Adds a callout to the "Adding a New Skill" section in AGENTS.md that
spells out the existing package convention so contributors do not
mistakenly create __init__.py files when adding skills/modules.

The convention: every subpackage under dimos/ (including
dimos/agents/skills/, dimos/core/, dimos/perception/, etc.) is a
PEP 420 namespace package. The only __init__.py in the dimos/ tree is
dimos/__init__.py itself, which exists solely as a lazy re-export shim
for the top-level Dimos class.

Adding __init__.py files would silently change import resolution for
pytest collection, mypy, and the all_blueprints.py regen tooling that
walk the dimos/ tree.

Docs-only change — does not touch Python sources, test fixtures, or
CI logic. agent-pipeline.yml's paths-ignore correctly skips verify-fast
for **.md changes.

Co-authored-by: update <zhang@local>
Cover _is_lfs_pointer_file with tmp_path fixtures: valid pointer, oversized
file, plain text, missing path, and invalid UTF-8. No network or LFS pull.

Made-with: Cursor
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 30, 2026

Greptile Summary

This PR adds five fast pytest unit tests for _is_lfs_pointer_file in dimos/utils/test_data_lfs_pointer.py, along with a new scripts/verify.sh local verification script, a agent-pipeline.yml GitHub Actions workflow for fast CI smoke checks, and Cursor agent configuration files (.cursor/commands/ship.md, .cursor/rules/00-workflow.mdc) plus AGENTS.md additions for the agent pipeline workflow.

The test logic itself is correct and covers all meaningful branches of _is_lfs_pointer_file. The P2 findings are in the supporting infrastructure: the new CI workflow never actually executes the test suite (only collects), and verify.sh passes -q to pytest while addopts already specifies -v.

Confidence Score: 4/5

Safe to merge; P2 findings only — the tests themselves are correct and the existing ci.yml still executes the suite.

All findings are P2 (style/design): the new CI workflow is a smoke check by design and existing ci.yml provides actual test execution coverage. The unit tests are logically sound.

.github/workflows/agent-pipeline.yml — does not run tests, only collects; consider whether verify.sh should be executed rather than just syntax-checked.

Important Files Changed

Filename Overview
dimos/utils/test_data_lfs_pointer.py Five focused unit tests for _is_lfs_pointer_file covering all key branches (valid pointer, oversized file, plain text, missing file, invalid UTF-8); logic and assertions are correct.
.github/workflows/agent-pipeline.yml New fast CI workflow runs pytest --collect-only (no test execution) and only syntax-checks verify.sh via bash -n, so it never actually runs the test suite; also contradicts the "CI invokes verify.sh" claim in AGENTS.md.
scripts/verify.sh New local verification script; -q flag passed to pytest conflicts with -v in addopts in pyproject.toml, causing confusing output (though execution is unaffected).
AGENTS.md Adds Agent Pipeline section and Codex review guidelines; the claim that "Both local agents and CI invoke this exact script [verify.sh]" is inaccurate for the new agent-pipeline.yml workflow.
.cursor/commands/ship.md New Cursor agent command defining an 11-step PR shipping workflow; documentation only, no functional issues.
.cursor/rules/00-workflow.mdc New alwaysApply Cursor rule file enforcing repo workflow conventions; documentation only, no functional issues.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[PR opened / synchronized] --> B{agent-pipeline.yml}
    A --> C{ci.yml - self-hosted Linux}

    B --> D[uv sync]
    D --> E[ruff check/format - non-blocking]
    E --> F[pytest --collect-only\nvalidates imports only]
    F --> G[mypy dimos/]
    G --> H[bash -n verify.sh\nsyntax check only]

    C --> I[uv sync]
    I --> J[pytest - ACTUALLY RUNS TESTS\nnot tool or mujoco markers]
    J --> K[mypy dimos/]

    H --> L[agent-pipeline: PASS\nno tests executed]
    K --> M[ci.yml: PASS\ntests executed]

    style F fill:#f9a,stroke:#c00
    style H fill:#f9a,stroke:#c00
    style J fill:#9f9,stroke:#090
    style L fill:#f9a,stroke:#c00
    style M fill:#9f9,stroke:#090
Loading

Reviews (1): Last reviewed commit: "test: add fast unit tests for LFS pointe..." | Re-trigger Greptile

Comment on lines +78 to +86
# Ruff issues are surfaced via annotations but do not block this
# smoke job; main `ci.yml` is the authoritative formatter gate.

- name: pytest --collect-only (catch import errors fast)
run: |
uv run pytest --collect-only -q -m 'not (tool or mujoco or slow)' dimos/ \
> /tmp/pytest-collect.txt 2>&1
tail -20 /tmp/pytest-collect.txt
# If imports break, this fails in seconds rather than 10 min into
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 CI pipeline never actually executes the test suite

The pytest step only runs --collect-only (validates imports and collection, not test logic), and verify.sh is only checked for shell syntax via bash -n. If a test in dimos/utils/test_data_lfs_pointer.py (or any other test) contains a logic error that makes it fail at runtime, this workflow will still pass. The actual test execution happens in ci.yml (self-hosted Linux runner), so tests aren't skipped entirely — but this workflow's value as a "fast second-opinion CI signal" for test correctness is limited since it never runs tests.

Additionally, AGENTS.md and verify.sh both state "Both local developers and CI invoke this exact script", but agent-pipeline.yml never calls verify.sh as an executable.

Consider replacing the collect-only + syntax-check steps with an actual execution of verify.sh (or at minimum uv run pytest -q) to make this workflow meaningful as a test gate:

- name: Run fast test suite
  run: |
    uv run pytest -q -m 'not (tool or mujoco or slow)' dimos/

Comment thread scripts/verify.sh
echo ""
echo ">>> [2/3] uv run pytest -q (fast suite: excludes slow/tool/mujoco)"
uv run pytest -q
echo "<<< OK"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 -q conflicts with -v in addopts

pyproject.toml sets addopts = "-s -v -r a ...", which includes -v (verbose). Passing -q on the command line decrements the verbosity level, so the two flags cancel each other out to default verbosity — not the quiet output the -q flag implies. The test suite still runs correctly, but the output format will be neither quiet nor verbose, which can make CI logs confusing.

Suggested change
echo "<<< OK"
uv run pytest

Omitting -q lets addopts control verbosity as configured.

Suggested change
echo "<<< OK"
uv run pytest

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0682ec1fda

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread scripts/verify.sh
Darwin) UV_EXTRA_FLAGS+=(--no-extra cuda --no-extra unitree-dds) ;;
esac
echo "uv sync ${UV_EXTRA_FLAGS[*]}"
uv sync "${UV_EXTRA_FLAGS[@]}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Run verify sync in frozen/locked mode

verify.sh is used as the local gate, but uv sync is invoked without --frozen/--locked, so it can silently update uv.lock during verification and pass with a dependency set that isn’t what CI will use. uv sync --help explicitly says --frozen is "Sync without updating the uv.lock file"; without it, stale lockfiles are auto-updated instead of failing fast. In practice this can produce false-green local verification and then CI failures when the lockfile change wasn’t committed.

Useful? React with 👍 / 👎.

@paul-nechifor
Copy link
Copy Markdown
Contributor

paul-nechifor commented Apr 30, 2026

Thank you for the contribution. But can you please break it up into separate PRs? You're making a lot of agent changes, including ones which imply a certain process. Those need to be discussed with the team and motivated. Also, please respect the PR description template which requires describing the problem each PR is address.

There should be at least 2 PRs: one for the LFS changes and one for the agent changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants