ci: reusable-claude.yml workflow_call startup_failure since ~2026-05-08 — @claude bot non-functional in platform-api & ums-web #8

@cohenrobinson

Description

Summary

The on-demand "Claude Code" workflow (each repo's .github/workflows/claude.yml, which calls Utilified/.github/.github/workflows/reusable-claude.yml@main) is hitting startup_failure on issues / issue_comment / pull_request_review / pull_request_review_comment events across the consuming repos. A startup_failure occurs before the job-level if: is evaluated, so this is not the normal non-@claude skip (those runs correctly show skipped).

Evidence it's a real regression, not skip-noise

  • Normal @claude-guard non-invocations show skipped, and there are genuine historical skipped/success runs — so the guard, secret, and action ref are fundamentally sound.
  • ums-portal flipped from skipped to 100% startup_failure on 2026-05-08 with no caller change, which points to a startup-time regression in this repo's shared reusable workflow. The most likely cause is the workflow_call job's permissions: / secret-inheritance contract (possibly triggered by an org runner-policy change, e.g. an added step-security/harden-runner).
  • CLAUDE_CODE_OAUTH_TOKEN is present in all 5 repos and the action ref resolves — ruling those out.

Impact

  • platform-api and ums-web are now 100% startup_failure — the on-demand @claude assistant is effectively dead in the two primary repos; the others are intermittently broken.
  • This is the on-demand @claude workflow — distinct from the automatic claude-review PR-review check.

Suggested fix (one place fixes all repos)

  • Reconcile the reusable claude job's permissions: block with what callers grant — a workflow_call job cannot request permissions the caller didn't grant, which surfaces as startup_failure.
  • Consider moving the @claude if: guard up to the caller job level so non-invocations skip cleanly instead of starting the reusable.
  • Diff Utilified/.github history around the 2026-05-08 onset (last content edit 2026-05-19; the trigger may be an org policy / runner-policy change rather than a file edit).
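
The first two suggestions could look roughly like the sketch below. It is not the current content of either file: the permission scopes, the event types, and the placeholder step are assumptions for illustration, and the real reusable workflow may need a different set. The point it shows is that the called job's permissions: must be equal to or narrower than the caller job's, and that the @claude guard sits on the caller job, so non-invocations are skipped before the reusable workflow is started.

```yaml
# Sketch only. Permission scopes, event types, and the placeholder step are
# assumptions; substitute whatever reusable-claude.yml actually needs.

# Utilified/.github/.github/workflows/reusable-claude.yml (called workflow)
on:
  workflow_call:
    secrets:
      CLAUDE_CODE_OAUTH_TOKEN:
        required: true

jobs:
  claude:
    runs-on: ubuntu-latest
    # Must be equal to or narrower than what the caller job grants below.
    permissions:
      contents: read
      issues: write
      pull-requests: write
    steps:
      # Hypothetical stand-in for the real Claude Code action step.
      - run: echo "placeholder for the real action step"
---
# .github/workflows/claude.yml in each consuming repo (caller)
on:
  issues:
    types: [opened]
  issue_comment:
    types: [created]
  pull_request_review:
    types: [submitted]
  pull_request_review_comment:
    types: [created]

jobs:
  claude:
    # Guard at the caller job level: events without an @claude mention
    # are skipped here and never invoke the reusable workflow.
    if: >-
      (github.event_name == 'issues' && contains(github.event.issue.body, '@claude')) ||
      (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
      (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude'))
    # Explicit grant, so the called job's request never exceeds it.
    permissions:
      contents: read
      issues: write
      pull-requests: write
    uses: Utilified/.github/.github/workflows/reusable-claude.yml@main
    secrets:
      CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
```

Passing the secret explicitly (rather than secrets: inherit) keeps the caller/callee contract visible in one place. Either form should work as long as the reusable workflow declares the secret under workflow_call.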

Acceptance criteria

  • A non-@claude event shows skipped (not startup_failure).
  • An @claude-mention comment in platform-api and ums-web successfully starts and runs the workflow.
  • No startup_failure Claude Code runs across the consuming repos for a week.

Context

Found during CI-health review (2026-06-06). The consuming claude.yml is byte-identical across platform-api, ums-web, ums-portal, platform-infra, platform-mcp.
