brev create returns 'unexpected EOF' rc=1 even after the workspace was actually created #382

@robobryce

What

brev create <name> --type <T> sometimes exits with a non-zero return code and stderr output like:

WARN  RESTY Post "https://brevapi.us-west-2-prod.control-plane.brev.dev/api/organizations/<org-id>/workspaces?cli_version=v0.6.323&local=true&os=linux&utm_source=cli": unexpected EOF, Attempt 1
ERROR RESTY Post "https://brevapi.us-west-2-prod.control-plane.brev.dev/api/organizations/<org-id>/workspaces?cli_version=v0.6.323&local=true&os=linux&utm_source=cli": unexpected EOF
[Worker 1] m8i-flex.2xlarge Failed: ... Post "...": unexpected EOF
Warning: Only created 0/1 instances
could only create 0/1 instances

…even though the workspace was actually created — it shows up in brev ls --json immediately after, with status=DEPLOYING then status=RUNNING after a few minutes, and is fully usable. The non-zero exit is therefore a false negative: the API call appears to have completed server-side but the client got an EOF before reading the success response.

When it happens

Observed multiple times across our automation today, on m8i-flex.2xlarge (AWS) and previously on n2d-standard-8 (GCP). Intermittent — probably 1-in-5 or so during peak hours. Repro is unfortunately just "run brev create enough times".

Why it matters for automation

Downstream tooling that treats brev create's exit code as truth concludes "create failed" and either:

  • Tries to clean up by calling brev delete <name> (which then succeeds, terminating the perfectly-good workspace).
  • Marks an internal record (DB row, etc.) as destroyed/failed, leaving an orphaned VM in Brev that no automation will reconcile.
  • Bails the whole pipeline before whatever was supposed to run on the new workspace.

Our workaround (in our deploy script and in the manager's BrevDriver):

async def _workspace_was_created(self, instance_name: str) -> bool:
    try:
        return (await self.lookup_env_id(instance_name)) is not None
    except BrevError:
        return False

# After any non-zero `brev create` exit, call _workspace_was_created.
# If the workspace is in `brev ls --json`, treat the create as a success
# (log a warning) and proceed. If it's not, surface the original error.
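
For anyone without an equivalent of lookup_env_id, a self-contained version of the same check might look like the sketch below. It assumes `brev ls --json` prints a JSON array of objects with a `name` field; that schema is an assumption on my part, so adjust the lookup to the real output. The `Runner` indirection and the function names are illustrative only, there so the logic can be exercised without the real CLI.

```python
import json
import subprocess
from typing import Callable, Sequence, Tuple

# A runner takes an argv list and returns (return code, stdout, stderr).
Runner = Callable[[Sequence[str]], Tuple[int, str, str]]


def run_cli(argv: Sequence[str]) -> Tuple[int, str, str]:
    """Default runner: execute the real CLI and capture its output."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def workspace_exists(name: str, run: Runner = run_cli) -> bool:
    """Return True if `brev ls --json` lists a workspace called `name`.

    ASSUMPTION: the output is a JSON array of objects with a "name" key.
    """
    rc, out, _ = run(["brev", "ls", "--json"])
    if rc != 0:
        return False
    try:
        entries = json.loads(out)
    except json.JSONDecodeError:
        return False
    if not isinstance(entries, list):
        return False
    return any(isinstance(e, dict) and e.get("name") == name for e in entries)


def create_verified(name: str, instance_type: str, run: Runner = run_cli) -> bool:
    """Run `brev create`; on a non-zero exit, trust `brev ls --json` instead."""
    rc, _, err = run(["brev", "create", name, "--type", instance_type])
    if rc == 0:
        return True
    if workspace_exists(name, run):
        # The create most likely succeeded server-side; warn and carry on.
        print(f"warning: brev create exited {rc} but {name!r} exists; continuing")
        return True
    # The workspace really is absent, so surface the original error.
    raise RuntimeError(f"brev create failed (rc={rc}): {err.strip()}")
```

Because the workspace can take a moment to become visible, a real caller may want to poll `workspace_exists` a few times before giving up; that is left out here to keep the sketch short.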

What would fix this on the CLI side

Pick whichever fits the architecture best:

  1. Server-side write idempotency + client retry. If POST /workspaces is keyed by a client-supplied request ID, the CLI can retry on EOF without risking a duplicate workspace. Today's behaviour suggests the write succeeds before the response is fully returned, so a retry against an idempotent endpoint would return the workspace that was already created instead of creating a second one.
  2. CLI fallback on EOF: when the response is truncated mid-payload, the CLI could internally call GET /workspaces?name=<name> to check whether the workspace was actually created, and treat that result as the source of truth before reporting failure.
  3. At minimum, a clearer signal in the error message that the EOF might mean "successfully created but response truncated" — so callers know to verify via brev ls rather than assume failure and clean up.
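
To make option 1 concrete, here is a small in-memory simulation of a retry keyed by a client-supplied request ID. Nothing in it reflects the actual Brev API or CLI internals: `FakeWorkspaceAPI`, `request_id`, and `create_with_retry` are hypothetical names, and the dropped response is simulated with a plain `EOFError`.

```python
import uuid
from typing import Dict, List, Optional


class FakeWorkspaceAPI:
    """In-memory stand-in for a create endpoint keyed by request ID (hypothetical)."""

    def __init__(self, drop_first_response: bool = False) -> None:
        self.by_request_id: Dict[str, dict] = {}
        self.workspaces: List[dict] = []
        self._drop_next = drop_first_response

    def create(self, request_id: str, name: str) -> dict:
        # Only perform the write if this request ID has not been seen before.
        if request_id not in self.by_request_id:
            ws = {"id": f"ws-{len(self.workspaces) + 1}", "name": name}
            self.workspaces.append(ws)
            self.by_request_id[request_id] = ws
        if self._drop_next:
            # Simulate the write succeeding but the response being cut off.
            self._drop_next = False
            raise EOFError("unexpected EOF")
        return self.by_request_id[request_id]


def create_with_retry(api: FakeWorkspaceAPI, name: str, attempts: int = 3) -> dict:
    """Retry on EOF, reusing one request ID so retries cannot duplicate the write."""
    request_id = str(uuid.uuid4())  # generated once, reused on every attempt
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return api.create(request_id, name)
        except EOFError as exc:
            last_error = exc
    raise RuntimeError(f"create failed after {attempts} attempts") from last_error


if __name__ == "__main__":
    api = FakeWorkspaceAPI(drop_first_response=True)
    workspace = create_with_retry(api, "demo")
    # The first response was dropped, yet only one workspace was created.
    print(workspace["name"], len(api.workspaces))
```

The key design point is that the request ID is generated once per logical create, not once per HTTP attempt; generating it inside the loop would defeat the deduplication.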

Workaround for anyone hitting this today

Verify via brev ls --json after every non-zero brev create exit before treating it as a hard failure.
