Large dataset file upload (≥100 GB) hangs with ERR_CONNECTION_TIMED_OUT and shows "Upload failed. Please retry." #5144

@eugenegujing

What happened?

I raised dataset.single_file_upload_max_size_mib to 204800 (200 GiB) on the Admin Settings page so that the frontend size check accepts large files. Uploading a single .h5 file larger than 100 GB to a dataset then fails.

The dataset page only shows the generic toast: Upload failed. Please retry.

The browser console shows that the multipart-upload requests for many individual parts time out at the TCP level (net::ERR_CONNECTION_TIMED_OUT), e.g.:

  :4200/api/dataset/multipart-upload?...&partNumber=6     net::ERR_CONNECTION_TIMED_OUT
  :4200/api/dataset/multipart-upload?...&partNumber=90    net::ERR_CONNECTION_TIMED_OUT
  :4200/api/dataset/multipart-upload?...&partNumber=115   net::ERR_CONNECTION_TIMED_OUT
  :4200/api/dataset/multipart-upload?...&partNumber=160   net::ERR_CONNECTION_TIMED_OUT
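
To triage a longer console dump, a small script can group the failed part numbers by error code. This is only a sketch: the function name failed_parts and the regular expression are illustrative, not part of Texera, and the sample input is the four lines above with the elided "..." query string left as-is.

```python
import re

# Console lines copied from this report; the "..." in each URL is elided in the original.
CONSOLE_LINES = """\
:4200/api/dataset/multipart-upload?...&partNumber=6     net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=90    net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=115   net::ERR_CONNECTION_TIMED_OUT
:4200/api/dataset/multipart-upload?...&partNumber=160   net::ERR_CONNECTION_TIMED_OUT
"""

# Hypothetical pattern: a partNumber query parameter followed later by a net::ERR_* code.
PART_RE = re.compile(r"[?&]partNumber=(\d+)\b.*?net::(ERR_[A-Z_]+)")


def failed_parts(text):
    """Return {error_code: sorted part numbers} for lines that match PART_RE."""
    grouped = {}
    for line in text.splitlines():
        match = PART_RE.search(line)
        if match:
            grouped.setdefault(match.group(2), []).append(int(match.group(1)))
    return {err: sorted(parts) for err, parts in grouped.items()}


if __name__ == "__main__":
    for err, parts in failed_parts(CONSOLE_LINES).items():
        print(err, len(parts), parts)
```

Running it on a full DevTools export would show whether the timeouts cluster in a range of part numbers or are spread across the whole upload.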

How to reproduce?

  1. Run Texera locally (frontend on :4200, file-service backend).
  2. As an admin, open Admin → Settings, and set:
    - Max File Size (MiB): 204800 (i.e. 200 GiB)
    - Part Size (MiB): 50 (default)
    Save.
  3. Create a new dataset, open the dataset page, click "Browse & Upload Files" (or drag-and-drop).
  4. Select a single ≥100 GB file (in my case a 179 GB .h5 file).
  5. Observe:
    • Browser DevTools → Network shows many requests to /api/dataset/multipart-upload?...&partNumber=N failing with net::ERR_CONNECTION_TIMED_OUT.
    • The dataset page shows only the toast: Upload failed. Please retry.
    • No new version is created.
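
For scale, the settings in step 2 imply several thousand part requests for a file of this size. The sketch below works through the arithmetic. It assumes the upload is split into fixed-size parts with a possibly smaller final part, which is an assumption about how the frontend splits the file, not something confirmed here. The helper names are illustrative, and because the report does not say whether "179 GB" is decimal or binary, both readings are computed.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def mib_to_gib(size_mib):
    """Convert a MiB setting value to GiB."""
    return size_mib / 1024


def part_count(file_size_bytes, part_size_mib):
    """Number of parts if the file is cut into fixed-size parts (ceiling division)."""
    part_size_bytes = part_size_mib * MIB
    return -(-file_size_bytes // part_size_bytes)


if __name__ == "__main__":
    # Max File Size (MiB) from step 2.
    print("max size GiB:", mib_to_gib(204800))

    # 179 GB read as decimal gigabytes vs. binary gibibytes, with 50 MiB parts.
    print("parts (179 * 10^9 bytes):", part_count(179 * 10**9, 50))
    print("parts (179 GiB):", part_count(179 * GIB, 50))
```

Under either reading the upload needs roughly 3,400 to 3,700 part requests, so even a small fraction of timed-out parts produces many failed requests in the Network tab.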

Branch

main

Commit Hash (Optional)

No response

What browsers are you seeing the problem on?

No response

Relevant log output
