Conversation

@mitali401 (Contributor)

Emit user warning for disable_prompt_cache or --no-prompt-cache params

```shell
poetry run together endpoints create \
  --model "meta-llama/Llama-3-8b-chat-hf" \
  --gpu h100 \
  --gpu-count 1 \
  --min-replicas 1 \
  --max-replicas 1 \
  --no-prompt-cache \
  --no-speculative-decoding
```

```text
/Users/MMeratwal/Library/Application Support/pypoetry/venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
/Users/MMeratwal/Desktop/together-python/src/together/cli/api/endpoints.py:175: UserWarning: The 'disable_prompt_cache' parameter (CLI flag: '--no-prompt-cache') is deprecated and will be removed in a future version.
  response = client.endpoints.create(
Created dedicated endpoint with:
  Model: meta-llama/Llama-3-8b-chat-hf
  Min replicas: 1
  Max replicas: 1
  Hardware: 1x_nvidia_h100_80gb_sxm
  Prompt cache: disabled
  Speculative decoding: disabled
Endpoint created successfully, id: endpoint-8b513273-1545-4c43-9843-502e2b3f1ebf
Waiting for endpoint to be ready...
```

closes: https://linear.app/together-ai/issue/MLE-2917/emit-user-warning-for-using-prompt-cache-param
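
For context, here is a minimal sketch of how a deprecated CLI flag can emit this kind of warning. This is illustrative only, not the PR's actual diff: the option wiring, function name, and handler body are assumptions; the warning text is copied from the output above, and `click` is assumed as the CLI framework based on the `src/together/cli/` path in the traceback.

```python
# Hypothetical sketch: warn when a deprecated flag is passed.
# The flag/parameter names mirror the PR description; the handler is illustrative.
import warnings

import click


@click.command()
@click.option(
    "--no-prompt-cache",
    "disable_prompt_cache",
    is_flag=True,
    default=False,
    help="Deprecated: disable the prompt cache for the endpoint.",
)
def create(disable_prompt_cache: bool) -> None:
    """Create a dedicated endpoint (illustrative stub)."""
    if disable_prompt_cache:
        warnings.warn(
            "The 'disable_prompt_cache' parameter (CLI flag: '--no-prompt-cache') "
            "is deprecated and will be removed in a future version.",
            UserWarning,
            stacklevel=2,
        )
    # ... the real command would go on to call client.endpoints.create(...) ...
```

One note on the visible design choice: the output shows a `UserWarning` rather than a `DeprecationWarning`. Python hides `DeprecationWarning` by default outside of code running in `__main__`, so a `UserWarning` is the reliable way to ensure CLI end users actually see the notice.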

@mitali401 merged commit 876f1ca into main on Jan 20, 2026 (11 checks passed).
@mitali401 deleted the mitali/deprecate-prompt-cache-sdk branch on January 20, 2026 at 22:13.
@atihkin (Contributor) left a comment:

lgtm
