Include text/event-stream header only when stream=True #98
Closed
vladimirivic wants to merge 1 commit into main
Summary:
We want to use headers for content negotiation.
Sending the `Accept: text/event-stream` header on every request causes the server to return chunks, even when the stream=True param is not passed:
```
llama-stack-client inference chat-completion --message="Hello there"
{"event":{"event_type":"start","delta":"Hello"}}
{"event":{"event_type":"progress","delta":"!"}}
{"event":{"event_type":"progress","delta":" How"}}
{"event":{"event_type":"progress","delta":" are"}}
{"event":{"event_type":"progress","delta":" you"}}
{"event":{"event_type":"progress","delta":" today"}}
```
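The intended behavior can be sketched as a small helper. This is a minimal illustration, not the generated client code: `build_headers` is a hypothetical name, and only the `Accept: text/event-stream` value and the `extra_headers` merge order are taken from the diff under review.

```python
from typing import Dict, Mapping, Optional


def build_headers(
    stream: bool = False,
    extra_headers: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: add the SSE Accept header only for streaming calls."""
    headers = dict(extra_headers or {})
    if stream:
        # Caller-supplied headers are merged last, so they still override the
        # default Accept value, matching the merge order shown in the diff.
        headers = {"Accept": "text/event-stream", **headers}
    return headers
```

With this shape, a non-streaming call sends no `Accept: text/event-stream` header, so the server has no reason to answer with event chunks.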
Test Plan:
```
pip install .
llama-stack-client configure --endpoint={endpoint} --api-key={api-key}
llama-stack-client inference chat-completion --message="Hello there"
ChatCompletionResponse(completion_message=CompletionMessage(content='Hello! How can I assist you today?', role='assistant', stop_reason='end_of_turn', tool_calls=[]), logprobs=None)
```
ashwinb
reviewed
Jan 26, 2025
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> InferenceChatCompletionResponse | Stream[InferenceChatCompletionResponse]:
    extra_headers = {"Accept": "text/event-stream", **(extra_headers or {})}
    if stream is True:
Contributor
This should be `if stream`, but the higher-level issue is that this is generated code. We need to make sure this patch is always re-applied automatically after generation (see stainless_sync.sh), or find another way.
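The difference between the two checks can be shown in isolation. This is only a sketch: both function names are hypothetical, and whether the generated client ever receives a non-`bool` value for `stream` is an assumption, not something the diff confirms.

```python
def wants_stream_strict(stream: object) -> bool:
    """Identity check, as in the patch: only the literal True enables streaming."""
    return stream is True


def wants_stream_truthy(stream: object) -> bool:
    """Truthiness check, as the review suggests: any truthy value enables streaming."""
    return bool(stream)


# Compare both checks on a few representative inputs.
for value in (True, False, None, 1):
    print(repr(value), wants_stream_strict(value), wants_stream_truthy(value))
```

The two checks agree on `True`, `False`, and `None`, and differ only for truthy values that are not the `True` singleton, such as `1`.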
Contributor
Is this still needed after #108?
Contributor
@yanxi0830 Yes, this is still needed, but for a different reason: this header is sent by the client to the server, not the other way round.