Is there a way to stop inference when a client cancels a request?
Use case: I send a request and cancel it mid-stream. If I then immediately send a second request, its response starts in the middle of the first request's response.
How can I prevent this from happening?