What feature would you like to see?
A way to truly cancel an in-flight sendMessageStream / generateContentStream call so that the model stops generating on the server side, not just on the client.
Today, cancelling the StreamSubscription only stops the client from reading new chunks. The HTTPS connection stays open and the model keeps generating (and billing) tokens up to maxOutputTokens. That makes a real "Stop generating" button impossible in chat UIs: pressing Stop freezes the UI but doesn't actually save anything server-side.
Why this matters
For a chat surface with tool-call chains (Gemini Flash, maxOutputTokens: 8192, recursive tool dispatcher), a single runaway prompt can burn 100k+ output tokens server-side after the user has visually "stopped" the chat. At scale that's real cost, and product-wise it means apps can't honestly ship a Stop button.
Proposed minimal change
The root cause is that package:http itself doesn't yet support cancellation (dart-lang/http#204). That one is on the wider Dart team and will take time. But firebase_ai's internal Client already accepts an http.Client? (client.dart:45), and so do createGenerativeModel and the internal GenerativeModel constructor. The public factory just doesn't pass it through:
// firebase_ai.dart
GenerativeModel generativeModel({
required String model,
// ... existing params ...
http.Client? httpClient, // add this
}) =>
createGenerativeModel(
// ... existing args ...
httpClient: httpClient,
);
That one-line plumbing change immediately unblocks the workaround: callers can inject a cancellable wrapper like cancellation_token_http and call .close() on cancel. Closing the socket signals Gemini to stop generating, so server-side billing actually stops.
No breaking changes; existing callers see no difference.
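For illustration, the cancellable wrapper mentioned above could look roughly like the sketch below. It is built only on package:http's BaseClient. The class name CancellableClient and its cancel() method are hypothetical, and whether closing the inner client actually aborts an in-flight request depends on the platform client implementation, so treat it as a starting point rather than a verified fix.

```dart
import 'package:http/http.dart' as http;

/// Hypothetical wrapper (not part of firebase_ai or package:http):
/// forwards requests to an inner client and closes that client on cancel.
class CancellableClient extends http.BaseClient {
  CancellableClient([http.Client? inner]) : _inner = inner ?? http.Client();

  final http.Client _inner;
  bool _cancelled = false;

  /// True once cancel() or close() has been called.
  bool get isCancelled => _cancelled;

  @override
  Future<http.StreamedResponse> send(http.BaseRequest request) async {
    // Refuse new requests after cancellation instead of reopening sockets.
    if (_cancelled) {
      throw http.ClientException('Request cancelled', request.url);
    }
    return _inner.send(request);
  }

  /// Closes the inner client. The assumption is that this tears down the
  /// open connection so the server sees the stream go away.
  void cancel() {
    if (_cancelled) return;
    _cancelled = true;
    _inner.close();
  }

  @override
  void close() => cancel();
}
```

With the proposed httpClient parameter in place (it does not exist in firebase_ai today), a chat screen would create one CancellableClient per streamed reply, pass it to generativeModel(model: ..., httpClient: client), and call client.cancel() from the Stop button in addition to cancelling the StreamSubscription.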
Workarounds considered
- Forking firebase_ai locally just to add the param: works but adds maintenance burden.
- Using package:google_generative_ai: incompatible with Firebase App Check + Firebase auth.
- Calling Vertex AI REST directly: loses tool-call schema mapping, function-call parsing, App Check token injection.
Environment
- firebase_ai: 3.11.0
- Flutter 3.x (web + iOS + Android)
- Gemini 2.5 Flash via FirebaseAI.googleAI()
BTW, you rock