feat(settings): make token-by-token streaming reveal opt-in (default off) by FuJacob · Pull Request #692 · FuJacob/cotabby

FuJacob · 2026-06-12T05:08:08Z

Summary

Suggestions were appearing token-by-token (read as "character by character") because PR #687 streams ghost text live as the model decodes. This adds a "Stream Suggestions While Generating" toggle (Appearance → Display), defaulting off, so suggestions appear once, fully formed, after generation finishes. Power users can opt back into the live streaming reveal.

The gate is at the prediction dispatch: when streaming is off, no onPartial handler is passed to the engine, so the engine skips its per-token main-actor hops entirely and the suggestion is presented once through apply(). When on, the existing streamed-partial behavior (each partial rendered as an acceptable session you can Tab into early) is preserved unchanged.
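A minimal sketch of the gate, for illustration only. `SketchEngine`, `PartialHandler`, and the `render` callback are hypothetical stand-ins, not the real engine or coordinator API; the only things taken from this PR are the ideas that `onPartial` is optional and that the final result is always presented once.

```swift
// Hypothetical handler type: receives the text accumulated so far.
typealias PartialHandler = (String) -> Void

// Hypothetical stand-in for the suggestion engine.
struct SketchEngine {
    let tokens: [String]

    // When onPartial is nil, the per-token callback is skipped entirely.
    func generateSuggestion(onPartial: PartialHandler?) -> String {
        var text = ""
        for token in tokens {
            text += token
            onPartial?(text)
        }
        return text
    }
}

// Hypothetical dispatch: the flag only decides whether a handler is wired up.
func dispatchGeneration(
    engine: SketchEngine,
    shouldStreamPartials: Bool,
    render: @escaping (String) -> Void
) -> String {
    let onPartial: PartialHandler? = shouldStreamPartials ? { partial in render(partial) } : nil
    let result = engine.generateSuggestion(onPartial: onPartial)
    render(result) // stands in for the one-shot apply() of the final result
    return result
}
```

With the flag off, `render` runs exactly once with the finished text; with it on, it also runs once per token before the final call.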

Validation

xcodebuild -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS' build -derivedDataPath build/DerivedData
# ** BUILD SUCCEEDED **

xcodebuild ... test -only-testing:CotabbyTests/SettingsIndexTests \
  -only-testing:CotabbyTests/SuggestionSettingsStoreTests \
  -only-testing:CotabbyTests/SuggestionSettingsModelTests \
  -only-testing:CotabbyTests/SuggestionCoordinatorPredictionTests \
  -only-testing:CotabbyTests/StreamedGhostTextPolicyTests \
  -only-testing:CotabbyTests/LlamaSuggestionEngineStreamingTests \
  CODE_SIGNING_ALLOWED=NO CODE_SIGNING_REQUIRED=NO
# ** TEST SUCCEEDED **  (all suites, 0 failures)

swiftlint lint --quiet   # exit 0

UI: new toggle in Appearance → Display, under "Suggestion Display".

Linked issues

Risk / rollout notes

New persisted setting: UserDefaults key cotabbyStreamSuggestionsWhileGenerating, defaults to false. This is a behavior change vs. current main, where PR #687 (perf(stream): render ghost text while the model is still decoding) streamed by default. That's the intent: revert the default to all-at-once and make streaming opt-in.
Threaded through the full settings stack (data → store load/write-back/save → model @Published/setter/snapshot/Combine publisher → snapshot struct → Appearance UI toggle → search index). The new Combine upstream is grouped into the existing acceptance-toggle slot (CombineLatest3) to stay under Combine's four-input cap.
The streaming code path (queueStreamedPartial / applyStreamedPartial / StreamedGhostTextPolicy) is untouched and fully exercised when the toggle is on; the gate only decides whether onPartial is wired up.
No project.yml/pbxproj changes.
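For reference, a rough sketch of how a default-false UserDefaults flag can be loaded and saved. `StreamSettingStore` and its method names are hypothetical and are not the actual `SuggestionSettingsStore` code; only the key name and the default come from this PR.

```swift
import Foundation

// Hypothetical, simplified store for the single new flag.
struct StreamSettingStore {
    // Key name as stated in this PR.
    static let key = "cotabbyStreamSuggestionsWhileGenerating"

    let defaults: UserDefaults

    // object(forKey:) is nil when the key was never written,
    // so a fresh install resolves to the default of false.
    func load() -> Bool {
        (defaults.object(forKey: Self.key) as? Bool) ?? false
    }

    func save(_ enabled: Bool) {
        defaults.set(enabled, forKey: Self.key)
    }
}
```

Existing users who never touch the toggle have no stored value, so they land on the all-at-once behavior after updating.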

Greptile Summary

This PR makes token-by-token suggestion streaming opt-in (defaulting off), reversing the behavior introduced in #687. A new "Stream Suggestions While Generating" toggle is added to Appearance → Display, and the gate is implemented by conditionally omitting onPartial from the engine call.

Settings stack: streamSuggestionsWhileGenerating is threaded through all layers — SuggestionSettingsData, SuggestionSettingsStore (UserDefaults key cotabbyStreamSuggestionsWhileGenerating, default false), SuggestionSettingsModel (@Published + setter + Combine publisher via CombineLatest3), and SuggestionSettingsSnapshot.
Dispatch gate: dispatchGeneration reads the flag from the snapshot on the main actor before creating the work closure, capturing a plain Bool so the streaming decision is stable for the duration of a single generation.
UI & search: New toggle in the Display section with accurate "token-by-token" copy; SettingsIndex case with comprehensive search keywords for discoverability.
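The "capture a plain Bool" point in the dispatch gate item can be illustrated with a small sketch. `SettingsBox` and `makeWork` are hypothetical names, and the real code reads from the settings snapshot on the main actor; this only shows why a value read before the closure is created does not change mid-generation.

```swift
// Hypothetical mutable settings holder standing in for the live snapshot source.
final class SettingsBox {
    var streamSuggestionsWhileGenerating = false
}

// Reads the flag once, then returns work that only sees the captured value.
func makeWork(settings: SettingsBox) -> () -> Bool {
    let shouldStreamPartials = settings.streamSuggestionsWhileGenerating
    return {
        // The closure captured a Bool value, not the settings object,
        // so later toggles do not affect this generation.
        shouldStreamPartials
    }
}
```

A toggle flipped while work is in flight takes effect on the next dispatch, not the current one.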

Confidence Score: 5/5

Safe to merge. The change is purely additive — a new opt-in toggle that defaults off — and the existing streaming code path is untouched.

The gate is read once on the main actor before a generation is dispatched, so the streaming decision is stable for the duration of any single generation and cannot change mid-flight. The new setting travels through every layer (data, store, model, snapshot, UI) in a pattern identical to adjacent settings. CombineLatest3 correctly replaces the prior CombineLatest without exceeding Combine's four-input cap, and the test fixture default matches the production default of false.
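A sketch of the grouping technique mentioned above, under stated assumptions. Combine's largest built-in combinator is `CombineLatest4`, so models with many inputs typically combine related publishers into a group first and then combine the groups. All type and property names below (`AcceptanceToggles`, `SketchSettingsModel`, `acceptOnTab`, `acceptOnReturn`, `ghostTextOpacity`) are hypothetical; the real model's publisher layout is not shown in this PR description.

```swift
import Combine

// Hypothetical value type for one grouped slot of related toggles.
struct AcceptanceToggles: Equatable {
    var acceptOnTab: Bool
    var acceptOnReturn: Bool
    var streamWhileGenerating: Bool
}

final class SketchSettingsModel {
    let acceptOnTab = CurrentValueSubject<Bool, Never>(true)
    let acceptOnReturn = CurrentValueSubject<Bool, Never>(false)
    let streamWhileGenerating = CurrentValueSubject<Bool, Never>(false)
    let ghostTextOpacity = CurrentValueSubject<Double, Never>(0.5)

    // The three toggles occupy a single input of the outer combinator,
    // so adding the third toggle does not add an outer input.
    var snapshotPublisher: AnyPublisher<(AcceptanceToggles, Double), Never> {
        let toggles = Publishers.CombineLatest3(acceptOnTab, acceptOnReturn, streamWhileGenerating)
            .map { values in
                AcceptanceToggles(
                    acceptOnTab: values.0,
                    acceptOnReturn: values.1,
                    streamWhileGenerating: values.2
                )
            }
        return Publishers.CombineLatest(toggles, ghostTextOpacity).eraseToAnyPublisher()
    }
}
```

Widening an inner group from two inputs to three leaves the outer combinator's input count unchanged, which is the property the PR relies on.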

No files require special attention.

Important Files Changed

Filename	Overview
Cotabby/App/Coordinators/SuggestionCoordinator+Prediction.swift	Adds a `shouldStreamPartials` Bool captured from the snapshot on the main actor before dispatching work; wires it into the `onPartial` handler decision. Clean gate that preserves the streaming path fully when on.
Cotabby/Models/SuggestionEngineModels.swift	Adds `streamSuggestionsWhileGenerating: Bool` to `SuggestionSettingsSnapshot`. Well-documented and consistent with adjacent fields.
Cotabby/Models/SuggestionSettingsData.swift	Adds `streamSuggestionsWhileGenerating: Bool` to the durable data value type. Single initialization site updated in `SuggestionSettingsStore`; no default needed.
Cotabby/Models/SuggestionSettingsModel.swift	Threads the new `@Published` property through init, snapshot construction, setter, and Combine publisher (replacing `CombineLatest` with `CombineLatest3` to stay within the four-input cap). Correct and consistent with the existing pattern.
Cotabby/Support/SuggestionSettingsStore.swift	Adds `streamWhileGeneratingDefaultsKey`, load/resolve, and save logic. UserDefaults key `cotabbyStreamSuggestionsWhileGenerating` defaults to false. Consistent with adjacent settings.
Cotabby/UI/Settings/Panes/AppearancePaneView.swift	Adds `streamWhileGeneratingBinding` and its Toggle into the Display section. Description text says "token-by-token" which is accurate.
Cotabby/UI/Settings/SettingsIndex.swift	Adds `streamWhileGenerating` case with correct pane routing, icon, and search keywords. Thorough keyword set for discoverability.
CotabbyTests/CotabbyTestFixtures.swift	Adds `streamSuggestionsWhileGenerating: Bool = false` default parameter to the snapshot fixture factory. Default matches production default.

Sequence Diagram

sequenceDiagram
    participant User
    participant AppearancePaneView
    participant SuggestionSettingsModel
    participant SuggestionSettingsStore
    participant SuggestionCoordinator
    participant SuggestionEngine

    User->>AppearancePaneView: Toggle "Stream Suggestions While Generating"
    AppearancePaneView->>SuggestionSettingsModel: setStreamSuggestionsWhileGenerating(enabled)
    SuggestionSettingsModel->>SuggestionSettingsStore: saveStreamSuggestionsWhileGenerating(enabled)
    SuggestionSettingsStore->>SuggestionSettingsStore: userDefaults.set(enabled, forKey:)
    SuggestionSettingsModel->>SuggestionSettingsModel: Combine publisher fires → snapshot updated

    Note over SuggestionCoordinator: On next keystroke / focus event
    SuggestionCoordinator->>SuggestionCoordinator: dispatchGeneration() reads settingsSnapshot.streamSuggestionsWhileGenerating

    alt "streamSuggestionsWhileGenerating == true"
        SuggestionCoordinator->>SuggestionEngine: generateSuggestion(onPartial: queueStreamedPartial)
        SuggestionEngine-->>SuggestionCoordinator: onPartial(partial) per token
        SuggestionCoordinator->>SuggestionCoordinator: queueStreamedPartial → render ghost text live
        SuggestionEngine-->>SuggestionCoordinator: final result
        SuggestionCoordinator->>SuggestionCoordinator: apply(result)
    else "streamSuggestionsWhileGenerating == false (default)"
        SuggestionCoordinator->>SuggestionEngine: generateSuggestion(onPartial: nil)
        SuggestionEngine-->>SuggestionCoordinator: final result (no per-token hops)
        SuggestionCoordinator->>SuggestionCoordinator: apply(result) — one-shot reveal
    end

Reviews (2): Last reviewed commit: "docs(settings): describe streaming revea..."

…off) PR #687 added streaming ghost text that reveals a suggestion token-by-token as the model decodes. Some users read the incremental reveal as the suggestion "coming out character by character" and prefer it to appear once, fully formed. Add a "Stream Suggestions While Generating" toggle (Appearance > Display), defaulting off. When off, the prediction path passes no onPartial handler, so the engine skips its per-token main-actor hops and the suggestion is presented once through apply(). When on, the existing streamed-partial behavior (each partial an acceptable session you can Tab into early) is preserved.

…-by-word Address Greptile P2: LLM decoding is token-by-token (sub-word fragments), so the toggle copy should not promise word granularity.

greptile-apps Bot reviewed Jun 12, 2026

Comment thread Cotabby/UI/Settings/Panes/AppearancePaneView.swift Outdated

docs(settings): describe streaming reveal as token-by-token, not word…

dfeaf0b


FuJacob merged commit 646ad6e into main Jun 12, 2026
4 checks passed

FuJacob deleted the fix/suggestion-streaming-char-by-char branch June 12, 2026 05:15

FuJacob merged 2 commits into main from fix/suggestion-streaming-char-by-char
