Closed
Commits
83 commits
a74b9d5
Fix: upload Tauri updater .sig sidecars (tauri-action 0.6.2 rename)
cryptopoly May 1, 2026
8e86c6c
Phase 1 chat uplift: highlighting, search, export, real cancel, effor…
cryptopoly May 1, 2026
959545e
Hide MLX-only catalog variants on non-Apple platforms
cryptopoly May 1, 2026
613e3c9
Fix Windows CUDA detection + post-install runtime probe
cryptopoly May 1, 2026
2a7cdfd
Phase 2.0 chat uplift: prompt-processing feedback + TTFT
cryptopoly May 1, 2026
dd7d20c
Preserve Windows GPU runtime on uninstall + lock extras path
cryptopoly May 1, 2026
f1d4d8a
Phase 2.0.5 watchdogs: prompt-eval timeout + memory gate + runaway gu…
cryptopoly May 1, 2026
dd284c8
Phase 2.0.5 hardening: tok/s floor, repetition guard, panic + thermal…
cryptopoly May 1, 2026
8cd4cd0
Phase 2.1 decompose ChatTab.tsx into ChatSidebar / ChatHeader / ChatT…
cryptopoly May 1, 2026
59894fd
Phase 2.2 full sampler exposure: top_p / top_k / min_p / repeat_penal…
cryptopoly May 1, 2026
90e4fc5
Phase 2.11 model capability declarations + composer auto-gating
cryptopoly May 1, 2026
0793282
Phase 2.12 mid-thread model swap with one-turn override
cryptopoly May 1, 2026
72ab7c4
Hotfix: relax memory-gate ceilings + gate vision capability by engine
cryptopoly May 1, 2026
fbb168a
Hotfix v2: visionEnabled flag gates image attach across all runtimes
cryptopoly May 1, 2026
174f47b
Phase 2.6 cross-platform RAG: semantic embedding via llama-embedding …
cryptopoly May 1, 2026
260c64e
Wire --mmproj for llama.cpp vision: sibling detection + visionEnabled…
cryptopoly May 1, 2026
91965e5
Phase 2.10 MCP client: stdio JSON-RPC + tool adapter + provenance
cryptopoly May 1, 2026
ce53f28
Phase 2.8 structured tool output: tools render as table / code / mark…
cryptopoly May 1, 2026
07dd06c
Phase 2.4 conversation branching: fork from any assistant message
cryptopoly May 1, 2026
f583d42
Phase 2.5 in-thread compare: sibling variants under assistant bubble
cryptopoly May 2, 2026
b26e58d
Phase 2.11 capability badges: typed flags surface across all model pi…
cryptopoly May 2, 2026
3a37e77
Phase 2.2 close-out: JSON-schema constrained-output opt-in
cryptopoly May 2, 2026
db1acce
Phase 2.7 prompt presets + variables: fill-form before Use in Chat
cryptopoly May 2, 2026
e294021
Phase 2.13 OpenAI-compatible server: full sampler chain + embeddings
cryptopoly May 2, 2026
8907709
Phase 2.14 catalog browser: VRAM-fit hints on Discover variants
cryptopoly May 2, 2026
26bc0b7
Reasoning panel: collapsible streaming preview + close first-paragrap…
cryptopoly May 2, 2026
0d8b7f2
Phase 3.4 substrate routing inspector: per-turn badge above metrics
cryptopoly May 2, 2026
7c369ff
Phase 3.2 KV strategy chip: per-turn cache override in composer
cryptopoly May 2, 2026
e343fbe
Phase 3.8 chat-template inspection: detect Gemma + ChatML quirks
cryptopoly May 2, 2026
c510b4d
Phase 3.5 cross-platform perf telemetry: per-turn host strip
cryptopoly May 2, 2026
f969a4f
Phase 3.6 Delve mode: critic-pass on assistant messages
cryptopoly May 2, 2026
7207113
Phase 3.7 workspace knowledge stacks: shared RAG corpus across sessions
cryptopoly May 2, 2026
67807b5
Phase 3.3 logprobs viz (advanced-mode gated): per-message confidence …
cryptopoly May 2, 2026
9237355
Phase 3.1 DDTree accepted-token overlay: substrate truth view
cryptopoly May 2, 2026
1723a38
KV chip + DFlash UX hotfixes from smoke test feedback
cryptopoly May 2, 2026
db861fa
Phase 3.1 + 3.8 follow-ups: DDTree-tree spans + llama.cpp chat-templa…
cryptopoly May 2, 2026
e4f44c2
Phase 3.3 follow-up: MLX logprobs passthrough on streaming path
cryptopoly May 2, 2026
a43edb9
FU-015..FU-021: image+video perf bundle (FBCache, SDXL VAE fp16, dist…
cryptopoly May 3, 2026
2401c78
Wire STG slider through to mlx-video subprocess + preset-row-pair styles
cryptopoly May 3, 2026
23447c7
Bump version to 0.7.4
cryptopoly May 3, 2026
80c0874
KV cache chip: harmonize filter with launch-settings modal
cryptopoly May 3, 2026
af61e82
FU-001 close-out: bump turboquant-mlx-full to >=0.3.0
cryptopoly May 3, 2026
676ebd8
Audit phases 1-4 + multimodal images + Gemma 4 channel filter
cryptopoly May 4, 2026
1110e6f
Phase 5 frontend UX: previewVae toggles + kvBudget schema
cryptopoly May 4, 2026
3e40152
Bug 2.1 + CLI runner: Gemma 4 asymmetric channel filter
cryptopoly May 4, 2026
f5684aa
Phase 7 v1: mlx-video Wan convert foundation (FU-025)
cryptopoly May 4, 2026
9d959a4
Phase 8: mlx-video Wan runtime routing (FU-025 closeout)
cryptopoly May 4, 2026
6bb562b
Phase 9: GUI install action for Wan MLX runtime (FU-025 fully closed)
cryptopoly May 4, 2026
e8e1c27
Restore pre-aec1975 card layout for Image/Video Discover + My Models
cryptopoly May 4, 2026
1017ccb
[mlx-vlm] add torchvision dep for Qwen2.5-VL processor build
cryptopoly May 4, 2026
e228e41
Restore catalog tabs to v0.7.2 layout exactly + drop duplicate Wan panel
cryptopoly May 4, 2026
bcf88de
FU-009 close-out: live Wan2.1 MLX smoke + status_for upstream-layout fix
cryptopoly May 4, 2026
9d15842
FU-018 part 1 close-out: preview VAE swap validated end-to-end
cryptopoly May 4, 2026
15b3fe5
FU-006 quarterly re-verify: hold at f825ffb (v0.1.4.1)
cryptopoly May 4, 2026
412d7a6
FU-018 part 2: live denoise thumbnails via callback_on_step_end
cryptopoly May 4, 2026
f08e45c
FU-022: LLM-based prompt enhancer (Apple Silicon)
cryptopoly May 4, 2026
fe34a2c
Restore Wan MLX runtime install UX surface (FU-025 part 9)
cryptopoly May 5, 2026
ddec20d
FU-006 close-out: dflash-mlx pin bump f825ffb -> 8d8545d (v0.1.4.1 ->…
cryptopoly May 5, 2026
bc12d5c
FU-023 + FU-024 + FU-027: CUDA quantization foundations
cryptopoly May 5, 2026
7c0dbc2
FU-024: Studio FP8 layerwise toggle in Image + Video Studio
cryptopoly May 5, 2026
9c62887
Add Windows PowerShell ports of build-llama-turbo + build-sdcpp
cryptopoly May 5, 2026
d0d4f3c
Windows ps1: replace em-dash with ASCII -- so PowerShell parses cleanly
cryptopoly May 5, 2026
f5ef002
Pick a CMake generator explicitly in build-llama-turbo.ps1
cryptopoly May 5, 2026
ee1e3a4
Wipe stale CMake cache when build-llama-turbo switches generator
cryptopoly May 5, 2026
40f8640
Drop -SimpleMatch from CMake cache generator probe
cryptopoly May 5, 2026
861a81a
Detect missing MSVC up front in build-llama-turbo.ps1
cryptopoly May 5, 2026
ee49c4e
Accept VS Build Tools installs that report isComplete=0
cryptopoly May 5, 2026
3a89cf7
Append version= to CMAKE_GENERATOR_INSTANCE for unregistered installs
cryptopoly May 5, 2026
f6c4aea
Auto-sync CUDA VS integration before cmake configure
cryptopoly May 5, 2026
313dd8e
Fix CUDA-integration elevated copy and invalidate stale CMake cache
cryptopoly May 5, 2026
a8a360d
Extract Windows MSVC/CUDA helpers and apply to build-sdcpp.ps1
cryptopoly May 5, 2026
2ce995b
Use python -m pip in build.ps1 to dodge Windows self-upgrade refusal
cryptopoly May 5, 2026
74a1fa6
Diagnose T5EncoderModel error and right-size CogVideoX footprints
cryptopoly May 5, 2026
b352258
Surface CPU torch on CUDA host + raise chat default maxTokens to 4096
cryptopoly May 5, 2026
e6aa419
Fix Studio cache preview returning 0 GB on chat model selection
cryptopoly May 5, 2026
4c5cd79
Make chat cache-fit warning VRAM-aware on CUDA hosts
cryptopoly May 5, 2026
a77f738
Merge branch 'feature/chat-level-up' of https://github.com/cryptopoly…
cryptopoly May 5, 2026
94c6bf0
Run T5 lazy-import diagnostic on generate paths too
cryptopoly May 5, 2026
25bbe0c
Fix Video Studio dropping GPU warning + add inline Install button
cryptopoly May 5, 2026
d78aaa4
Add expandable per-attempt log under Install CUDA torch button
cryptopoly May 5, 2026
5e016fe
Make Install CUDA torch self-debugging + add Restart prompt
cryptopoly May 5, 2026
a047896
Remove Convert Model action + nudge My Models row icons left
cryptopoly May 5, 2026
65f807e
Fix Windows diffusion runtime readiness
cryptopoly May 5, 2026
79 changes: 79 additions & 0 deletions .gitattributes
@@ -0,0 +1,79 @@
# Pin line endings on text files so cross-platform contributors don't
# see phantom "modified" diffs from autocrlf-driven CRLF<->LF flips.
#
# Background: Windows users with `core.autocrlf=true` (the Git for
# Windows default) see Cargo.toml / tauri.conf.json / etc. as modified
# the moment they `git checkout` because the working-tree copy gets
# rewritten with CRLF while origin's blobs are LF. Without this file,
# every status check on Windows lights those up as dirty even though
# no real change was made. With this file, git normalizes them on the
# way in and out and the status stays clean.

# Default: treat as text, normalize to LF in the index. The working
# tree gets the platform's native line ending on checkout (LF on
# macOS/Linux, LF on Windows-with-`core.eol=lf`, CRLF on
# Windows-with-default-config).
* text=auto

# Repo-shape files MUST stay LF in the working tree everywhere -- the
# Tauri / Cargo / npm toolchains all read them with LF assumptions
# even on Windows, and a CRLF-shaped tauri.conf.json caused real
# parse failures earlier in the project history (see the patch-
# tauri-conf.mjs script's "self-heal an empty/corrupt JSON" branch).
*.toml text eol=lf
*.json text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
*.md text eol=lf

# Source files: LF everywhere. Vite + tsc handle either, but pinning
# avoids whitespace-only diffs in PRs.
*.ts text eol=lf
*.tsx text eol=lf
*.js text eol=lf
*.jsx text eol=lf
*.mjs text eol=lf
*.cjs text eol=lf
*.py text eol=lf
*.rs text eol=lf
*.css text eol=lf
*.html text eol=lf

# Shell scripts: LF (would otherwise silently break on macOS / Linux
# with "bad interpreter" errors when bash sees \r in the shebang).
*.sh text eol=lf

# PowerShell: CRLF. The PS 5.1 parser handles either but PowerShell
# scripts authored on Windows traditionally ship CRLF, and Windows
# editors would otherwise rewrite them on save and produce noise.
*.ps1 text eol=crlf
*.psm1 text eol=crlf
*.psd1 text eol=crlf

# Binary blobs that Git would otherwise try to diff/normalize. Mark
# them explicitly so a `text=auto` heuristic mistake can't corrupt
# them on a cross-platform clone.
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.webp binary
*.ico binary
*.icns binary
*.woff binary
*.woff2 binary
*.ttf binary
*.otf binary
*.zip binary
*.gz binary
*.tar binary
*.7z binary
*.exe binary
*.dll binary
*.so binary
*.dylib binary
*.pyd binary
*.safetensors binary
*.gguf binary
*.bin binary
*.onnx binary
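The *.sh pin above is the one with runtime consequences, so it is worth verifying after a renormalize. Below is a hypothetical spot-check, not part of this PR: it assumes Python 3.10+ run from the repo root and scans tracked shell scripts for a stray CR byte that would break the shebang.

# Hypothetical spot-check (not in this diff): flag tracked *.sh files
# that still carry a CR byte, which bash reports as "bad interpreter"
# when it appears in the shebang line.
import pathlib
import subprocess

tracked = subprocess.run(
    ["git", "ls-files", "*.sh"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

for rel in tracked:
    if b"\r" in pathlib.Path(rel).read_bytes():
        print(f"CR in {rel}: run `git add --renormalize .` and recommit")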
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -265,7 +265,7 @@ jobs:
tagName: ${{ inputs.release_tag || github.ref_name }}
tauriScript: npx tauri
args: --bundles ${{ matrix.bundle_targets }} --ci
-includeUpdaterJson: false
+includeUpdaterJson: true
updaterJsonPreferNsis: false

publish-manifest:
3 changes: 2 additions & 1 deletion .gitignore
@@ -14,4 +14,5 @@ assets/
src-tauri/gen/
.env
.env.local
-.claude
+.claude
+AGENTS.md
31 changes: 22 additions & 9 deletions CLAUDE.md

Large diffs are not rendered by default.

27 changes: 26 additions & 1 deletion backend_service/agent.py
@@ -32,6 +32,13 @@ class ToolCallResult:
arguments: dict[str, Any]
result: str
elapsed_seconds: float
# Phase 2.8: optional structured output the frontend can render
# natively (table / code / markdown / image / chart). When None,
# the legacy collapsible-JSON renderer fires. The `result` text
# field is always populated so the language model sees something
# readable on the next turn regardless of UI rendering.
render_as: str | None = None
data: dict[str, Any] | None = None


@dataclass
@@ -108,8 +115,19 @@ def _execute_tool_call(
)

start = time.perf_counter()
render_as: str | None = None
structured_data: dict[str, Any] | None = None
try:
-    result_text = tool.execute(**arguments)
+    # Phase 2.8: try the structured entry first. Tools that
+    # haven't migrated return None and we fall back to the
+    # plain-text path below.
+    structured = tool.execute_structured(**arguments)
+    if structured is not None:
+        result_text = structured.text
+        render_as = structured.render_as
+        structured_data = structured.data
+    else:
+        result_text = tool.execute(**arguments)
except Exception as exc:
    result_text = f"Error executing {tool_name}: {exc}"
elapsed = round(time.perf_counter() - start, 3)
@@ -122,6 +140,8 @@
arguments=arguments,
result=result_text,
elapsed_seconds=elapsed,
render_as=render_as,
data=structured_data,
)


@@ -384,6 +404,11 @@ def run_agent_loop_streaming(
"name": tc_result.tool_name,
"result": tc_result.result[:2000], # Cap for streaming
"elapsed": tc_result.elapsed_seconds,
# Phase 2.8: stream the structured shape so the
# frontend can render it as the tool finishes
# rather than waiting for the final done payload.
"renderAs": tc_result.render_as,
"data": tc_result.data,
},
}

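For readers wiring a tool against the Phase 2.8 contract in the hunks above, here is a minimal sketch of the shape agent.py expects. StructuredResult and DiskUsageTool are hypothetical names for illustration; the PR only fixes the attribute trio (text, render_as, data) and the None fallback, and the real tool base class is not shown in this diff.

from dataclasses import dataclass
from typing import Any


@dataclass
class StructuredResult:
    # text is always populated so the model reads something sensible on
    # the next turn even when the UI renders data natively.
    text: str
    render_as: str | None = None  # e.g. "table" | "code" | "markdown"
    data: dict[str, Any] | None = None


class DiskUsageTool:
    """Hypothetical migrated tool exposing both entry points."""

    def execute(self, path: str = "/") -> str:
        # Legacy plain-text path; still required as the fallback.
        return f"{path}: 42.0 GB used of 512.0 GB"

    def execute_structured(self, path: str = "/") -> StructuredResult | None:
        # Returning None here sends the agent down the legacy execute()
        # path, which is exactly how unmigrated tools keep working.
        return StructuredResult(
            text=f"{path}: 42.0 GB used of 512.0 GB",
            render_as="table",
            data={"columns": ["path", "used_gb", "total_gb"],
                  "rows": [[path, 42.0, 512.0]]},
        )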
84 changes: 80 additions & 4 deletions backend_service/app.py
@@ -84,6 +84,8 @@
CHAT_SESSIONS_PATH = DATA_LOCATION.chat_sessions_path
LIBRARY_CACHE_PATH = DATA_LOCATION.data_dir / "library_cache.json"
DOCUMENTS_DIR = DATA_LOCATION.documents_dir
WORKSPACES_PATH = DATA_LOCATION.workspaces_path
WORKSPACES_DIR = DATA_LOCATION.workspaces_dir
IMAGE_OUTPUTS_DIR = DATA_LOCATION.image_outputs_dir
VIDEO_OUTPUTS_DIR = DATA_LOCATION.video_outputs_dir
MAX_DOC_SIZE_BYTES = 50 * 1024 * 1024 # 50 MB per file
@@ -351,6 +353,20 @@ def _generate_image_artifacts(
logger.info("Generating image: model=%s repo=%s size=%dx%d steps=%d draft=%s",
variant.get("name"), variant.get("repo"), effective_width, effective_height, request.steps, request.draftMode)
runtime_manager = runtime_manager or ImageRuntimeManager()
# FU-019: variant-declared defaults override schema defaults only
# when the user hasn't moved the slider. Schema defaults (24 steps,
# CFG 5.5) come from ImageGenerationRequest in models/__init__.py.
SCHEMA_DEFAULT_STEPS = 24
SCHEMA_DEFAULT_GUIDANCE = 5.5
effective_steps = request.steps
effective_guidance = request.guidance
variant_default_steps = variant.get("defaultSteps")
variant_cfg_override = variant.get("cfgOverride")
if variant_default_steps is not None and request.steps == SCHEMA_DEFAULT_STEPS:
    effective_steps = int(variant_default_steps)
if variant_cfg_override is not None and abs(request.guidance - SCHEMA_DEFAULT_GUIDANCE) < 1e-3:
    effective_guidance = float(variant_cfg_override)

rendered_images, runtime_status = runtime_manager.generate(
ImageGenerationConfig(
modelId=request.modelId,
@@ -360,15 +376,39 @@ def _generate_image_artifacts(
negativePrompt=request.negativePrompt or "",
width=effective_width,
height=effective_height,
-steps=request.steps,
-guidance=request.guidance,
+steps=effective_steps,
+guidance=effective_guidance,
batchSize=request.batchSize,
seed=request.seed,
qualityPreset=request.qualityPreset,
sampler=request.sampler,
ggufRepo=(variant.get("ggufRepo") or None),
ggufFile=(variant.get("ggufFile") or None),
runtime=(variant.get("engine") or None),
cacheStrategy=request.cacheStrategy,
cacheRelL1Thresh=request.cacheRelL1Thresh,
cfgDecay=request.cfgDecay,
previewVae=request.previewVae,
# FU-019: variant-declared LoRA + step / guidance overrides.
# When the catalog variant pins a Hyper-SD / FLUX-Turbo /
# lightx2v LoRA, the engine fuses it into the pipeline at
# load time. ``defaultSteps`` / ``cfgOverride`` substitute
# only when the user kept the schema defaults — explicit
# slider tweaks survive untouched.
loraRepo=(variant.get("loraRepo") or None),
loraFile=(variant.get("loraFile") or None),
loraScale=(variant.get("loraScale") if variant.get("loraScale") is not None else None),
defaultSteps=(variant.get("defaultSteps") if variant.get("defaultSteps") is not None else None),
cfgOverride=(variant.get("cfgOverride") if variant.get("cfgOverride") is not None else None),
# FU-023: variant-pinned Nunchaku SVDQuant snapshot. Threads
# through to ``_ensure_pipeline`` which prefers it over
# NF4 / int8wo on CUDA when nunchaku is installed.
nunchakuRepo=(variant.get("nunchakuRepo") or None),
nunchakuFile=(variant.get("nunchakuFile") or None),
# FU-024: opt-in FP8 layerwise casting. Threaded from the
# request rather than the catalog so users can experiment
# without the catalog committing to fp8 readiness per repo.
fp8LayerwiseCasting=request.fp8LayerwiseCasting,
)
)
created_at = datetime.utcnow().replace(microsecond=0).isoformat() + "Z"
@@ -425,6 +465,21 @@ def _generate_video_artifact(
request.steps,
)

# FU-019: variant-declared step / CFG defaults override schema
# defaults only when the user kept the schema defaults — explicit
# slider movement on the frontend is preserved untouched. The
# video schema default is steps=50 (see VideoGenerationRequest).
SCHEMA_DEFAULT_STEPS = 50
SCHEMA_DEFAULT_GUIDANCE = 3.0
effective_steps = request.steps
effective_guidance = request.guidance
variant_default_steps = variant.get("defaultSteps")
variant_cfg_override = variant.get("cfgOverride")
if variant_default_steps is not None and request.steps == SCHEMA_DEFAULT_STEPS:
    effective_steps = int(variant_default_steps)
if variant_cfg_override is not None and abs(request.guidance - SCHEMA_DEFAULT_GUIDANCE) < 1e-3:
    effective_guidance = float(variant_cfg_override)

video, runtime_status = runtime_manager.generate(
VideoGenerationConfig(
modelId=request.modelId,
@@ -436,8 +491,8 @@ def _generate_video_artifact(
height=request.height,
numFrames=request.numFrames,
fps=request.fps,
-steps=request.steps,
-guidance=request.guidance,
+steps=effective_steps,
+guidance=effective_guidance,
seed=request.seed,
ggufRepo=(variant.get("ggufRepo") or None),
ggufFile=(variant.get("ggufFile") or None),
@@ -447,6 +502,27 @@
enableLtxRefiner=request.enableLtxRefiner,
enhancePrompt=request.enhancePrompt,
cfgDecay=request.cfgDecay,
stgScale=request.stgScale,
previewVae=request.previewVae,
# FU-019: variant-declared LoRA + override metadata.
loraRepo=(variant.get("loraRepo") or None),
loraFile=(variant.get("loraFile") or None),
loraScale=(variant.get("loraScale") if variant.get("loraScale") is not None else None),
defaultSteps=(variant.get("defaultSteps") if variant.get("defaultSteps") is not None else None),
cfgOverride=(variant.get("cfgOverride") if variant.get("cfgOverride") is not None else None),
# Phase 3 / Wan2.2-Distill 4-step: catalog-pinned distilled
# transformers replace both Wan A14B experts at pipeline load.
distillTransformerRepo=(variant.get("distillTransformerRepo") or None),
distillTransformerHighNoiseFile=(variant.get("distillTransformerHighNoiseFile") or None),
distillTransformerLowNoiseFile=(variant.get("distillTransformerLowNoiseFile") or None),
distillTransformerPrecision=(variant.get("distillTransformerPrecision") or None),
# FU-023 / FU-024: catalog-pinned Nunchaku snapshot + opt-in
# FP8 layerwise casting (CUDA-only). Same shape as the image
# side so a future video-Nunchaku release lands without app
# plumbing churn.
nunchakuRepo=(variant.get("nunchakuRepo") or None),
nunchakuFile=(variant.get("nunchakuFile") or None),
fp8LayerwiseCasting=request.fp8LayerwiseCasting,
)
)

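Both FU-019 hunks above (image and video) apply the same precedence rule with different schema defaults. A condensed sketch of that rule follows; the resolve_effective helper is hypothetical, since app.py inlines the logic per request type, int-casting steps and float-casting guidance.

def resolve_effective(requested: float, schema_default: float,
                      variant_override: float | None,
                      tol: float = 1e-3) -> float:
    # A variant-declared default substitutes only while the request
    # still carries the schema default; explicit slider tweaks win.
    if variant_override is not None and abs(requested - schema_default) < tol:
        return float(variant_override)
    return float(requested)


# Image path (schema defaults: steps=24, guidance=5.5):
assert resolve_effective(24, 24, 8) == 8.0    # untouched slider: variant wins
assert resolve_effective(30, 24, 8) == 30.0   # user moved it: request wins
# Video path (schema defaults: steps=50, guidance=3.0):
assert resolve_effective(3.0, 3.0, 1.0) == 1.0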