Indexed multi-material batching for GLTF PBR and legacy materials#322
Indexed multi-material batching for GLTF PBR and legacy materials#322RyeMutt wants to merge 9 commits into
Conversation
Extend the per-vertex texture-index batching (previously diffuse-only) to GLTF
PBR materials, so one draw call can cover multiple materials selected per-vertex
by the texture_index attribute. Static opaque faces only for now.
- sIndexedGLTFChannels = clamp(texUnits/4, 1, 8); zeroed if the indexed shader
fails to load so everything degrades to the scalar path.
- genDrawInfo gains a batch_gltf accumulation branch that dedups up to N distinct
materials into per-face slots; registerFace builds mGLTFMaterialList parallel
to mTextureList (bypassing the per-material-id batch break for indexed faces).
- Indexed draw infos stay in PASS_GLTF_PBR (flagged by mGLTFMaterialList.size()>1)
rather than a new pass, so shadow/probe sweeps still cover them (depth-only) and
only the main opaque GBuffer pass splits scalar vs indexed.
- New pbropaqueIndexed{V,F}.glsl: slot s binds its four maps to units
[s,N+s,2N+s,3N+s]; per-slot KHR transforms (VS arrays) and scalar factors (FS
arrays); reuses position.w for the slot, no new vertex attribute.
- RenderGLTFPBRBatching setting (default on) for runtime A/B comparison.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend indexed batching to GLTF alpha-mask materials alongside opaque.
- can_batch_gltf_material now accepts opaque and mask (blend still excluded); the
accumulation keeps each batch to a single alpha mode so opaque/mask faces land
in their respective passes (PASS_GLTF_PBR / PASS_GLTF_PBR_ALPHA_MASK).
- The main GBuffer scalar/indexed split now covers the alpha-mask pass too, reusing
the opaque indexed program -- the per-slot gltf_minimum_alpha array drives the
mask discard (-1 == opaque).
- Alpha-mask faces alpha-test per-face in the shadow map, so a scalar slot-0
fallback would corrupt batched cutouts. Add gDeferredShadowGLTFAlphaMaskIndexed
program + pbrShadowAlphaMaskIndexed{V,F}.glsl (base-color only, reuses
pushGLTFBatchIndexed) and split the sun-shadow alpha-mask sweep accordingly.
- Register both indexed programs in the shader unload list; the shadow split is
gated on isComplete() so a failed indexed shadow shader degrades to scalar.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend indexed batching to rigged (skinned) meshes -- a multi-material rigged mesh now draws in one indexed call per avatar+skin instead of one per material. - Add HAS_SKIN matrix-palette paths to pbropaqueIndexedV and pbrShadowAlphaMaskIndexedV. - Build rigged variants of both indexed programs via make_rigged_variant; a shared setup_gltf_indexed_samplers() maps each variant's per-slot samplers to units. - Accumulation now runs for rigged face sets too, breaking each batch on avatar / skin-hash change (one matrix palette per skin). - Add pushRiggedGLTFBatchesScalar/Indexed + pushRiggedGLTFBatchIndexed (upload the matrix palette, then the indexed draw); wire the rigged sweep into the opaque pool and the sun-shadow alpha-mask pass for both static and rigged. - Register the skinned indexed variants in the unload list. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Record materials-per-indexed-batch via LL_PROFILE_ZONE_NUM so the draw-call reduction is visible in Tracy. - Add a base_color_only fast path to pushGLTFBatchIndexed: the shadow alpha-mask pass samples only base color, so skip the normal/ORM/emissive binds and their uniform uploads there (~4x fewer texture binds per indexed shadow batch). Threaded through pushGLTFBatchesIndexed / pushRiggedGLTFBatchesIndexed; the main GBuffer pass still binds the full set. Note: true texture-unit reclaiming (sharing default textures across slots) is not possible with the fixed per-slot sampler switch; it needs bindless / array textures, which remains the documented future direction. NVIDIA if-chain selection was already in place from phase 1. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend indexed batching to the legacy Blinn-Phong POOL_MATERIALS path (GBuffer
write), so multi-material legacy meshes draw in one call per shader mask. Static
opaque + alpha-mask, all normal/spec map combinations. Rigged and shadow follow.
- New materialIndexed{V,F}.glsl: GBuffer-write only (no forward/atmospherics), N-slot
diffuse/normal/spec lookups + per-slot scalar arrays (specular color/glossiness,
env intensity, min-alpha, emissive). Per-map texture transforms are baked into the
texcoords at buffer build, so no per-slot matrices are needed.
- gDeferredMaterialIndexedProgram[] parallels gDeferredMaterialProgram for the
non-blend (GBuffer) masks; LLGLSLShader::sIndexedLegacyMaterials gates the feature
(independent of PBR so a material-shader failure can't disable it).
- LLDrawInfo::mMaterialSlotList holds the per-slot data; genDrawInfo accumulates
distinct (diffuse, material) pairs, breaking each batch on shader-mask change (a
different mask is a different program); registerFace builds the slot list parallel
to the GLTF path.
- LLDrawPoolMaterials::renderDeferred skips multi-material infos in the scalar loop
and draws them via pushMaterialBatchIndexed (lazy-bound indexed program).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-ups to the 5b+5c commit: - Drop an unused local in pushMaskBatchesIndexed (would trip -Wunused-variable). - Gate the indexed legacy shadow program on sIndexedLegacyMaterials and disable legacy batching entirely if it fails to load -- otherwise indexed mask batches would form but cast no shadow (skipped by the scalar pass, no indexed sweep). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend indexed multi-material batching to the glow/emissive passes.
Multi-material batches whose faces have glow>0 were registered into the
glow passes as multi-material draw infos (registerFace keys its indexed
paths on slot index, not pass type) but rendered scalar with slot 0's
emissive/diffuse, giving the wrong glow on non-slot-0 faces.
PBR glow (PASS_GLTF_GLOW +rigged): split the sweep in
LLDrawPoolGLTFPBR::renderPostDeferred into scalar (skips multi-material)
and indexed under new gPBRGlowIndexedProgram(+Skinned), reusing
pushGLTFBatchIndexed on the shared GBuffer per-slot unit layout (base
color s, emissive 3N+s).
Legacy glow (PASS_GLOW +rigged): new pushEmissiveBatchesScalar /
pushEmissiveBatchesIndexed; indexed binds each slot's diffuse to unit s
under new gDeferredEmissiveIndexedProgram(+Skinned). Plain texture-index
glow still goes through the stock diffuseLookup program.
New shaders pbrglowIndexed{V,F}.glsl, emissiveIndexed{V,F}.glsl. Both
new programs are gated and kept out of the success chain: on load
failure the pass falls back to the pre-batching scalar behavior, never
disabling the core feature.
pushGLTFBatchIndexed's bool base_color_only becomes an eGLTFIndexedMaps
enum {FULL, BASE_COLOR, GLOW}: the glow pass binds base color + emissive
only, skipping the normal/ORM binds, the normal discard-level fetch and
the roughness/metallic/normal/MR uniform uploads it never samples.
Robustness: a PBR-opaque-indexed shader load failure (which zeroes
sIndexedGLTFChannels) now also unloads the legacy material indexed
programs and clears sIndexedLegacyMaterials, preserving the invariant
sIndexedLegacyMaterials => sIndexedGLTFChannels >= 2 so the whole indexed
feature (PBR, legacy, glow, shadows) disables together.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 23 minutes and 27 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (7)
📝 WalkthroughWalkthroughAdds indexed multi-material draw batching for GLTF PBR and legacy Blinn-Phong materials in the deferred renderer. New ChangesIndexed Multi-Material Batching
Sequence Diagram(s)sequenceDiagram
participant LLPipeline
participant LLVolumeGeometryManager
participant LLDrawInfo
participant LLViewerShaderMgr
participant LLRenderPass
rect rgba(100, 149, 237, 0.5)
Note over LLVolumeGeometryManager,LLDrawInfo: Geometry Build Phase
LLVolumeGeometryManager->>LLVolumeGeometryManager: can_batch_gltf_material / can_batch_legacy_material
LLVolumeGeometryManager->>LLDrawInfo: setTextureIndex(slot) per face
LLVolumeGeometryManager->>LLDrawInfo: populate mGLTFMaterialList or mMaterialSlotList
end
rect rgba(60, 179, 113, 0.5)
Note over LLViewerShaderMgr: Shader Load Phase
LLViewerShaderMgr->>LLViewerShaderMgr: compute sIndexedGLTFChannels = clamp(units/4, 1, 8)
LLViewerShaderMgr->>LLViewerShaderMgr: load gDeferredPBROpaqueIndexedProgram
LLViewerShaderMgr->>LLViewerShaderMgr: map per-slot samplers
LLViewerShaderMgr->>LLGLSLShader: sIndexedLegacyMaterials = true
end
rect rgba(210, 105, 30, 0.5)
Note over LLPipeline,LLRenderPass: Render Phase
LLPipeline->>LLRenderPass: pushGLTFBatchesScalar (skip multi-slot)
LLPipeline->>LLRenderPass: bind gDeferredPBROpaqueIndexedProgram
LLPipeline->>LLRenderPass: pushGLTFBatchesIndexed
LLRenderPass->>LLRenderPass: pushGLTFBatchIndexed (bind per-slot textures + uniforms, draw)
LLPipeline->>LLRenderPass: pushMaskBatchesIndexed (shadow pass)
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
indra/newview/llvovolume.cpp (1)
6343-6350:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEnable legacy indexing for the no-map material bucket.
Line 6234 sends legacy materials with neither normal nor specular maps to
sSimpleFaces, but only the normal/spec/normspec buckets receivebatch_legacy=true. Those validPASS_MATERIAL/ mask candidates stay scalar even when indexed legacy programs are available; pass the flag through and letcan_batch_legacy_material()filter non-material simple faces.⚡ Proposed fix
- geometryBytes += genDrawInfo(group, simple_mask | extra_mask, sSimpleFaces[i], simple_count[i], false, batch_textures, rigged); + geometryBytes += genDrawInfo(group, simple_mask | extra_mask, sSimpleFaces[i], simple_count[i], false, batch_textures, rigged, false, legacy_batch_enabled); geometryBytes += genDrawInfo(group, fullbright_mask | extra_mask, sFullbrightFaces[i], fullbright_count[i], false, batch_textures, rigged); geometryBytes += genDrawInfo(group, alpha_mask | extra_mask, sAlphaFaces[i], alpha_count[i], alpha_sort, batch_textures, rigged); geometryBytes += genDrawInfo(group, bump_mask | extra_mask, sBumpFaces[i], bump_count[i], false, false, rigged);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indra/newview/llvovolume.cpp` around lines 6343 - 6350, The simple_mask genDrawInfo call for sSimpleFaces is missing the legacy_batch_enabled parameter that should be passed to enable legacy indexing for the no-map material bucket. Update the first genDrawInfo call (simple_mask) to include legacy_batch_enabled as the final parameter, matching the pattern used in the norm_mask, spec_mask, and normspec_mask calls just below it, so that valid PASS_MATERIAL candidates can be properly indexed when legacy programs are available.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@indra/newview/app_settings/settings_alchemy.xml`:
- Line 1241: The description string in the setting at line 1241 incorrectly
states that batching applies to "static opaque only," but the toggle now gates
broader indexed batching paths including rigged materials, alpha-masked
materials, and legacy indexed eligibility. Update the string description to
accurately reflect the broader scope of batching that this setting controls,
removing the misleading "static opaque only" limitation.
In `@indra/newview/lldrawpool.cpp`:
- Around line 1015-1033: The code uses fixed-size 8-slot arrays (roughness,
metallic, min_alpha, emissive, bc_xform, nm_xform, mr_xform, em_xform) without
validating that the input size respects this limit. Add guard assertions or
checks before array writes across four locations to ensure no buffer overruns.
In indra/newview/lldrawpool.cpp lines 1015-1033, validate that n (from
mGLTFMaterialList.size()) does not exceed both 8 and
LLGLSLShader::sIndexedGLTFChannels before the main loop. In
indra/newview/lldrawpool.cpp lines 593-607, validate the mask slot count before
writing to the min_alpha[8] array. In indra/newview/lldrawpool.cpp lines
673-680, validate the emissive slot count before binding slot texture units. In
indra/newview/lldrawpoolmaterials.cpp lines 135-153, validate the material slot
count (from mMaterialSlotList.size()) before writing per-slot arrays and binding
texture unit offsets.
In `@indra/newview/lldrawpoolpbropaque.cpp`:
- Around line 61-92: The indexed flag is being used to skip scalar
multi-material rendering before verifying that all indexed program variants are
complete, which can cause incomplete indexed programs to be used. In
indra/newview/lldrawpoolpbropaque.cpp L61-L92, add completeness checks for both
gDeferredPBROpaqueIndexedProgram.isComplete() and its rigged variant before the
indexed conditional that calls pushGLTFBatchesScalar and
pushRiggedGLTFBatchesScalar. In indra/newview/lldrawpoolpbropaque.cpp L114-L143,
add the gPBRGlowIndexedProgram rigged variant completeness check before the
indexed conditional that routes rigged glow. In
indra/newview/lldrawpoolsimple.cpp L63-L96, add the
gDeferredEmissiveIndexedProgram rigged variant completeness check before setting
glow_indexed for both static and rigged legacy glow paths. Ensure all indexed
program completeness is verified before allowing scalar sweeps to skip
multi-material infos.
In `@indra/newview/llviewershadermgr.cpp`:
- Around line 1406-1467: The static flag LLGLSLShader::sIndexedLegacyMaterials
can retain a true value from a previous successful shader load, causing a stale
state when reloading. If the outer condition (LLGLSLShader::sIndexedGLTFChannels
>= 2) is false or material_indexed_ok fails, the flag remains true while the
indexed material programs are unloaded or never compiled, creating an
inconsistent state. Reset LLGLSLShader::sIndexedLegacyMaterials to false before
the material_indexed_ok block starts (at the beginning of the outer if
statement) to ensure a clean state for each load attempt.
In `@indra/newview/llvovolume.cpp`:
- Around line 5558-5567: The legacy slot key only includes material and diffuse
texture, but the slot data includes mSpecColor and mEnvIntensity which are
derived from the TE shiny value (via spec_lut[shiny & TEM_SHINY_MASK]). This
causes faces with the same material and diffuse but different shiny values to
reuse the same slot, overwriting earlier vertex data. Include the TE shiny value
in the legacy slot key generation at both affected locations (around lines
5558-5567 and 6742-6761) to ensure each unique combination of material, diffuse,
and shiny gets its own slot.
- Around line 5526-5533: The scalar rendering path needs to reset the texture
index sentinel before registerFace infers indexed mode. Currently, the code
checks if index is less than FACE_DO_NOT_BATCH_TEXTURES to determine indexed
batching (in both the gltf_indexed and legacy_indexed boolean assignments), but
scalar faces from a prior indexed rebuild may retain a material slot. Clear the
sentinel value for scalar faces before these index comparisons occur in
registerFace to ensure that scalar rendering properly degrades instead of
incorrectly populating indexed material lists when RenderGLTFPBRBatching is
disabled or during shader fallback.
---
Outside diff comments:
In `@indra/newview/llvovolume.cpp`:
- Around line 6343-6350: The simple_mask genDrawInfo call for sSimpleFaces is
missing the legacy_batch_enabled parameter that should be passed to enable
legacy indexing for the no-map material bucket. Update the first genDrawInfo
call (simple_mask) to include legacy_batch_enabled as the final parameter,
matching the pattern used in the norm_mask, spec_mask, and normspec_mask calls
just below it, so that valid PASS_MATERIAL candidates can be properly indexed
when legacy programs are available.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: be82a551-51c5-405b-a4b9-66bcfada856e
⛔ Files ignored due to path filters (12)
indra/newview/app_settings/shaders/class1/deferred/emissiveIndexedF.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/emissiveIndexedV.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/materialIndexedF.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/materialIndexedV.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/materialShadowIndexedF.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/materialShadowIndexedV.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/pbrShadowAlphaMaskIndexedF.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/pbrShadowAlphaMaskIndexedV.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/pbrglowIndexedF.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/pbrglowIndexedV.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/pbropaqueIndexedF.glslis excluded by!**/*.glslindra/newview/app_settings/shaders/class1/deferred/pbropaqueIndexedV.glslis excluded by!**/*.glsl
📒 Files selected for processing (13)
indra/llrender/llglslshader.cppindra/llrender/llglslshader.hindra/newview/app_settings/settings_alchemy.xmlindra/newview/lldrawpool.cppindra/newview/lldrawpool.hindra/newview/lldrawpoolmaterials.cppindra/newview/lldrawpoolpbropaque.cppindra/newview/lldrawpoolsimple.cppindra/newview/llspatialpartition.hindra/newview/llviewershadermgr.cppindra/newview/llviewershadermgr.hindra/newview/llvovolume.cppindra/newview/pipeline.cpp
CodeRabbit findings on the indexed multi-material batching: - Reset the scalar anchor face index in genDrawInfo before registerFace, so a stale material slot from a prior indexed rebuild cannot make a scalar face be treated as indexed (e.g. after toggling RenderGLTFPBRBatching off or shader fallback). - Include TE shiny in the legacy slot dedup key: spec color/env derive from shiny when there is no specular map, so faces sharing (material, diffuse) but differing in shiny must not share a slot. - Clamp/assert per-batch slot counts (n <= sIndexedGLTFChannels <= 8) before writing the fixed 8-slot arrays or binding slot texture units in the PBR/legacy/glow/material indexed push helpers. - Gate the scalar-skip + indexed dispatch on completeness of BOTH the indexed program and its rigged variant (PBR deferred, PBR glow, legacy glow); otherwise fall back to full scalar. - Reset sIndexedLegacyMaterials to false before the optional legacy-indexed load block so a stale true from a prior load cannot survive a skipped/failed reload. - Update the RenderGLTFPBRBatching setting description to reflect the full scope (PBR + legacy; opaque/mask/rigged/shadow/glow). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Description
Extends the viewer's per-vertex texture-index batching — previously diffuse-only — so a single draw call can cover multiple materials, for both GLTF PBR and legacy Blinn-Phong materials, across the opaque, alpha-mask, rigged, shadow, and glow/emissive passes. Multi-material objects and meshes that previously emitted one draw call per material now batch into one, cutting draw-call count and CPU-side bind/setup overhead.
Mechanism. Each vertex carries a material slot index (reusing the existing
texture_indexpacked inposition.w). Hand-written indexed shaders select that slot's samplers via an if-chain gated on aGLTF_INDEXED_CHANNELSdefine (if-chain, notswitch, for driver portability). Indexed draw infos coexist with scalar ones in the existing render passes — flagged bymGLTFMaterialList.size() > 1(PBR) /mMaterialSlotList.size() > 1(legacy) — so no new pass types are introduced and shadow/probe/cull coverage is unchanged. Per-slot material maps bind to texture units[s, N+s, 2N+s, 3N+s]; per-slot scalars/transforms arrive as uniform arrays indexed by the slot.Scope, by phase (one commit each):
PASS_GLTF_GLOW+ legacyPASS_GLOW, static + rigged; plus aneGLTFIndexedMapsfast path (the glow sweep binds base-color + emissive only, skipping the normal/ORM binds and the normal-map discard-level fetch) and a shader-load robustness guardGraceful degradation. Every indexed shader program is gated and kept out of the main shader-load
successchain. If any fails to compile, that path falls back to the existing scalar rendering — never a crash, never disabling the core viewer. A PBR-opaque-indexed load failure additionally tears down the legacy indexed programs so the whole feature disables together (invariant:sIndexedLegacyMaterials => sIndexedGLTFChannels >= 2). Runtime A/B toggle:RenderGLTFPBRBatching(default on).Related Issues
No tracking issue — internal rendering performance work.
Issue Link: n/a
Checklist
Additional Notes
Verification status (runtime, visual A/B via
RenderGLTFPBRBatching):Reflection probes: verified to already capture indexed batches correctly — probe capture (
display_cube_face→renderGeomDeferred→ poolrenderDeferred) shares the main GBuffer path and the same scalar/indexed split, so per-slot materials and cutouts are correct in probes with no probe-specific code.Known limitation: a batch is capped at 8 materials by the fixed per-slot sampler switch; lifting that (and sharing units across slots) would require bindless or array textures — a separate, larger change. Documented as future work.
Reviewer notes: the indexed shaders mirror their scalar counterparts (
pbropaqueIndexed*,materialIndexed*,pbrShadowAlphaMaskIndexed*,materialShadowIndexed*,pbrglowIndexed*,emissiveIndexed*). The shadow sweep for batched objects stays scalar by design except for alpha-mask, which has dedicated indexed programs so per-face cutouts cast correct shadows.