displayport: skip unnecessary DSC for MST modes within link bandwidth#1025
displayport: skip unnecessary DSC for MST modes within link bandwidth#1025coleleavitt wants to merge 3 commits intoNVIDIA:mainfrom
Conversation
The MST mode validation path in compoundQueryAttachMST() unconditionally tries DSC when the device supports it, even when the link has sufficient bandwidth for the uncompressed mode. This can cause instability through MST hubs and USB-C docks that don't handle DSC negotiation well, manifesting as spurious HPD short pulses and DPCD AUX channel failures. Add a PBN pre-check before entering the DSC path: if the uncompressed mode fits within the available local link PBN, skip DSC and proceed directly to compoundQueryAttachMSTGeneric() for full validation. This mirrors the SST behavior in compoundQueryAttachSST() which only enables DSC when willLinkSupportModeSST() fails. The pre-check preserves all existing behavior for forced DSC, DSC_DUAL mode requests, and bandwidth-insufficient cases. Signed-off-by: Cole Leavitt <cole@unwrap.rs>
3291a61 to
f77ad61
Compare
kernel-open/Makefile
Outdated
| # tree, the kernel handles BTF generation natively; otherwise add PAHOLE and | ||
| # PAHOLE_AWK_PROGRAM assignments to PAHOLE_VARIABLES. | ||
| # Kernel 6.15+ uses Makefile.btf + gen-btf.sh instead of pahole-flags.sh. | ||
| PAHOLE_VARIABLES=$(if $(or $(wildcard $(KERNEL_SOURCES)/scripts/pahole-flags.sh),$(wildcard $(KERNEL_SOURCES)/scripts/Makefile.btf)),,"PAHOLE=$(AWK) '$(PAHOLE_AWK_PROGRAM)'") |
There was a problem hiding this comment.
I am worried this change breaks BTF generation support for kernels 6.7 and above. I also don't see any use of gen-btf.sh directly in Makefile.btf for kernels 6.15 and above. Can you describe what issue you ran into that made you author this commit? Also, this should be a separate PR if needed.
https://elixir.bootlin.com/linux/v6.15.11/source/scripts/Makefile.btf
Thanks in advance.
There was a problem hiding this comment.
You're right — this should be a separate PR and I agree it needs more careful validation against 6.7+.
I've split it out into #1038 and removed it from this PR. The BTF commit is no longer in this branch.
There was a problem hiding this comment.
I am worried this change breaks BTF generation support for kernels 6.7 and above. I also don't see any use of gen-btf.sh directly in Makefile.btf for kernels 6.15 and above. Can you describe what issue you ran into that made you author this commit? Also, this should be a separate PR if needed.
https://elixir.bootlin.com/linux/v6.15.11/source/scripts/Makefile.btf
Thanks in advance.
It's actually for 7.0 and above. See torvalds/linux@522397d
There was a problem hiding this comment.
Actually I don't understand this at all. It's supposed to exclude c++ from BTF generation but it doesn't do that if pahole-flags.sh is present. (that only excludes rust). Makefile.btf also seems to be a replacement for pahole-flags.sh. Is there something I'm missing here?
| // instability through MST hubs and USB-C docks, manifesting as | ||
| // spurious HPD short pulses and DPCD AUX channel failures during | ||
| // DSC capability negotiation. | ||
| // |
There was a problem hiding this comment.
From the related issue opened,
This causes instability through MST hubs and USB-C docks that don't handle DSC negotiation well, manifesting as spurious HPD short pulses, DPCD AUX channel failures, and intermittent display disconnection.
Monitor intermittently shows "input signal out of range" and disconnects
For some of these, I am not sure how DSC would be related. DSC shouldn't affect DP AUX or HPD. I suspect the "input signal out of range" seen on the monitor is likely due to the a link assessment failure that programs the wrong compression level supported by the sink and this change works around that programming error. Would you be willing to share a log collection using nvidia-bug-report.sh with nvidia_modeset.debug=1, so we can inspect the dplib logging and understand the failure with the display?
As for why we always opportunistically enable DSC in a MST configuration, that is done for a simple software management policy since in most MST setups, DSC will be required.
There was a problem hiding this comment.
Fair point on the DSC/AUX/HPD relationship — you may be right that the root cause is a link assessment failure programming the wrong compression level, and skipping DSC just sidesteps it. I'll collect a nvidia-bug-report.sh with nvidia_modeset.debug=1 and share it so we can inspect the dplib logging.
Regarding the opportunistic DSC policy for MST: understood. This change preserves that — it only skips DSC when the uncompressed mode passes the full compoundQueryAttachMSTGeneric validation (local link PBN, watermark, and per-device intermediate branch timeslots). If any check fails, it falls through to the DSC path as before.
Since the original submission, I've added two additional commits to fix bugs in the speculative pre-check:
-
Fall through to DSC on intermediate hop failure (
e91a9fb) — the original pre-check returned the generic result unconditionally, which meant if an intermediate branch had insufficient timeslots, DSC was never tried. Now it only returns on success; failure falls through to DSC. -
Save/restore compound query state (
223a169) — the speculativecompoundQueryAttachMSTGenericcall mutatescompoundQueryResultand per-devicecompound_query_state. On failure,compoundQueryResultwas poisoned (permanently false), and intermediate devices retained stale uncompressed timeslot allocations viabandwidthAllocatedForIndex. The DSC path would then either skip those devices or over-count their timeslots. Fixed by saving all state before the speculative call and restoring on failure. Also added an overflow guard — if the MST topology has more devices than the save array can hold, the speculative pre-check is skipped entirely.
The branch now has 3 clean DP commits and the BTF change has been moved to #1038.
|
@coderabbitai review |
|
@cubicdev review |
…rmediate hops The PBN pre-check only validates local (source) link bandwidth before skipping DSC. In MST daisy-chain or hub topologies, an intermediate branch link may have fewer timeslots than the local link. When compoundQueryAttachMSTGeneric fails at an intermediate branch, the mode is rejected without trying DSC — but DSC compression would reduce the PBN enough to fit through the bottleneck. Instead of returning the generic result directly, check if it succeeded: if so, return true (DSC unnecessary). If not, fall through to the DSC path which tries 10 bpp then 8 bpp max compression as fallback. Signed-off-by: Cole Leavitt <cole@unwrap.rs>
f152fa2 to
e91a9fb
Compare
… MST pre-check The speculative uncompressed pre-check calls compoundQueryAttachMSTGeneric to determine if DSC can be skipped. However, this function mutates shared state that the subsequent DSC path depends on: 1. compoundQueryResult is set to false on failure but never reset, poisoning all subsequent compoundQueryAttachMSTGeneric calls which unconditionally return the (now-false) flag. 2. Per-device compound_query_state (timeslots_used_by_query and bandwidthAllocatedForIndex) is only partially rolled back on failure. Intermediate branch devices between the source and the bottleneck retain their uncompressed timeslot allocations. When the DSC path calls compoundQueryAttachMSTGeneric again, these devices are skipped via the bandwidthAllocatedForIndex bitmask, keeping incorrect (too-large) uncompressed allocations instead of the correct (smaller) compressed ones. Fix this by saving all relevant state before the speculative call and restoring it on failure, so the DSC path always starts from a clean slate. Also clear pErrorCode on rollback since the speculative failure is not a real error.
3864300 to
223a169
Compare
Bug Report Log (nvidia_modeset.debug=1)Attached System
DP Link Status
Notes
|
|
nvidia-bug-report.log.gz (base64-encoded, 1.7MB): https://gist.github.com/coleleavitt/2fe873fd3dfaaedbac5bc21b37929bff To decode: |
Summary
Add a PBN pre-check in
compoundQueryAttachMST()to skip unnecessary DSC compression when the link has sufficient bandwidth for the uncompressed mode.Problem
The MST mode validation path unconditionally tries DSC when the device supports it, even when the link has sufficient bandwidth. This causes instability through MST hubs and USB-C docks that don't handle DSC negotiation well, manifesting as:
The SST path (
compoundQueryAttachSST) already handles this correctly by checkingwillLinkSupportModeSST()before enabling DSC.Solution
Before entering the DSC path, calculate the PBN required for the uncompressed mode and check if it fits within available local link PBN. If it does, skip DSC and proceed directly to
compoundQueryAttachMSTGeneric()for full validation.This mirrors the SST behavior without introducing state management issues (the PBN pre-check is side-effect-free).
Behavior Preserved
DSC_FORCE_ENABLE): Still enabledTesting
Tested on:
Before: Monitor intermittently disconnects every few seconds with DSC enabled at 10 bpp
After: Stable connection without DSC (mode fits in DP 1.4 HBR3 uncompressed)
Related Issues
Fixes #1024
May help with: