FROMLIST: Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling#577

Open
shuaz-shuai wants to merge 1 commit into qualcomm-linux:qcom-6.18.y from shuaz-shuai:wake_ssr_timer

@shuaz-shuai

When a Bluetooth controller encounters a coredump, it triggers the Subsystem Restart (SSR) mechanism. The controller first reports the coredump data and, once the upload is complete, sends a hw_error event. The host relies on this event to proceed with subsequent recovery actions.

If the host has not finished processing the coredump data when the hw_error event is received, it waits until either the processing is complete or the 8-second timeout expires before handling the event.

The current implementation clears QCA_MEMDUMP_COLLECTION using clear_bit(), which does not wake up waiters sleeping in wait_on_bit_timeout(). As a result, the waiting thread may remain blocked until the timeout expires even if the coredump collection has already completed.

Fix this by clearing QCA_MEMDUMP_COLLECTION with clear_and_wake_up_bit(), which also wakes up the waiting thread and allows the hw_error handling to proceed immediately.
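
The difference between the two clearing styles can be sketched in userspace. This is only an analogy built on pthreads, not the driver code: `struct dump_state`, `waiter()`, `waiter_timed_out()` and the 300 ms `TIMEOUT_MS` are hypothetical names and values chosen for illustration, the `collecting` flag stands in for the QCA_MEMDUMP_COLLECTION bit, and the condition variable stands in for the kernel's bit wait queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <time.h>

#define TIMEOUT_MS 300L /* hypothetical; the driver's timeout is 8 seconds */

/* Userspace stand-in for the flag bit and its wait queue (illustrative only). */
struct dump_state {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool collecting; /* "bit set" while the dump is being collected */
	bool waiting;    /* waiter is parked on the condition variable */
	bool timed_out;  /* waiter slept until the deadline */
};

/* Analog of wait_on_bit_timeout(): sleep until woken or until the deadline. */
static void *waiter(void *arg)
{
	struct dump_state *s = arg;
	struct timespec deadline;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_nsec += TIMEOUT_MS * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec += 1;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&s->lock);
	s->waiting = true;
	while (s->collecting) {
		/* The flag is only rechecked after a wakeup or a timeout. */
		if (pthread_cond_timedwait(&s->cond, &s->lock, &deadline) == ETIMEDOUT) {
			s->timed_out = true;
			break;
		}
	}
	pthread_mutex_unlock(&s->lock);
	return NULL;
}

/* Returns true if the waiter slept until the timeout, false if it was woken. */
static bool waiter_timed_out(bool wake)
{
	struct dump_state s;
	pthread_t t;
	int rc;

	pthread_mutex_init(&s.lock, NULL);
	pthread_cond_init(&s.cond, NULL);
	s.collecting = true;
	s.waiting = false;
	s.timed_out = false;

	rc = pthread_create(&t, NULL, waiter, &s);
	assert(rc == 0);
	(void)rc;

	/* Wait until the waiter is parked, so the clear cannot race ahead of it. */
	for (;;) {
		pthread_mutex_lock(&s.lock);
		bool parked = s.waiting;
		pthread_mutex_unlock(&s.lock);
		if (parked)
			break;
		sched_yield();
	}

	pthread_mutex_lock(&s.lock);
	s.collecting = false;                    /* analog of clear_bit() */
	if (wake)
		pthread_cond_broadcast(&s.cond); /* the extra wakeup step */
	pthread_mutex_unlock(&s.lock);

	pthread_join(t, NULL);
	pthread_cond_destroy(&s.cond);
	pthread_mutex_destroy(&s.lock);
	return s.timed_out;
}
```

Built with `cc -pthread`, `waiter_timed_out(false)` is expected to return true only after roughly TIMEOUT_MS has elapsed, while `waiter_timed_out(true)` is expected to return false almost immediately (barring spurious wakeups, which pthreads permits). That mirrors the behavior described above: clearing the flag alone leaves the sleeper blocked until its timeout.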

Test case:

  • Trigger a controller coredump using: hcitool cmd 0x3f 0c 26
  • Tested on QCA6390.
  • Capture HCI logs using btmon.
  • Verify that the delay between receiving the hw_error event and initiating the power-off sequence is reduced compared to the timeout-based behavior.

Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/

CRs-Fixed: 4498534

@shuaz-shuai shuaz-shuai requested review from a team, jingyiwang42, ndechesne and yijiyang May 13, 2026 02:56

@shashim-quic shashim-quic left a comment

Prefix FROMLIST in subject.

@shuaz-shuai shuaz-shuai changed the title from "Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling" to "FROMLIST: Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling" May 19, 2026
…andling

When a Bluetooth controller encounters a coredump, it triggers the
Subsystem Restart (SSR) mechanism. The controller first reports the
coredump data and, once the upload is complete, sends a hw_error
event. The host relies on this event to proceed with subsequent
recovery actions.

If the host has not finished processing the coredump data when the
hw_error event is received, it waits until either the processing is
complete or the 8-second timeout expires before handling the event.

The current implementation clears QCA_MEMDUMP_COLLECTION using
clear_bit(), which does not wake up waiters sleeping in
wait_on_bit_timeout(). As a result, the waiting thread may remain
blocked until the timeout expires even if the coredump collection
has already completed.

Fix this by clearing QCA_MEMDUMP_COLLECTION with
clear_and_wake_up_bit(), which also wakes up the waiting thread and
allows the hw_error handling to proceed immediately.

Test case:
- Trigger a controller coredump using:
    hcitool cmd 0x3f 0c 26
- Tested on QCA6390.
- Capture HCI logs using btmon.
- Verify that the delay between receiving the hw_error event and
  initiating the power-off sequence is reduced compared to the
  timeout-based behavior.

Link: https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/stable/20251107033924.3707495-2-quic_shuaz%40quicinc.com
Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com>
@qcomlnxci qcomlnxci requested a review from a team May 19, 2026 02:07
@shuaz-shuai shuaz-shuai requested a review from shashim-quic May 19, 2026 02:18
@knaveen-qc

PR #577 — validate-patch

PR: #577

Verdict: ⚠️
Issues: 0
Detailed Report: Full report

Patch Validation Report

PR: PR #577, FROMLIST: Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling
Upstream: https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/
Verdict: ⚠️ PARTIAL


Commit Message

| Check | Status | Note |
|---|---|---|
| Subject matches upstream | | FROMLIST: Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling; well-formed prefix + subsystem path |
| Body preserves rationale | | Clearly explains clear_bit() not waking waiters in wait_on_bit_timeout(), the fix via clear_and_wake_up_bit(), and includes a test case |
| Fixes: tag present/correct | ⚠️ | Missing. This is a bug fix; a Fixes: <sha> ("Bluetooth: hci_qca: ...") tag referencing the commit that introduced clear_bit(QCA_MEMDUMP_COLLECTION, ...) should be present |
| Authorship preserved | | FROMLIST: rule: submitter and lore author are the same person (Shuai Zhang <shuai.zhang@oss.qualcomm.com>); Signed-off-by: is present |
| Backport note | N/A | Not a BACKPORT: commit |
| Co-developed-by misuse | | Not present; no issue |
| Second Link: tag | ⚠️ | Points to an older stable-tree series (20251107033924.3707495-2-quic_shuaz@quicinc.com, Nov 2025) from a different sender email (quic_shuaz@quicinc.com). This is patch 2/N of a different series, and its relationship to the current patch is not documented in the commit message. If this is a prior stable submission by the same author, it should be noted (e.g., Link: <url> # earlier stable submission); if it is unrelated, it should be removed |

Diff

| File | Status | Notes |
|---|---|---|
| drivers/bluetooth/hci_qca.c | | Two identical, surgical substitutions: clear_bit(QCA_MEMDUMP_COLLECTION, &qca->flags) → clear_and_wake_up_bit(QCA_MEMDUMP_COLLECTION, &qca->flags) at lines 1105 and 1183. Change is minimal and consistent with the stated fix |

Upstream Patch Status

| Commit | Community Verdict |
|---|---|
| Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling | ⏳ Decision Pending: network unavailable; could not fetch lore thread to verify ACK/NAK/merge status. The primary lore link (20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com) is dated 2026-04-10 and the Reviewed-by: tags from Bartosz Golaszewski and Paul Menzel are positive signals, but no merge confirmation is available |

Dependency Check

  • ✅ Single-patch series (message-id suffix -1-); no Depends-on: or prerequisite series mentioned
  • ✅ Only drivers/bluetooth/hci_qca.c is touched; no header changes required for this substitution

qcom-next Presence

| Commit | Status |
|---|---|
| FROMLIST: Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling | ⏭️ Skipped: no git tooling or network access available; verify manually with git log origin/qcom-next --oneline --grep="Fix missing wakeup during SSR memdump handling" |

Issues Found

  1. Missing Fixes: tag — The commit fixes a regression introduced when clear_bit(QCA_MEMDUMP_COLLECTION, ...) was added. A Fixes: <sha> ("Bluetooth: hci_qca: ...") tag is expected for upstream acceptance and stable-tree backport tracking. Add it by identifying the commit that introduced the clear_bit(QCA_MEMDUMP_COLLECTION, &qca->flags) call.

  2. Ambiguous second Link: tag. Link: https://lore.kernel.org/stable/20251107033924.3707495-2-quic_shuaz%40quicinc.com references a Nov 2025 stable-tree patch (patch 2 of a series) sent from a different email address (quic_shuaz@quicinc.com). Its relationship to the current FROMLIST: submission is unexplained. Either:

    • Document it explicitly (e.g., add a comment like # prior stable submission) if it is a related earlier attempt, or
    • Remove it if it is not directly related to this patch.

Recommendation

The diff is clean and the fix is technically sound — two minimal, correct substitutions of clear_bit() with clear_and_wake_up_bit() in qca_controller_memdump(). Request two changes before merging: (1) add a Fixes: tag identifying the commit that introduced the clear_bit(QCA_MEMDUMP_COLLECTION, ...) calls, and (2) clarify or remove the second Link: tag pointing to the older stable-tree series from a different sender email.


Final Summary

  1. Lore link present: Yes — https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/
  2. Lore link matches PR commits: Likely yes — the primary lore message-id date (2026-04-10) and author match the PR commit exactly; diff content is internally consistent with the stated fix; upstream patch could not be fetched to confirm byte-for-byte identity (network unavailable)
  3. Upstream patch status: ⏳ Decision Pending — two Reviewed-by: tags present (positive signal) but merge into mainline/stable could not be confirmed; patch is dated 2026-04-10 and may still be under review
  4. PR present in qcom-next: ⏭️ Skipped — no git or network access; verify manually

@knaveen-qc

PR #577 — checker-log-analyzer

PR: #577
Checker run: https://github.com/qualcomm-linux/kernel-config/actions/runs/26071785371

| Checker | Result | Summary |
|---|---|---|
| checkpatch | | 0 errors, 0 warnings, 0 checks |
| dt-binding-check | ⏭️ | No changes in Documentation/devicetree/bindings |
| dtb-check | ⏭️ | No changes in arch/arm64/boot/dts/ |
| sparse-check | | Passed (DTB warnings in log are pre-existing, not from this PR) |
| check-uapi-headers | | Passed |
| check-patch-compliance | ❌ | b4 fetch failed for the primary Link: URL |
| tag-check | | Subject starts with FROMLIST:, a valid prefix for qcom-6.18.y |
| qcom-next-check | N/A | Target branch is qcom-6.18.y, not qcom-next |

Detailed report: Full report

🤖 CI Checker Analysis (checker-log-analyzer)

PR: FROMLIST: Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling (#577)
Source: https://github.com/qualcomm-linux/kernel-config/actions/runs/26071785371

❌ check-patch-compliance

Root cause: The checker's b4 am fetch of the primary Link: URL (https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/) failed or returned a result that did not match the committed diff, triggering "Something seems wrong with the provided link."

Failure details:

Checking commit: FROMLIST: Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling
Something seems wrong with the provided link. Please verify it
Try below command to run locally-
b4 am --single-message -C -l -3 https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/
https://lore.kernel.org/stable/20251107033924.3707495-2-quic_shuaz%40quicinc.com

The commit has two Link: tags:

  • Link: https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/ ← primary (checked by compliance script)
  • Link: https://lore.kernel.org/stable/20251107033924.3707495-2-quic_shuaz%40quicinc.com ← secondary (stable backport reference)

The compliance checker uses the first Link: to fetch the upstream patch via b4 and compare it to the committed diff. The failure means either:

  1. The lore URL is not yet indexed / the message-ID is wrong, or
  2. The fetched upstream patch content differs from what was committed (e.g. local adaptations were made without being documented).

Fix:

  1. Verify the lore URL is reachable and correct:

    b4 am --single-message -C -l -3 \
      https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/ \
      -o /tmp/out
    • If b4 fails to fetch → the message-ID may be wrong or not yet indexed. Find the correct lore URL for this patch and update the Link: tag.
    • If b4 succeeds → compare the fetched diff to the committed diff:
      diff <(git format-patch -1 2c211b42815700b641ccba3f32d2eeec9d4ac360 --stdout \
               | awk '/^diff/,/^--$/' | grep -E '^[+-][^+-]') \
           <(awk '/^diff/,/^--$/' /tmp/out/*.mbx | grep -E '^[+-][^+-]')
  2. If the diff differs due to local adaptations (e.g. context changes for qcom-6.18.y), document the delta in the commit message body and ensure the Link: points to the exact upstream message that is the closest ancestor.

  3. If the URL is simply wrong, update it:

    git rebase -i <base_sha>   # mark commit as 'edit'
    git commit --amend          # fix the Link: line
    git rebase --continue

Reproduce locally:

b4 am --single-message -C -l -3 \
  https://lore.kernel.org/all/20260410095443.4167332-1-shuai.zhang@oss.qualcomm.com/ \
  -o /tmp/out

Verdict

1 blocker to fix before merge: resolve the check-patch-compliance failure by verifying the Link: URL is correct and that the committed diff matches the upstream patch fetched via b4. All other checkers pass or were legitimately skipped.
