Pi 5 hard hangs during simultaneous USB SSD read + any concurrent write (kernel 6.12.75) #7356

@brohoya

Describe the bug

The Pi 5 hard-locks within seconds when a sustained read from a USB SSD coincides with any concurrent userspace write activity, regardless of destination. The hang is severe enough that the userspace watchdog feeder is starved and the BCM2835 hardware watchdog hard-resets the Pi after 60s. No oops, hung_task warning, or softlockup is logged; the kernel never reaches a logging path.

Reading from the USB SSD alone (e.g. to /dev/null) is fine at ~390 MB/s. Writing to internal storage alone is fine at ~450 MB/s. The combination hangs within a few seconds, at variable points (~300 MB to ~1.2 GB transferred), suggesting a scheduler / kernel resource race rather than a deterministic threshold.

Steps to reproduce the behaviour

  1. Mount any USB SSD with a large file on it. I used a Samsung T7 PSSD (04e8:4001).
sudo mount /dev/sda1 /mnt/usb
  2. Confirm USB read alone works (Test A, should complete at ~390 MB/s):
sudo dd if=/mnt/usb/large.mkv of=/dev/null bs=4M count=2000 status=progress
  3. Confirm internal-storage write alone works (Test B, should complete at ~450 MB/s):
sudo dd if=/dev/zero of=/some/local/path/test.bin bs=4M count=2000 conv=fdatasync status=progress
  4. Combine them (Test C, Pi hard-locks within a few seconds, hardware watchdog reboots it after 60s):
sudo dd if=/mnt/usb/large.mkv of=/some/local/path/test.mkv bs=4M status=progress

The destination filesystem doesn't matter. Reproduces with:

  • dd ... oflag=direct (rules out page cache / writeback)
  • destination on tmpfs (/dev/shm): rules out NVMe / dm-crypt / any disk
  • destination on a non-encrypted partition
  • taskset -c 1,2,3 dd ... (rules out CPU0/IRQ contention with userspace)
  • USB 2.0 port (BOT mode, no UAS): rules out UAS specifically
  • The other USB 3.0 port (different xHCI controller in RP1): same hang
  • coherent_pool=16M override (default 1M): no help, possibly worse
  • rsync / cp / dd: all the same outcome
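Since the hang leaves nothing in the kernel log, one way to keep track of which variant was running when the Pi locked up is to write a marker to persistent storage and flush it before each copy. The following is only a sketch of that idea: the function name run_variant and the log path in the usage comments are made up for illustration, and it assumes GNU dd (for status=none).

```shell
#!/bin/sh
# run_variant LOG NAME SRC DST [extra dd operands...]
# Appends "START NAME" to LOG and syncs before copying, so the marker
# should survive a hard reset. "DONE NAME" or "FAIL NAME" is appended
# only if dd returns.
run_variant() {
    log=$1 name=$2 src=$3 dst=$4
    shift 4
    printf 'START %s\n' "$name" >> "$log"
    sync    # flush the marker to disk before the copy begins
    if dd if="$src" of="$dst" bs=4M status=none "$@"; then
        printf 'DONE %s\n' "$name" >> "$log"
    else
        printf 'FAIL %s\n' "$name" >> "$log"
    fi
    sync
}

# Hypothetical usage, mirroring the variants listed above:
#   run_variant /var/tmp/hang.log null   /mnt/usb/large.mkv /dev/null
#   run_variant /var/tmp/hang.log shm    /mnt/usb/large.mkv /dev/shm/test.mkv
#   run_variant /var/tmp/hang.log direct /mnt/usb/large.mkv /some/local/path/test.mkv oflag=direct
```

After the watchdog reboot, a START line with no matching DONE or FAIL line identifies the variant that hung.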

Device(s)

Raspberry Pi 5

System

https://pastebin.com/pqLCTQ3b

OS:

Raspberry Pi reference 2026-04-13
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, b80caf532a22578ab8b2cfaa2d0368080b509a31, stage2

Firmware version:

2026/02/06 14:31:40 
Copyright (c) 2012 Broadcom
version 8124798b (release) (embedded)

Kernel version:

Linux cloud 6.12.75+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.12.75-1+rpt1 (2026-03-11) aarch64 GNU/Linux

The setup: Pi 5 8GB, kernel 6.12.75+rpt-rpi-2712, Samsung T7 PSSD on UAS, NVMe Samsung 990 PRO via PCIe (encrypted with LUKS)

Logs

Because the hardware watchdog hard-resets the Pi, nothing is logged about the hang itself: journalctl -k -b -1 (previous boot) shows no hung_task, no softlockup, no panic, no oops. The only artefact post-reboot is:

[   92.767630] exFAT-fs (sda1): Volume was not properly unmounted.

…confirming the SSD was force-disconnected by the hardware reset.

/proc/cmdline (firmware-injected + cmdline.txt):

reboot=w coherent_pool=1M 8250.nr_uarts=1 pci=pcie_bus_safe cgroup_disable=memory
numa_policy=interleave nvme.max_host_mem_size_mb=0 numa=fake=8 system_heap.max_order=0
iommu_dma_numa_policy=interleave smsc95xx.macaddr=… vc_mem.mem_base=0x3fc00000
vc_mem.mem_size=0x40000000 console=ttyAMA10,115200 console=tty1 root=/dev/mapper/cryptroot
rootfstype=ext4 fsck.repair=yes rootwait cfg80211.ieee80211_regdom=FR lsm=apparmor
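
Because the command line is assembled from firmware-injected parameters plus cmdline.txt, it can help to check which value of a parameter such as coherent_pool actually reached the kernel. A minimal sketch; the helper name cmdline_param is hypothetical and not part of any tool:

```shell
#!/bin/sh
# cmdline_param NAME [FILE]
# Prints the value of every occurrence of NAME= in FILE (default
# /proc/cmdline), one per line. A bare flag without "=" prints its own name.
cmdline_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" |
        awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); print }'
}

# Example: cmdline_param coherent_pool
```

If a parameter is printed more than once, it was supplied by more than one source, which is worth noting when testing an override like coherent_pool=16M.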

xHCI IRQ distribution from /proc/interrupts:

137:    15935    0    0    0    rp1_irq_chip  31 Edge   xhci-hcd:usb1
142:    56574    0    0    0    rp1_irq_chip  36 Edge   xhci-hcd:usb3

All on CPU0, and echo 2 > /proc/irq/137/smp_affinity returns -EIO (known RP1 limitation, ref #6898). NVMe MSI-X queues are correctly distributed one-per-CPU.
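
For anyone comparing against their own board, the per-CPU counts for the xHCI interrupts can be pulled out with a few lines of awk. This is a sketch that assumes the usual /proc/interrupts layout (IRQ number, one count column per CPU, then chip and handler name); the function name xhci_irq_spread is invented for illustration.

```shell
#!/bin/sh
# xhci_irq_spread [FILE]
# For each xhci-hcd line in FILE (default /proc/interrupts), prints the
# IRQ number, the handler name, and the interrupt count seen by each CPU.
xhci_irq_spread() {
    awk '/xhci-hcd/ {
        irq = $1
        sub(/:$/, "", irq)
        out = irq " " $NF
        # Count columns are the leading purely numeric fields after the IRQ.
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
            out = out " cpu" (i - 2) "=" $i
        print out
    }' "${1:-/proc/interrupts}"
}
```

Fed the two lines quoted above, it prints `137 xhci-hcd:usb1 cpu0=15935 cpu1=0 cpu2=0 cpu3=0` and the equivalent line for IRQ 142.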

Additional context

Possibly related issues: #5753 (xHCI dies during disk-to-disk rsync), #6055 (cmd cmplt err -71 on Pi 5), #6433 (UAS aborts under load), #7080 (RTL9210 NVMe disconnects).

What sets this report apart: I've ruled out the destination side entirely. Test E (dd USB → /dev/shm) also hard-hangs the Pi, and there is no NVMe, no dm-crypt, no real disk on the destination. So the bug is NOT in any one subsystem (UAS, dm-crypt, NVMe, exFAT, page cache, coherent_pool, USB 3 LPM, CPU0 IRQ contention); all of those have been individually ruled out by the tests above. The trigger is specifically: xHCI sustained read traffic + a concurrent userspace process doing actual work between syscalls. The destination just needs to not be /dev/null (which is a no-op).

The combination is what's broken. Since I can reproduce it, I'm happy to run further diagnostics, such as perf events, ftrace on a kgdb-attached console, or kdb on serial, if a maintainer can suggest what to look for.
