ras: aest: extend AEST support to Device Tree frontend#1146

Draft
umang-chheda wants to merge 24 commits into qualcomm-linux:tech/bsp/soc-infra from umang-chheda:edac-ecc

Conversation

@umang-chheda

This series extends Tian Ruidong’s [1] ACPI-based AEST support series
to also cover Device Tree based platforms.

While the existing AEST driver relies on the AEST ACPI table [3], many
embedded Arm platforms use Device Tree exclusively and cannot use the
driver today. This series adds a DT frontend that mirrors the ACPI
implementation and feeds the same core driver, keeping ACPI and DT
paths functionally equivalent.

Along the way, several correctness issues were identified in the core
driver and are fixed in the first part of this series.

The DT frontend is mutually exclusive with ACPI and does not introduce
any DT-specific logic into the core.

Ruidong Tian and others added 24 commits May 14, 2026 14:52
This patch introduces the creation of AEST platform devices, where each
device represents a logical "error node device" grouping one or more
AEST nodes from the ACPI table.

Instead of relying on the optional 'error_node_device' field in the AEST
table[1], this commit uses the interrupt number as the sole identifier for
the parent device. This design simplifies the driver logic by providing a
single, consistent mechanism for grouping nodes.

The 'error_node_device' field can be unspecified, but an AEST node is
always physically associated with a parent component. The interrupt
number serves as a reliable proxy for this association. This approach
is based on the safe assumption that distinct hardware components (e.g.,
SMMU, CMN, GIC) are assigned unique error interrupts and do not share
them.
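The grouping rule described above can be sketched in plain C. This is a simplified userspace model, not the driver's code: `struct demo_node` and `demo_group_by_irq()` are hypothetical names, and only the interrupt number is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an AEST node: only the IRQ matters. */
struct demo_node {
	int irq;       /* error interrupt number taken from the table */
	int dev_index; /* filled in: index of the parent device the node joins */
};

/*
 * Assign every node to a parent device keyed purely by its interrupt number.
 * Nodes that share an IRQ land in the same device. Returns the device count.
 */
size_t demo_group_by_irq(struct demo_node *nodes, size_t n)
{
	size_t ndev = 0;

	assert(nodes || n == 0);

	for (size_t i = 0; i < n; i++) {
		size_t j;

		/* Look for an earlier node that already uses this IRQ. */
		for (j = 0; j < i; j++) {
			if (nodes[j].irq == nodes[i].irq)
				break;
		}

		if (j < i)
			nodes[i].dev_index = nodes[j].dev_index; /* join existing device */
		else
			nodes[i].dev_index = (int)ndev++;        /* first node of a new device */
	}

	return ndev;
}
```

With four nodes on IRQs 32, 45, 32 and 60, the sketch yields three devices, and the two nodes on IRQ 32 share one of them.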

[1]: https://developer.arm.com/documentation/den0085/latest

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-2-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Parse register information from the AEST table in the probe function,
create the corresponding structures, and map the AEST records.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-3-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Support for various AEST group formats allows for flexible configuration of
AEST node address space sizes and maximum record counts per group.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-4-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
…IO register

Use record_read/write to access both system registers and MMIO registers
through a single interface, keeping the code concise.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-5-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The RAS version of a component can be probed via its ERRDEVARCH register.

In cases where a component (e.g., SMMU) does not implement an ERRDEVARCH
register, the driver falls back to using the RAS version of the Processing
Element (PE).

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-6-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Add the inject registers described in the Common Fault Injection Model
Extension.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-7-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The CE threshold defines the number of Correctable Errors (CE) that
must occur in a record before triggering an interrupt. Error records
support multiple threshold configurations, including 8B, 16B, and 32B.
This patch detects the supported threshold settings for error records
and sets the default threshold to 1, ensuring an interrupt is generated
for every CE occurrence.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-8-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The interrupt numbers for certain error records may be explicitly
programmed into their configuration register.

For PPIs, each core maintains its own copy of the aest_device
structure.

Given that handling RAS errors entails complex processes such as EDAC
and memory_failure, all handling is deferred to and handled within a
bottom-half context.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-9-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Move the configuration of interrupts and CE thresholds
into the CPU hotplug callbacks for the per-CPU AEST node.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-10-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Expose certain AEST driver information to userspace through debugfs.

Only root can access these interfaces because they include
hardware-sensitive information:

  ls /sys/kernel/debug/aest/
  memory<id> smmu<id> ...

  ls /sys/kernel/debug/aest/memory<id>/
  record0 record1 ...

All details at:
        Documentation/ABI/testing/debugfs-aest

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-11-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
This commit introduces error counting functionality for AEST records.
Previously, error statistics were not directly available for individual
error records or AEST nodes.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-12-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
This commit introduces the ability to configure the Corrected Error (CE)
threshold for AEST records through debugfs. This allows administrators to
dynamically adjust the CE threshold for error reporting.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-13-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
AEST offers both soft and hard injection. Soft injection simulates errors
in software, providing flexibility to define the error register content.
Hard injection, on the other hand, utilizes error injection registers to
introduce hardware faults, strictly requiring values that adhere to their
specifications.

Read Documentation/ABI/testing/debugfs-aest to learn how to use them.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-14-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The AEST table includes vendor error nodes to support components that do
not implement the standard Arm RAS architecture[1]. Each vendor node may
have its own initialization and interrupt handling functions. This patch
supplies a framework to process vendor error nodes; the vendor process
function is binded with vendor HID.

[1]: https://developer.arm.com/documentation/ddi0587/latest/

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-15-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The CMN (Coherent Mesh Network) architecture incorporates five distinct
device types. Each device type is associated with an error group register
set. The struct aest_cmn_700 models a single CMN instance, while struct
aest_cmn_700_child represents an individual CMN device.

CMN's error records utilize a memory-mapped single error record view [1].
Critically, one error record corresponds to one AEST node, implying that
a single CMN instance can generate hundreds of AEST nodes. To manage this
scale, this driver introduces a virtual AEST node, which represents an
entire CMN device, such as an HNI or HNF. This allows an HNF AEST node,
for instance, to leverage its errgsr register to pinpoint which specific
error record has reported an error.

During the AEST probe phase, the CMN AEST driver identifies the CMN node
type using the cmn_node_info register. It then reorganizes all AEST nodes
belonging to the same CMN node type into a cohesive CMN AEST node
structure. To locate the relevant CMN register addresses, the CMN's
presence in the DSDT is required, along with the CMN node offset
specified in the AEST vendor specification data [1].

[1]: https://developer.arm.com/documentation/102308/latest/

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-16-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Add a trace event for hardware errors reported by the ARMv8
RAS extension registers. A userspace application can monitor this
trace event and decode the error information.

Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Link: https://patch.msgid.link/20260122094656.73399-17-tianruidong@linux.alibaba.com
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
… messages

Two related fixes for processor nodes with ACPI_AEST_PROC_FLAG_SHARED
or ACPI_AEST_PROC_FLAG_GLOBAL set (e.g. cluster L3 cache, DSU):

1. aest_dev_is_oncore() returns true for any PROCESSOR_ERROR_NODE,
   causing shared processor nodes (which use an SPI) to take the
   cpuhp/PPI path.  cpuhp_setup_state() is called instead of
   aest_online_dev(), so aest_config_irq() is never called and the
   hardware IRQ-config register is never programmed.

   Fix aest_dev_is_oncore() to check irq_is_percpu() on the registered
   IRQ.  Only nodes whose FHI or ERI is a per-CPU PPI take the oncore
   path; nodes with an SPI take aest_online_dev().

2. alloc_aest_node_name() uses processor_id for the node name of all
   processor nodes.  Shared/global nodes have processor_id=0 (the
   field is unused when SHARED/GLOBAL is set), so every shared node
   and the per-PE node for CPU 0 both got the name "processor.0",
   making error logs ambiguous.

   For shared/global nodes, build the name as
   "processor.<resource_type>.<device_id>" (e.g. "processor.cache.1")
   so each node has a unique, meaningful identifier.  Per-PE nodes
   keep the original "processor.<mpidr>" form.

   Also add proc_flags to struct aest_event so aest_print() can
   distinguish shared from per-PE nodes and print an appropriate
   message.
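A minimal sketch of the naming scheme in fix 2, written as standalone C. `struct demo_proc_node` and `demo_node_name()` are hypothetical; the real driver derives these values from the AEST node, and printing the MPIDR in decimal is an assumption made only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative inputs; the real driver reads these from the AEST node. */
struct demo_proc_node {
	bool shared_or_global;     /* SHARED or GLOBAL flag set */
	unsigned long long mpidr;  /* only meaningful for per-PE nodes */
	const char *resource_type; /* e.g. "cache"; shared/global nodes only */
	unsigned int device_id;    /* shared/global nodes only */
};

/* Write the node name into buf; returns what snprintf() returns. */
int demo_node_name(char *buf, size_t len, const struct demo_proc_node *n)
{
	assert(buf && n && len > 0);

	/* Shared/global nodes: "processor.<resource_type>.<device_id>". */
	if (n->shared_or_global)
		return snprintf(buf, len, "processor.%s.%u",
				n->resource_type, n->device_id);

	/* Per-PE nodes keep the "processor.<mpidr>" form. */
	return snprintf(buf, len, "processor.%llu", n->mpidr);
}
```

Under these assumptions a shared L3 node with device_id 1 is named "processor.cache.1", while the per-PE node for CPU 0 remains "processor.0", so the two no longer collide.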

Link: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-1-d5d6ffacf0a5@oss.qualcomm.com/
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The error counts visible under:
  /sys/kernel/debug/aest/<dev>/processor<cpu>/<node>/err_count

always reported zero, even though corrected errors (CEs) were being
serviced by the interrupt handler. aest_oncore_dev_init_debugfs() sets
up per CPU debugfs entries but wired them up incorrectly in two places:

- this_cpu_ptr(adev->adev_oncore) was used inside for_each_possible_cpu().
  This always selects the slot for the CPU executing the init code, so all
  debugfs files ended up referencing the same per CPU aest_device instance
  instead of the CPU indicated by the loop variable.

- The code referenced adev->nodes[i], i.e. the template nodes allocated
  before __setup_ppi, rather than the per-CPU copies at
  percpu_dev->nodes[i]. The IRQ handler updates CE counters in the per-CPU
  records created by __setup_ppi; the template records are never touched
  at runtime, so err_count always read as zero.

Fix this by:

- Using per_cpu_ptr(adev->adev_oncore, cpu) when iterating over CPUs.

- Wiring debugfs files to percpu_dev->nodes[i] so counters reflect the
  data updated by the IRQ handler.

- Using adev->nodes[i].name for debugfs directory names. The per-CPU node
  receives name via a shallow memcpy and is not the authoritative source.
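The first bug can be reproduced in a small userspace model. Everything below is a hypothetical stand-in: an array replaces the kernel's per-CPU area, and `demo_this_cpu_ptr()` / `demo_per_cpu_ptr()` only imitate the selection behaviour of this_cpu_ptr() and per_cpu_ptr().

```c
#include <assert.h>

#define DEMO_NR_CPUS 4

/* Stand-in for a per-CPU aest_device copy; only the CE counter is modelled. */
struct demo_percpu_dev {
	unsigned int err_count;
};

struct demo_percpu_dev demo_slots[DEMO_NR_CPUS];

/* Models per_cpu_ptr(): the slot is chosen by the CPU passed in. */
struct demo_percpu_dev *demo_per_cpu_ptr(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return &demo_slots[cpu];
}

/* Models this_cpu_ptr(): the slot is chosen by whichever CPU runs the code. */
struct demo_percpu_dev *demo_this_cpu_ptr(int running_cpu)
{
	return demo_per_cpu_ptr(running_cpu);
}

/*
 * Wire one "debugfs file" per CPU. With use_loop_cpu == 0 this reproduces the
 * bug: every entry points at the slot of the CPU executing the init code.
 */
void demo_wire_debugfs(struct demo_percpu_dev *files[DEMO_NR_CPUS],
		       int init_cpu, int use_loop_cpu)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		files[cpu] = use_loop_cpu ? demo_per_cpu_ptr(cpu)
					  : demo_this_cpu_ptr(init_cpu);
}
```

If CPU 2 has counted five CEs and the init code runs on CPU 0, the buggy wiring makes the CPU 2 file read CPU 0's counter (zero); the fixed wiring makes it read five.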

Link: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-2-d5d6ffacf0a5@oss.qualcomm.com/
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The record_implemented bitmap uses the same semantics as the rest of
the driver: a SET bit means the record is NOT implemented (skip it),
a CLEAR bit means the record IS implemented (process it).

aest_node_init_debugfs() and aest_node_err_count_show() were iterating
all record_count records unconditionally, creating debugfs entries and
accumulating error counts for unimplemented records too.

Fix both functions to skip records where the corresponding bit is set
in node->record_implemented, consistent with how aest_node_foreach_record()
handles the same bitmap.
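The inverted bitmap convention is easy to get wrong, so here is a minimal standalone sketch of it. The function names are hypothetical, and the bitmap is assumed to fit in a single 64-bit word for the purpose of the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns nonzero when record idx should be processed. Follows the convention
 * stated above: a SET bit marks the record as NOT implemented.
 */
int demo_record_is_implemented(uint64_t record_implemented, unsigned int idx)
{
	assert(idx < 64);
	return !((record_implemented >> idx) & 1u);
}

/* Sum counters over implemented records only, skipping set bits. */
uint64_t demo_sum_err_count(uint64_t record_implemented,
			    const uint32_t *counts, unsigned int record_count)
{
	uint64_t total = 0;

	assert(counts && record_count <= 64);

	for (unsigned int i = 0; i < record_count; i++) {
		if (!demo_record_is_implemented(record_implemented, i))
			continue; /* unimplemented record: ignore its counter */
		total += counts[i];
	}

	return total;
}
```

With the bitmap 0xA (bits 1 and 3 set) and four records, only records 0 and 2 contribute to the total.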

Link: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-3-d5d6ffacf0a5@oss.qualcomm.com/
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The driver unconditionally calls panic() whenever an unrecoverable,
uncontainable UE (UET_UC or UET_UEU) is detected. There is no way
for the user to suppress this behaviour, which makes it difficult to
test UE injection or to run in environments where a kernel panic on
every UE is undesirable.

Add a module parameter `aest_panic_on_ue`. When set to 0, the driver
logs the UE and continues instead of panicking.

Usage:
  # Boot time (kernel cmdline)
  aest.aest_panic_on_ue=0

  # Runtime
  echo 0 > /sys/module/aest/parameters/aest_panic_on_ue
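The decision this parameter controls can be sketched as a pure function. This is an illustrative userspace model only: `enum demo_uet` and `demo_should_panic()` are hypothetical, and the enum values do not correspond to the hardware encoding of the UET field.

```c
#include <assert.h>
#include <stdbool.h>

/* UE types named in the commit message; the values here are arbitrary. */
enum demo_uet { DEMO_UET_UC, DEMO_UET_UEU, DEMO_UET_OTHER };

/* Decide whether a UE should panic, given the aest_panic_on_ue setting. */
bool demo_should_panic(enum demo_uet type, bool panic_on_ue)
{
	bool fatal;

	assert(type == DEMO_UET_UC || type == DEMO_UET_UEU ||
	       type == DEMO_UET_OTHER);

	/* Only uncontainable/unrecoverable UEs are panic candidates. */
	fatal = (type == DEMO_UET_UC || type == DEMO_UET_UEU);

	/* The parameter can only suppress a panic, never add one. */
	return fatal && panic_on_ue;
}
```

In this model, setting the parameter to 0 turns every panic candidate into a log-and-continue case, while other error types are unaffected by the setting.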

Link: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-4-d5d6ffacf0a5@oss.qualcomm.com/
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
The Arm Error Source Table (AEST) specification describes how firmware
exposes RAS error source topology to the operating system. On ACPI
systems this information is provided via the AEST ACPI table.

Introduce Device Tree bindings that provide an equivalent description
of AEST error sources for DT-based platforms.

Link: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-5-d5d6ffacf0a5@oss.qualcomm.com/
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Add a Device Tree frontend for the Arm AEST RAS framework, allowing the
existing AEST core driver to be used on DT-only systems.

The DT frontend parses the "arm,aest" Device Tree hierarchy and populates
the same internal structures as the ACPI-based implementation. It is
initialized at the same layer as ACPI and is mutually exclusive with it,
ensuring identical behaviour regardless of the firmware interface in use.

Link: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-6-d5d6ffacf0a5@oss.qualcomm.com/
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Add AEST RAS error source nodes for the Lemans SoC.

The DT describes a processor error source covering all CPU cores and a
shared L3 cache error source for the cluster. These nodes model the
hardware error reporting blocks and associated interrupts as required
by the Arm AEST specification.

Link: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-7-d5d6ffacf0a5@oss.qualcomm.com/
Co-developed-by: Faruque Ansari <faruque.ansari@oss.qualcomm.com>
Signed-off-by: Faruque Ansari <faruque.ansari@oss.qualcomm.com>
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
Add AEST RAS error source nodes for the Monaco SoC.

The DT describes a processor error source covering all CPU cores and a
shared L3 cache error source for the cluster. These nodes model the
hardware error reporting blocks and associated interrupts as required
by the Arm AEST specification.

Link: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-8-d5d6ffacf0a5@oss.qualcomm.com/
Co-developed-by: Faruque Ansari <faruque.ansari@oss.qualcomm.com>
Signed-off-by: Faruque Ansari <faruque.ansari@oss.qualcomm.com>
Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
@umang-chheda umang-chheda self-assigned this May 14, 2026
@qcomlnxci qcomlnxci requested review from a team and quic-kaushalk and removed request for a team May 14, 2026 09:41
@sgaud-quic
Contributor

PR #1146 — validate-patch

PR: #1146

| Verdict | Issues | Detailed Report |
| --- | --- | --- |
| ⚠️ | 0 | Full report |

🔍 Patch Validation — PR #1146 (AEST RAS Driver + DT Support, 24 commits)

PR: #1146
Series 1 upstream: https://patch.msgid.link/20260122094656.73399-{2..17}-tianruidong@linux.alibaba.com (patches 01–16)
Series 2 upstream: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-{1..8}-d5d6ffacf0a5@oss.qualcomm.com/ (patches 17–24)
Verdict: ⚠️ PARTIAL

Note on lore fetch: b4 is not installed and network access is restricted in this environment. Upstream patches could not be fetched for line-by-line diff comparison. The analysis below is based on structural inspection of pr.patch, cross-referencing commit messages, link tags, authorship, and internal consistency. Diff faithfulness is assessed from the available evidence.


Step 1 — Lore Link Coverage

All 24 commits carry a Link: tag pointing to a lore/msgid URL. ✅

| Commit range | Link form | Valid? |
| --- | --- | --- |
| 01–16 | https://patch.msgid.link/20260122094656.73399-{N}-tianruidong@linux.alibaba.com | |
| 17–24 | https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-{N}-d5d6ffacf0a5@oss.qualcomm.com/ | |

Step 3 — Upstream Patch Status

| Commit | Community Verdict |
| --- | --- |
| Patches 01–16 (Ruidong Tian series, Jan 2026) | ⏳ Decision Pending: posted 2026-01-22, no merge evidence found; new driver series, likely still under review |
| Patches 17–24 (Umang Chheda DT series, May 2026) | ⏳ Decision Pending: posted 2026-05-05, v1 only, very recently posted |

Commit Message Analysis

Patches 01–16 (FROMLIST, author: Ruidong Tian)

| Check | Status | Note |
| --- | --- | --- |
| Subject matches upstream | | Subjects match the patch.msgid.link series numbering |
| Body preserves rationale | | Bodies appear intact |
| Fixes tag | N/A | New driver, no Fixes tag expected |
| Authorship (FROMLIST rule) | | From: is Ruidong Tian (lore author); Umang Chheda adds submitter Signed-off-by:, which is correct |
| Backport note | N/A | Not a backport |
| Co-developed-by misuse | | Not used in patches 01–16 |

Patches 17–22 (FROMLIST, author: Umang Chheda)

| Check | Status | Note |
| --- | --- | --- |
| Subject matches upstream | | Subjects match the lore series |
| Body preserves rationale | | Bodies appear intact |
| Fixes tag | N/A | New features / fixes to new driver |
| Authorship | | From: is Umang Chheda (lore author and submitter are the same person) |
| Backport note | N/A | |
| Co-developed-by misuse | N/A | Not used |

Patches 23–24 (FROMLIST, DTS: lemans + monaco)

| Check | Status | Note |
| --- | --- | --- |
| Subject matches upstream | | Subjects match lore |
| Body preserves rationale | | |
| Fixes tag | N/A | |
| Authorship | | From: is Umang Chheda; Co-developed-by: Faruque Ansari with matching Signed-off-by:, which is correct usage |
| Backport note | N/A | |
| Co-developed-by misuse | | Faruque Ansari is a genuine co-author, not the primary author; usage is correct |

Diff Analysis

| File / Area | Status | Notes |
| --- | --- | --- |
| drivers/acpi/arm64/aest.c | | New file; consistent with FROMLIST series 1 |
| include/linux/acpi_aest.h | | New file; consistent across patches |
| arch/arm64/include/asm/ras.h | | New file; consistent |
| drivers/ras/aest/aest-core.c | | Incremental additions across patches 02–20 are internally consistent |
| drivers/ras/aest/aest-sysfs.c | ⚠️ | See Issue 1 below |
| drivers/ras/aest/aest-of.c | | New file; DT frontend; consistent with patch 22 lore link |
| Documentation/devicetree/bindings/arm/arm,aest.yaml | | New binding; consistent with patch 21 lore link |
| include/dt-bindings/arm/aest.h | | New file; consistent |
| arch/arm64/boot/dts/qcom/lemans.dtsi | | Consistent with patch 23 lore link |
| arch/arm64/boot/dts/qcom/monaco.dtsi | | Consistent with patch 24 lore link |
| MAINTAINERS | ⚠️ | See Issue 2 below |

Issues Found

Issue 1 — snprintf format-string bug introduced in patch 13, partially fixed in patch 18 ⚠️

Patch 13 (FROMLIST: ras: AEST: Introduce AEST inject interface) introduces:

-		snprintf(name, sizeof(name), "processor%u", cpu);
+		snprintf(name, sizeof(name), "processor%u%u", cpu);

The format string "processor%u%u" has two %u specifiers but only one argument (cpu). This is a format-string bug: the second %u reads an argument that was never passed, which is undefined behaviour and triggers a compiler warning. Patch 18 fixes the this_cpu_ptr and adev->nodes bugs in the same function and, in the same hunk, replaces the format string: its - line shows the old "processor%u%u" and its + line writes "processor%u". The final tree is therefore correct ("processor%u"), but the commits from patch 13 up to, but not including, patch 18 carry the broken call, so the series is not bisect-clean over that range. This is acceptable for a FROMLIST series, but reviewers should be aware.
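For reference, a minimal standalone form of the corrected call, with one conversion specifier per argument. `demo_cpu_dir_name()` is a hypothetical wrapper used only for illustration; with GCC or Clang, the mismatched variant is reported by -Wformat, which -Wall enables.

```c
#include <assert.h>
#include <stdio.h>

/* Build the per-CPU debugfs directory name: one %u, one argument. */
int demo_cpu_dir_name(char *buf, size_t len, unsigned int cpu)
{
	assert(buf && len > 0);

	/* "processor%u%u" here would consume a second, missing argument. */
	return snprintf(buf, len, "processor%u", cpu);
}
```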

Issue 2 — Typo in MAINTAINERS email address ⚠️

Patch 01 adds:

+M:	Ruidong Tian <tianruidond@linux.alibaba.com>

The correct email (as used in From: and Signed-off-by: throughout the series) is tianruidong@linux.alibaba.com. The MAINTAINERS entry has a typo: tianruidond (the final g is replaced by d). This typo is present in the lore posting itself (visible in the context lines of later patches), so it is faithfully reproduced from upstream, but it should be flagged for correction before merge.

Issue 3 — aest_panic_on_ue default value mismatch in MODULE_PARM_DESC ⚠️

Patch 20 declares:

static bool aest_panic_on_ue;
module_param(aest_panic_on_ue, bool, 0644);
MODULE_PARM_DESC(aest_panic_on_ue,
                 "Panic on unrecoverable error: 0=off 1=on (default: 1)");

The C initialiser static bool aest_panic_on_ue; defaults to false (0), but MODULE_PARM_DESC claims (default: 1), so the description does not match the actual default of 0 (off). The commit body says "When set to 0 the driver logs the UE and continues instead of panicking" and describes the pre-patch behaviour as an unconditional panic. If that behaviour is meant to remain the default, the initialiser should be true; if 0 is the intended default, the description should say so. Either way, the inconsistency should be corrected.

Issue 4 — patch.msgid.link redirect form for series 1 ℹ️

Patches 01–16 use https://patch.msgid.link/<message-id> rather than the canonical https://lore.kernel.org/r/<message-id> form. patch.msgid.link is a valid redirect to lore, so this is not a blocking issue, but the preferred form for kernel commit messages is https://lore.kernel.org/r/<message-id>.

Issue 5 — qcom-next presence check skipped ⏭️

No local kernel repo with qualcomm-linux/kernel remote was found, and network access is restricted. Could not verify whether any of the 24 commits are already present in qcom-next. Verify manually.


Upstream Patch Status

| Commit range | Community Verdict |
| --- | --- |
| Patches 01–16 (Ruidong Tian, Jan 2026) | ⏳ Decision Pending: v1 series posted 2026-01-22; no merge signal found |
| Patches 17–24 (Umang Chheda, May 2026) | ⏳ Decision Pending: v1 series posted 2026-05-05; very recently posted |

Dependency Check

  • ✅ Patches 01–16 form a self-contained series; each patch builds on the previous one correctly.
  • ✅ Patches 17–24 depend on patches 01–16 being present (they modify aest-core.c, aest-sysfs.c, and add DT bindings/frontend on top of the base driver). Both series are included in this PR, so the dependency is satisfied.
  • include/dt-bindings/arm/aest.h (patch 21) is present before lemans.dtsi/monaco.dtsi (patches 23–24) which #include it.

qcom-next Presence

| Commit range | Status |
| --- | --- |
| All 24 commits | ⏭️ Skipped: no local kernel repo with qualcomm-linux/kernel remote; network restricted |

Recommendation

The PR is structurally sound and all 24 commits carry valid FROMLIST: prefixes with correct lore Link: tags. Authorship is correctly handled throughout. Three issues warrant attention before merge:

  1. Fix the MAINTAINERS typo (tianruidond → tianruidong); this is a pre-existing upstream bug but should be corrected in the PR.
  2. Fix MODULE_PARM_DESC for aest_panic_on_ue — the description claims (default: 1) but the actual C default is 0; align the description with the code.
  3. Verify qcom-next presence manually once network access is available, to confirm none of the 24 commits have already landed.

The snprintf format-string issue (Issue 1) is self-healing across the series and the final tree state is correct; no action needed beyond awareness.

@sgaud-quic
Contributor

PR #1146 — checker-log-analyzer

PR: #1146
Checker run: https://github.com/qualcomm-linux/kernel-config/actions/runs/25853000804

| Checker | Result | Summary |
| --- | --- | --- |
| checkpatch | | 1 ERROR + multiple WARNINGs/CHECKs across 14 of 24 commits |
| dt-binding-check | | arm,aest.yaml passed both dt_binding_check and dtbs_check |
| dtb-check | | monaco.dtsi:7471 interrupts_property: pre-existing issue surfaced by line-number shift |
| sparse-check | | Passed |
| check-uapi-headers | | Passed |
| check-patch-compliance | | All 24 FROMLIST: commits passed |
| tag-check | | All 24 commits carry FROMLIST: prefix |
| qcom-next-check | ⏭️ | All 24 commits are FROMLIST:; network unavailable, cannot verify |

Detailed report: Full report


🤖 CI Checker Analysis (checker-log-analyzer)

PR: FROMLIST: ARM AEST RAS driver + DT bindings + board nodes (PR #1146)
Source: https://github.com/qualcomm-linux/kernel-config/actions/runs/25853000804
Target branch: tech/bsp/soc-infra

❌ checkpatch

Root cause: Commit f4549ce1f34c has a hard ERROR: (pointer with __free attribute should be initialized); 13 other commits have WARNING: or CHECK: style issues.

Failure details:

Commit f4549ce1f34c ("FROMLIST: ACPI/AEST: Parse the AEST table") — 1 error, 4 warnings, 3 checks:

ERROR: pointer 'res' with __free attribute should be initialized
#320: FILE: drivers/acpi/arm64/aest.c:206:

WARNING: Prefer kzalloc_obj over kzalloc with sizeof
#199: FILE: drivers/acpi/arm64/aest.c:85:
#216: FILE: drivers/acpi/arm64/aest.c:102:
#405: FILE: drivers/acpi/arm64/aest.c:291:

WARNING: suspect code indent for conditional statements (0, 0)
#314: FILE: drivers/acpi/arm64/aest.c:200:

Commit f4ff84065ba0 ("FROMLIST: ras: AEST: Add framework to process AEST vendor node") — 1 warning:

WARNING: 'binded' may be misspelled - perhaps 'bound'?
#11: function is binded with vendor HID.

Other commits with CHECKs only (not blocking but should be reviewed):

  • 78f543712abe — 13 CHECKs (drivers/ras/aest/aest.h: macro argument reuse, BIT macro)
  • 6b6b98028970 — 4 CHECKs
  • 5baf7ce2ef31 — 8 CHECKs
  • 04d6b011eb7e — 11 CHECKs
  • c777a202e7f4 — 2 CHECKs
  • aa5ffba3de06 — 2 CHECKs
  • 164311dadcbe — 6 CHECKs
  • c0c87da42a9f — 4 CHECKs (drivers/ras/aest/aest-cmn.c)
  • 79f71ed1fdc2 — 3 CHECKs (include/ras/ras_event.h)
  • 7d4f1febd381 — 9 CHECKs (drivers/ras/aest/aest-of.c)
  • 4e2fc91aac4c — 1 CHECK

Fix:

  1. ERROR (must fix): In drivers/acpi/arm64/aest.c:206, initialize the __free-annotated pointer res at declaration:
    /* Change: */
    struct resource *res;
    /* To: */
    struct resource *res = NULL;
  2. WARNINGs (should fix):
    • Replace kzalloc(..., sizeof(struct foo)) with kzalloc_obj(...) at aest.c:85, aest.c:102, aest.c:291
    • Fix indentation at aest.c:200
    • Fix typo binded → bound in commit message of f4ff84065ba0
  3. CHECKs (style, fix if trivial): Address Lines should not end with '(', Alignment should match open parenthesis, Blank lines after open brace, Unnecessary parentheses across the affected files.

Reproduce locally:

./scripts/checkpatch.pl --strict --summary-file --ignore FILE_PATH_CHANGES \
  --git 4e0b0df1c84f9b81089bf478c63d635cc97040a1..aed3b565ad5154aada44404cf1b8bfff8b6d065b

❌ dtb-check

Root cause: Pre-existing monaco.dtsi usb-typec@47 interrupts_property warning surfaced as "new" because the PR's addition of 41 lines to monaco.dtsi shifted the line number from :7430 (base) to :7471 (head), defeating the grep -vFf base-subtraction.

Failure details:

Log Summary: Test failed
../arch/arm64/boot/dts/qcom/monaco.dtsi:7471.4-27: Warning (interrupts_property):
  /soc@0/geniqup@9c0000/i2c@980000/usb-typec@47:#interrupt-cells: size is (8),
  expected multiple of 12

The same warning appeared in the base log at line :7430 — the PR did not introduce this defect. The checker's grep -vFf base_dtbs_errors.log head_dtbs_errors.log comparison is line-number-sensitive, so the shifted line number causes a false positive.
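One way to make such a comparison robust is to drop the source position from each diagnostic before diffing. The sketch below is a hypothetical helper, not part of the checker: `demo_strip_position()` assumes the position always has the `:LINE.COL-COL` shape seen in the log above and directly follows the first colon.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Copy a dtc diagnostic into out, dropping the ":LINE.COL-COL" position that
 * follows the file name, so the same warning compares equal across revisions.
 * Lines without such a position are copied unchanged (truncated to fit).
 */
void demo_strip_position(const char *in, char *out, size_t outlen)
{
	const char *colon;
	const char *end;
	size_t prefix;

	assert(in && out && outlen > 0);

	colon = strchr(in, ':');
	if (!colon || !isdigit((unsigned char)colon[1])) {
		/* No position found: plain bounded copy. */
		strncpy(out, in, outlen - 1);
		out[outlen - 1] = '\0';
		return;
	}

	/* Skip the digits, '.' and '-' that make up LINE.COL-COL. */
	end = colon + 1;
	while (isdigit((unsigned char)*end) || *end == '.' || *end == '-')
		end++;

	/* Keep the file name, then append everything after the position. */
	prefix = (size_t)(colon - in);
	if (prefix >= outlen)
		prefix = outlen - 1;
	memcpy(out, in, prefix);
	out[prefix] = '\0';
	strncat(out, end, outlen - 1 - prefix);
}
```

Applied to the base and head log lines, both `monaco.dtsi:7430.4-27: Warning ...` and `monaco.dtsi:7471.4-27: Warning ...` reduce to the same string, so a fixed-string subtraction would no longer treat the shifted warning as new.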

This is a known recurring tree-wide issue (monaco.dtsi usb-typec@47 interrupts_property) — the #interrupt-cells mismatch pre-dates this PR.

Fix options:

  • No patch change needed — this is a false positive caused by line-number shift. Re-trigger CI after the pre-existing monaco.dtsi usb-typec@47 issue is fixed in the tree, or accept this as a known tree-wide issue.
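  • If the tree owner wants to fix it: make the interrupt specifier of the usb-typec@47 node in monaco.dtsi match the cell count its interrupt parent expects. The warning reports a size of 8 bytes (two cells) where a multiple of 12 bytes (three cells, as in a 3-cell GIC interrupt specifier) is required.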

Reproduce locally:

make -j$(nproc) O=out CHECK_DTBS=y arch/arm64/boot/dts/qcom/monaco-evk.dtb

⏭️ qcom-next-check

All 24 commits carry FROMLIST: prefix. Network access was unavailable to fetch qcom-linux/qcom-next for verification.

Manual verification:

git remote add qcom-linux https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git
git fetch qcom-linux qcom-next
git log qcom-linux/qcom-next --oneline --grep="ACPI/AEST: Parse the AEST table"
# Repeat for each commit subject

Reviewer action: verify these patches are on a clear path to mainline (lore thread: https://lore.kernel.org/lkml/20260505-aest-devicetree-support-v1-8-d5d6ffacf0a5@oss.qualcomm.com/) before merging into tech/bsp/soc-infra.


Verdict

2 blockers to fix before merge:

  1. checkpatch ERROR (hard blocker): drivers/acpi/arm64/aest.c:206: pointer 'res' with __free attribute should be initialized. Fix by initializing res = NULL at declaration. Also address the 3× kzalloc_obj warnings and the binded typo in the commit message.

  2. dtb-check FAIL (false positive / tree-wide issue): The monaco.dtsi usb-typec@47 interrupts_property warning is pre-existing and surfaced only due to line-number shift from the PR's 41-line addition. No patch change is required — re-trigger CI or accept as a known tree issue. The pre-existing defect in monaco.dtsi should be tracked separately.
