arm64: dts: rockchip: Add dynamic-power-coefficient to rk3399 GPU #532

Open
LZhaoM wants to merge 1 commit into radxa:linux-6.1-stan-rkr5.1 from LZhaoM:rk3399-gpu-cooling

LZhaoM commented Mar 27, 2026

arm64: dts: rockchip: Add dynamic-power-coefficient to rk3399 GPU

On a Radxa Rock 4B+ board, when executing the following stress test
command for a few hours:
glmark2-es2 -b refract:duration=60 -s 3840x2160 --off-screen --run-forever

The SoC temperature rises gradually until it reaches the critical
temperature and triggers a shutdown.
Kernel log:
kernel: thermal thermal_zone1: gpu-thermal: critical temperature reached, shutting down
kernel: reboot: HARDWARE PROTECTION shutdown (Temperature too high)

The reason is that the GPU node in the device tree does not provide a
dynamic-power-coefficient. Without this value, the GPU cooling device is
not registered[0][1], so the GPU is never throttled and the temperature
eventually reaches the critical threshold.

The value 2640 is based on this upstream patch:
https://lore.kernel.org/all/20231127081511.1911706-1-lukasz.luba@arm.com
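
For illustration, the change amounts to a single property on the GPU node.
The fragment below is a minimal sketch, not the actual diff: the `&gpu`
label and the override style are assumptions, and the property could
equally be placed directly in the GPU node of the SoC dtsi. Per the Mali
Midgard binding, the coefficient is expressed in uW/MHz/V^2.

```dts
/* Sketch only: the &gpu label and the placement of this override are assumptions. */
&gpu {
	/*
	 * Coefficient used by the dynamic power model of the devfreq
	 * cooling device. The value 2640 follows the upstream patch
	 * linked above.
	 */
	dynamic-power-coefficient = <2640>;
};
```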

[0]: https://github.com/radxa/kernel/blob/8582469f117fdfd5d1ab88fa7e4e15c3b714bf24/drivers/thermal/devfreq_cooling.c#L484-L489
[1]: https://github.com/radxa/kernel/blob/8582469f117fdfd5d1ab88fa7e4e15c3b714bf24/drivers/opp/of.c#L1594-L1606

RadxaYuntian (Member) left a comment

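For changes that come from upstream, it is generally better to cherry-pick them where possible, so that the original commit message is preserved.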
