[PATCH v2 0/3] Prevent set of DCEFCLK on smu_v11 gpus

Feng, Kenneth Kenneth.Feng at amd.com
Fri Apr 23 05:24:16 UTC 2021


[AMD Official Use Only - Internal Distribution Only]

Series are Reviewed-by: Kenneth Feng <kenneth.feng at amd.com>


-----Original Message-----
From: Powell, Darren <Darren.Powell at amd.com> 
Sent: Friday, April 23, 2021 11:23 AM
To: amd-gfx at lists.freedesktop.org
Cc: Powell, Darren <Darren.Powell at amd.com>
Subject: [PATCH v2 0/3] Prevent set of DCEFCLK on smu_v11 gpus

=== Description ===
Set of simple patches to prevent attempts to set dcefclk on NAVI10

=== Test System ===
* DESKTOP(AMD FX-8350 + NAVI10(731F/ca), BIOS: F2)  + ISO(Ubuntu 20.04.1 LTS)  + Kernel(5.11.0-custom-amdinternal-dirty)

=== Patch Summary ===
   linux: (git at gitlab.freedesktop.org:agd5f) origin/amd-staging-drm-next @ b54280b32ebb
    + 599f1ebb60cc  amdgpu/pm: add extra info to SMU msg pre-check failed message
    + 291dcf836f45  amdgpu/pm: Prevent force of DCEFCLK on NAVI10 and SIENNA_CICHLID
    + c8ce10fc1d99  amdgpu/pm: set pp_dpm_dcefclk to readonly on NAVI10 and newer gpus

=== Tests ===
General Test Sequence
---------------------
* monitor dmesg output in a shell
  dmesg -w

* launch a root shell
  sudo bash

* set control to manual
  cd /sys/class/drm/card0/device
  echo manual > power_dpm_force_performance_level

* next step is expected to crash the GPU in unpatched and with patch 0001
** system usually continues operation so you can reboot gracefully

* TEST 1: modify pp_dpm_dcefclk to each level (0,1,2) and read setting after each write
  echo "1" > pp_dpm_dcefclk ; sleep 2 ; echo " ---set 1---" ; cat  pp_dpm_dcefclk ;\
  echo "2" > pp_dpm_dcefclk ; sleep 2 ; echo " ---set 2---" ; cat  pp_dpm_dcefclk ;\
  echo "0" > pp_dpm_dcefclk ; sleep 2 ; echo " ---set 0---" ; cat  pp_dpm_dcefclk
** example output
  [   74.493190] amdgpu 0000:03:00.0: amdgpu: failed send message: SetSoftMaxByFreq (27)  param: 0x000504f2 response 0xff
  [   76.497102] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
  [   76.497114] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!
  [   76.497649] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
  [   78.501229] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
  [   78.501241] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!
  [   78.501766] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
  [   80.505401] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
  [   80.505414] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!

* TEST 2:
   ls -la  /sys/class/drm/card0/device/pp_dpm_dcefclk
** example output
  -rw-r--r-- 1 root root 4096 Apr  7 18:33 /sys/class/drm/card0/device/pp_dpm_dcefclk

* POST TEST
** restore dpm clock to auto
  echo auto > power_dpm_force_performance_level


Test Results
------------
* 0001 amdgpu/pm: add extra info to SMU msg pre-check failed message
** TEST 1 dmesg output
  [  101.414826] amdgpu 0000:03:00.0: amdgpu: failed send message: SetSoftMaxByFreq (27) 	param: 0x000504f2 response 0xff
  [  103.418916] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!
  [  103.418930] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!
  [  103.419474] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!
  [  105.423226] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!
  [  105.423239] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!
  [  105.423649] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!
  [  107.427502] amdgpu 0000:03:00.0: amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!
  [  107.427516] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!

* 0002  amdgpu/pm: Prevent force of DCEFCLK on NAVI10 and SIENNA_CICHLID
** GPU remains operational after test
** TEST 1 dmesg output
  [  263.087136] amdgpu 0000:03:00.0: amdgpu: Setting DCEFCLK min/max dpm level is not supported!
  [  265.092026] amdgpu 0000:03:00.0: amdgpu: Setting DCEFCLK min/max dpm level is not supported!
  [  267.096648] amdgpu 0000:03:00.0: amdgpu: Setting DCEFCLK min/max dpm level is not supported!

* 0003  amdgpu/pm: set pp_dpm_dcefclk to readonly on smu_v11 gpus
** TEST 2 shell output
  bash: pp_dpm_dcefclk: Permission denied
   ---set 1---
  0: 506Mhz *
  1: 886Mhz 
  2: 1266Mhz 
  bash: pp_dpm_dcefclk: Permission denied
   ---set 2---
  0: 506Mhz *
  1: 886Mhz 
  2: 1266Mhz 
  bash: pp_dpm_dcefclk: Permission denied
   ---set 0---
  0: 506Mhz *
  1: 886Mhz 
  2: 1266Mhz 

Darren Powell (3):
  amdgpu/pm: add extra info to SMU msg pre-check failed message
  amdgpu/pm: Prevent force of DCEFCLK on NAVI10 and SIENNA_CICHLID
  amdgpu/pm: set pp_dpm_dcefclk to readonly on NAVI10 and newer gpus

 drivers/gpu/drm/amd/pm/amdgpu_pm.c                      | 8 ++++++++
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c         | 5 ++++-
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 4 +++-
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c                  | 4 ++--
 4 files changed, 17 insertions(+), 4 deletions(-)


base-commit: b54280b32ebb9381e045e645eabd99dbbe607ec2
-- 
2.25.1


More information about the amd-gfx mailing list