[PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled

James Zhu jamesz at amd.com
Mon Dec 13 13:39:02 UTC 2021


Hi Curry, Evan,

It seems vcn1.0 power gate sequence are special.

if the original solution is thread safe, then the following solution 
will be thread safe also.

static int vcn_v1_0_hw_fini(void *handle)
{
     struct amdgpu_device *adev = (struct amdgpu_device *)handle;

     cancel_delayed_work_sync(&adev->vcn.idle_work);

     if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) ||
         (adev->vcn.cur_state != AMD_PG_STATE_GATE &&
          RREG32_SOC15(VCN, 0, mmUVD_STATUS))) {
+        if (adev->pm.dpm_enabled)
+            amdgpu_dpm_enable_uvd(adev, false);
+        else
+            vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
     }

Best Regards!

James

On 2021-12-13 3:55 a.m., Gong, Curry wrote:
>
> [AMD Official Use Only]
>
>
> Hi James:
>
> With the following patch, an error will be reported when the driver is 
> loaded
>
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>
> @@ -1202,6 +1204,9 @@ static int vcn_v1_0_stop(struct amdgpu_device *adev)
>
> else
>
> r = vcn_v1_0_stop_spg_mode(adev);
>
> c
>
> return r;
>
> }
>
> $ dmesg
>
> [ 363.181081] INFO: task kworker/3:2:223 blocked for more than 120 
> seconds.
>
> [ 363.181150]       Tainted: G           OE 5.11.0-41-generic 
> #45~20.04.1-Ubuntu
>
> [ 363.181208] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message.
>
> [ 363.181266] task:kworker/3:2     state:D stack:    0 pid: 223 
> ppid:     2 flags:0x00004000
>
> [ 363.181276] Workqueue: events vcn_v1_0_idle_work_handler [amdgpu]
>
> [ 363.181612] Call Trace:
>
> [ 363.181618]  __schedule+0x44c/0x8a0
>
> [ 363.181627]  schedule+0x4f/0xc0
>
> [ 363.181631]  schedule_preempt_disabled+0xe/0x10
>
> [ 363.181636]  __mutex_lock.isra.0+0x183/0x4d0
>
> [ 363.181643]  __mutex_lock_slowpath+0x13/0x20
>
> [ 363.181648]  mutex_lock+0x32/0x40
>
> [ 363.181652]  amdgpu_dpm_set_powergating_by_smu+0x9c/0x180 [amdgpu]
>
> [ 363.182055]  amdgpu_dpm_enable_uvd+0x38/0x110 [amdgpu]
>
> [ 363.182454]  vcn_v1_0_set_powergating_state+0x2e7e/0x3cf0 [amdgpu]
>
> [ 363.182776] amdgpu_device_ip_set_powergating_state+0x6c/0xc0 [amdgpu]
>
> [ 363.183028]  smu10_powergate_vcn+0x2a/0x80 [amdgpu]
>
> [ 363.183361]  pp_set_powergating_by_smu+0xc5/0x2b0 [amdgpu]
>
> [ 363.183699]  amdgpu_dpm_set_powergating_by_smu+0xb6/0x180 [amdgpu]
>
> [ 363.184040]  amdgpu_dpm_enable_uvd+0x38/0x110 [amdgpu]
>
> [ 363.184391]  vcn_v1_0_idle_work_handler+0xe1/0x130 [amdgpu]
>
> [ 363.184667]  process_one_work+0x220/0x3c0
>
> [ 363.184674]  worker_thread+0x4d/0x3f0
>
> [ 363.184677]  ? process_one_work+0x3c0/0x3c0
>
> [ 363.184680]  kthread+0x12b/0x150
>
> [ 363.184685]  ? set_kthread_struct+0x40/0x40
>
> [ 363.184690]  ret_from_fork+0x22/0x30
>
> [ 363.184699] INFO: task kworker/2:2:233 blocked for more than 120 
> seconds.
>
> [ 363.184739]       Tainted: G           OE 5.11.0-41-generic 
> #45~20.04.1-Ubuntu
>
> [ 363.184782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message.
>
> [ 363.184825] task:kworker/2:2     state:D stack:    0 pid: 233 
> ppid:     2 flags:0x00004000
>
> [ 363.184831] Workqueue: events 
> amdgpu_device_delayed_init_work_handler [amdgpu]
>
> [ 363.185085] Call Trace:
>
> [ 363.185087]  __schedule+0x44c/0x8a0
>
> [ 363.185092]  schedule+0x4f/0xc0
>
> [ 363.185095]  schedule_timeout+0x202/0x290
>
> [ 363.185099]  ? sched_clock_cpu+0x11/0xb0
>
> [ 363.185105]  wait_for_completion+0x94/0x100
>
> [ 363.185110]  __flush_work+0x12a/0x1e0
>
> [ 363.185113]  ? worker_detach_from_pool+0xc0/0xc0
>
> [ 363.185119]  __cancel_work_timer+0x10e/0x190
>
> [ 363.185123]  cancel_delayed_work_sync+0x13/0x20
>
> [ 363.185126]  vcn_v1_0_ring_begin_use+0x20/0x70 [amdgpu]
>
> [ 363.185401]  amdgpu_ring_alloc+0x48/0x60 [amdgpu]
>
> [ 363.185640]  amdgpu_ib_schedule+0x493/0x600 [amdgpu]
>
> [ 363.185884]  amdgpu_job_submit_direct+0x3c/0xd0 [amdgpu]
>
> [ 363.186186]  amdgpu_vcn_dec_send_msg+0x105/0x210 [amdgpu]
>
> [ 363.186460]  amdgpu_vcn_dec_ring_test_ib+0x69/0x110 [amdgpu]
>
> [ 363.186734]  amdgpu_ib_ring_tests+0xf5/0x160 [amdgpu]
>
> [ 363.186978] amdgpu_device_delayed_init_work_handler+0x15/0x30 [amdgpu]
>
> [ 363.187206]  process_one_work+0x220/0x3c0
>
> [ 363.187210]  worker_thread+0x4d/0x3f0
>
> [ 363.187214]  ? process_one_work+0x3c0/0x3c0
>
> [ 363.187217]  kthread+0x12b/0x150
>
> [ 363.187221]  ? set_kthread_struct+0x40/0x40
>
> [ 363.187226]  ret_from_fork+0x22/0x30
>
> BR
>
> Curry Gong
>
> *From:* Zhu, James <James.Zhu at amd.com>
> *Sent:* Saturday, December 11, 2021 5:07 AM
> *To:* Gong, Curry <Curry.Gong at amd.com>; amd-gfx at lists.freedesktop.org
> *Cc:* Liu, Leo <Leo.Liu at amd.com>; Zhu, James <James.Zhu at amd.com>; 
> Quan, Evan <Evan.Quan at amd.com>; Deucher, Alexander 
> <Alexander.Deucher at amd.com>
> *Subject:* Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is 
> suspended, powergating is explicitly enabled
>
> On 2021-12-10 6:41 a.m., chen gong wrote:
>
>     Play a video on the raven (or PCO, raven2) platform, and then do the S3
>
>     test. When resume, the following error will be reported:
>
>     amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring
>
>     vcn_dec test failed (-110)
>
>     [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block
>
>     <vcn_v1_0> failed -110
>
>     amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
>
>     PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
>
>     [why]
>
>     When playing the video: The power state flag of the vcn block is set to
>
>     POWER_STATE_ON.
>
>     When doing suspend: There is no change to the power state flag of the
>
>     vcn block, it is still POWER_STATE_ON.
>
>     When doing resume: Need to open the power gate of the vcn block and set
>
>     the power state flag of the VCN block to POWER_STATE_ON.
>
>     But at this time, the power state flag of the vcn block is already
>
>     POWER_STATE_ON. The power status flag check in the "8f2cdef drm/amd/pm:
>
>     avoid duplicate powergate/ungate setting" patch will return the
>
>     amdgpu_dpm_set_powergating_by_smu function directly.
>
>     As a result, the gate of the power was not opened, causing the
>
>     subsequent ring test to fail.
>
>     [how]
>
>     In the suspend function of the vcn block, explicitly change the power
>
>     state flag of the vcn block to POWER_STATE_OFF.
>
>     Signed-off-by: chen gong<curry.gong at amd.com>  <mailto:curry.gong at amd.com>
>
>     ---
>
>       drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 7 +++++++
>
>       1 file changed, 7 insertions(+)
>
>     diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>
>     index d54d720..d73676b 100644
>
>     --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>
>     +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>
>     @@ -246,6 +246,13 @@ static int vcn_v1_0_suspend(void *handle)
>
>       {
>
>        int r;
>
>        struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
>     + bool cancel_success;
>
>     +
>
>     + cancel_success = cancel_delayed_work_sync(&adev->vcn.idle_work);
>
> [JZ] Can you refer to vcn_v3_0_stop , and add 
> amdgpu_dpm_enable_uvd(adev, false); to the end of vcn_v1_0_stop?
>
> See if it also can help.
>
>     + if (cancel_success) {
>
>     +        if (adev->pm.dpm_enabled)
>
>     +                amdgpu_dpm_enable_uvd(adev, false);
>
>     + }
>
>       
>
>        r = vcn_v1_0_hw_fini(adev);
>
>        if (r)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20211213/3fd040fe/attachment-0001.htm>


More information about the amd-gfx mailing list