[PATCH v2 1/2] drm/amdgpu: enhance amdgpu_vcn_suspend

Mon May 17 17:43:02 UTC 2021

On 2021-05-17 12:54 p.m., James Zhu wrote:
> I am wondering if there are still some jobs kept in the queue, it is 
> lucky to check

Yes it's possible, in this case delayed handler is set, so cancelling 
once is enough.

>
> UVD_POWER_STATUS done, but after, fw start a new job that list in the 
> queue.
>
> To handle this situation perfectly, we need add mechanism to suspend 
> fw first.

I think that should be handled by the sequence from 
vcn_v3_0_stop_dpg_mode().

>
> Another case, if it is unlucky, that  vcn fw hung at that time, 
> UVD_POWER_STATUS
>
> always keeps busy.   then it needs force powering gate the vcn hw 
> after certain time waiting.

Yep, we still need to gate VCN power after certain timeout.

Regards,

Leo

>
> Best Regards!
>
> James
>
> On 2021-05-17 12:34 p.m., Leo Liu wrote:
>>
>> On 2021-05-17 11:52 a.m., James Zhu wrote:
>>> During vcn suspends, stop ring continue to receive new requests,
>>> and try to wait for all vcn jobs to finish gracefully.
>>>
>>> v2: Forced powering gate vcn hardware after few wainting retry.
>>>
>>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 22 +++++++++++++++++++++-
>>>   1 file changed, 21 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> index 2016459..9f3a6e7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> @@ -275,9 +275,29 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>   {
>>>       unsigned size;
>>>       void *ptr;
>>> +    int retry_max = 6;
>>>       int i;
>>>   -    cancel_delayed_work_sync(&adev->vcn.idle_work);
>>> +    for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>> +        if (adev->vcn.harvest_config & (1 << i))
>>> +            continue;
>>> +        ring = &adev->vcn.inst[i].ring_dec;
>>> +        ring->sched.ready = false;
>>> +
>>> +        for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
>>> +            ring = &adev->vcn.inst[i].ring_enc[j];
>>> +            ring->sched.ready = false;
>>> +        }
>>> +    }
>>> +
>>> +    while (retry_max-- && 
>>> cancel_delayed_work_sync(&adev->vcn.idle_work))
>>> +        mdelay(5);
>>
>> I think it's possible to have one pending job unprocessed with VCN 
>> when suspend sequence getting here, but it shouldn't be more than 
>> one, cancel_delayed_work_sync probably return false after the first 
>> time, so calling cancel_delayed_work_sync once should be enough here. 
>> we probably need to wait longer from:
>>
>> SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1,
>>         UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);
>>
>> to make sure the unprocessed job get done.
>>
>>
>> Regards,
>>
>> Leo
>>
>>
>>> +    if (!retry_max && !amdgpu_sriov_vf(adev)) {
>>> +        if (RREG32_SOC15(VCN, i, mmUVD_STATUS)) {
>>> +            dev_warn(adev->dev, "Forced powering gate vcn hardware!");
>>> +            vcn_v3_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
>>> +        }
>>> +    }
>>>         for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>           if (adev->vcn.harvest_config & (1 << i))