[PATCH] drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gate

James Zhu jamesz at amd.com
Tue May 18 17:04:54 UTC 2021


On 2021-05-18 12:36 p.m., Christian König wrote:
> Am 18.05.21 um 17:59 schrieb James Zhu:
>>
>> On 2021-05-18 11:54 a.m., Christian König wrote:
>>>
>>>
>>> Am 18.05.21 um 17:45 schrieb James Zhu:
>>>>
>>>> On 2021-05-18 11:23 a.m., Christian König wrote:
>>>>> Am 18.05.21 um 17:11 schrieb James Zhu:
>>>>>> Add cancel_delayed_work_sync before set power gating state
>>>>>> to avoid race condition issue when power gating.
>>>>>>
>>>>>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 19 ++++++++++++++++++-
>>>>>>   1 file changed, 18 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>> index 0c1beef..6c5c083 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>> @@ -230,10 +230,27 @@ static int vcn_v1_0_hw_init(void *handle)
>>>>>>   static int vcn_v1_0_hw_fini(void *handle)
>>>>>>   {
>>>>>>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>>> +    struct amdgpu_ring *ring;
>>>>>> +    int i;
>>>>>> +
>>>>>> +    ring = &adev->vcn.inst->ring_dec;
>>>>>> +    ring->sched.ready = false;
>>>>>> +
>>>>>> +    for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
>>>>>> +        ring = &adev->vcn.inst->ring_enc[i];
>>>>>> +        ring->sched.ready = false;
>>>>>> +    }
>>>>>> +
>>>>>> +    ring = &adev->jpeg.inst->ring_dec;
>>>>>> +    ring->sched.ready = false;
>>>>>
>>>>> Thinking more about this, it is a really big NAK. The scheduler 
>>>>> threads must stay ready during a reset.
>>>>>
>>>>> This is controlled by the upper layer and shouldn't be messed with 
>>>>> in the hardware specific backend at all.
>>>>
>>>> [JZ] I ported this from the current vcn3 hw_fini. I just want to make 
>>>> sure that no new jobs will be scheduled after the suspend process 
>>>> starts.
>>>> It may be redundant, since the scheduler may already be suspended. I can 
>>>> remove those if you are sure there is no side effect.
>>>
>>> Well, we *must* remove those. This flag controls whether the hardware 
>>> engine can be used for command submission and is only set to 
>>> true/false during initial driver load.
>>>
>>> If you change it to false during hw_fini the engine won't work 
>>> correctly any more after GPU reset or resume.
>> [JZ] If I recall correctly, hw_init will be called every time 
>> after GPU reset or suspend/resume.
>
> Yes that's correct.
>
> But before that and during GPU reset the ready flag is then false for 
> a short period of time which would result in userspace applications 
> crashing when they try to submit something.
[JZ] Applications should handle a failed submission without crashing. 
Maybe the driver should return -EAGAIN to ask the application to submit 
the job later when the GPU is under reset or suspend/resume.
>
> The flag essentially says that userspace can submit jobs to the 
> scheduler. Processing of those jobs is of course only started after 
> the hardware is re-initialized, but pushing jobs down the pipe is 
> still perfectly valid in that situation.
[JZ] I am wondering whether it is required to stop scheduling new jobs 
before the bo is saved.
>
> Christian.
>
>>>
>>> If you have any idea how to document that fact then please speak up, 
>>> cause we had this problem a couple of times now.
>>>
>>> Just send out a patch fixing various other occasions of that.
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>>> I've removed all of those a couple of years ago.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> +
>>>>>> +    cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>>>>         if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) ||
>>>>>> -        RREG32_SOC15(VCN, 0, mmUVD_STATUS))
>>>>>> +        (adev->vcn.cur_state != AMD_PG_STATE_GATE &&
>>>>>> +         RREG32_SOC15(VCN, 0, mmUVD_STATUS))) {
>>>>>>           vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
>>>>>> +    }
>>>>>>         return 0;
>>>>>>   }
>>>>>
>>>
>
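
The point made in the thread about the ready flag (submission stays valid
while the hardware is down, and processing only resumes after hw_init) can
be sketched as a small userspace model. This is only an illustration under
stated assumptions: struct model_ring and every model_* function are
hypothetical names invented for this sketch, not amdgpu or DRM scheduler
code, and the real scheduler is considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_DEPTH 8

/* Hypothetical userspace model; these are not the amdgpu structures. */
struct model_ring {
    bool sched_ready;              /* set once at "driver load" */
    bool hw_up;                    /* false between hw_fini and hw_init */
    int queue[MODEL_QUEUE_DEPTH];  /* pending job ids, FIFO */
    size_t head;
    size_t count;
    int processed;                 /* number of jobs executed so far */
};

/* Models initial driver load: the only place sched_ready is written. */
static void model_load(struct model_ring *r)
{
    r->sched_ready = true;
    r->hw_up = true;
    r->head = 0;
    r->count = 0;
    r->processed = 0;
}

/* Drain the queue, but only while the hardware is initialized. */
static void model_run(struct model_ring *r)
{
    while (r->hw_up && r->count > 0) {
        r->head = (r->head + 1) % MODEL_QUEUE_DEPTH;
        r->count--;
        r->processed++;
    }
}

/* Returns 0 if the job was queued, -1 if the ring is not ready or is full. */
static int model_submit(struct model_ring *r, int job)
{
    if (!r->sched_ready || r->count == MODEL_QUEUE_DEPTH)
        return -1;
    r->queue[(r->head + r->count) % MODEL_QUEUE_DEPTH] = job;
    r->count++;
    model_run(r);
    return 0;
}

/* Models hw_fini: stops the hardware but leaves sched_ready untouched. */
static void model_hw_fini(struct model_ring *r)
{
    r->hw_up = false;
}

/* Models hw_init after reset/resume: restarts processing of queued jobs. */
static void model_hw_init(struct model_ring *r)
{
    r->hw_up = true;
    model_run(r);
}
```

In this model, a job submitted between model_hw_fini() and model_hw_init()
is accepted and simply waits in the queue. If model_hw_fini() cleared
sched_ready instead, the same submission would be rejected, which is the
userspace-visible failure described above.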

