[PATCH] drm/amdgpu: Always call kfd post reset after reset
Felix Kuehling
felix.kuehling at amd.com
Tue Apr 24 18:23:19 UTC 2018
On 2018-04-24 05:48 AM, Oded Gabbay wrote:
> On Wed, Apr 11, 2018 at 11:19 PM, Felix Kuehling <felix.kuehling at amd.com> wrote:
>> On 2018-04-11 03:47 PM, Shaoyun Liu wrote:
>>> Even if the reset failed, kfd post reset needs to be called to keep the
>>> lock balanced on the kfd side
>>>
>>> Change-Id: I8b6ef29d7527915611be0b96a9cd039bc75bb0a9
>>> Signed-off-by: Shaoyun Liu <Shaoyun.Liu at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++----
>>> 1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 78b7d39..90a37ed 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -3231,12 +3231,11 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>> /* bad news, how to tell it to userspace ? */
>>> dev_info(adev->dev, "GPU reset(%d) failed\n", atomic_read(&adev->gpu_reset_counter));
>>> amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_GPU_RESET_FAIL, 0, r);
>>> - } else {
>>> + } else
>>> dev_info(adev->dev, "GPU reset(%d) successed!\n",atomic_read(&adev->gpu_reset_counter));
>>> - /*unlock kfd after a successfully recovery*/
>>> - amdgpu_amdkfd_post_reset(adev);
>>> - }
>> Please leave the braces {...}. It's better style to make all branches of
>> the same if-else-if-...-else use the same braces (or no-braces). With
>> that fixed, this change is Reviewed-by: Felix Kuehling
>> <Felix.Kuehling at amd.com>
>>
>>> + /*unlock kfd */
>>> + amdgpu_amdkfd_post_reset(adev);
>>> amdgpu_vf_error_trans_all(adev);
>>> adev->in_gpu_reset = 0;
>>> mutex_unlock(&adev->lock_reset);
> I didn't find a function called "amdgpu_amdkfd_post_reset" anywhere in
> the code.
> Maybe this patch is for something internal, or is it for the Vega code
> that I haven't yet taken?
Yeah, this is related to some ongoing work to support GPU
hang-detection and reset. As more of our driver goes upstream, more
changes will be reviewed here. Right now it's still hit and miss and
people aren't sure what changes to review where.
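
For reference, the lock-balance problem the quoted patch addresses can be
sketched in standalone C. This is only an illustration, not the amdgpu code:
kfd_lock_depth, kfd_pre_reset(), kfd_post_reset(), do_reset() and
gpu_recover() are hypothetical stand-ins, and it assumes that a pre-reset
hook takes the lock that the post-reset hook releases. Both branches of the
if/else keep their braces, as asked for in the review.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the KFD-side lock; not real driver state. */
static int kfd_lock_depth;

/* Assumed pre-reset hook: takes the lock before the reset starts. */
static void kfd_pre_reset(void)
{
	kfd_lock_depth++;
}

/* Assumed post-reset hook: releases the lock taken in pre-reset. */
static void kfd_post_reset(void)
{
	/* An unlock without a matching lock would be a balance bug. */
	assert(kfd_lock_depth > 0);
	kfd_lock_depth--;
}

/* Hypothetical reset routine: 0 on success, nonzero on failure. */
static int do_reset(int simulate_failure)
{
	return simulate_failure ? -1 : 0;
}

static int gpu_recover(int simulate_failure)
{
	int r;

	kfd_pre_reset();
	r = do_reset(simulate_failure);
	if (r) {
		printf("GPU reset failed (%d)\n", r);
	} else {
		printf("GPU reset succeeded\n");
	}
	/* Unlock on both paths so a failed reset does not leave the lock held. */
	kfd_post_reset();
	return r;
}
```

With the unlock kept inside the else branch, a failing gpu_recover() would
return with kfd_lock_depth still at 1. With the unconditional call it is back
at 0 after either outcome.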
Regards,
Felix
>
> Oded