[PATCH 3/4] drm/amdgpu: Avoid HW reset if guilty job already signaled.

Mon Apr 15 15:15:49 UTC 2019

On 4/15/19 2:46 AM, Koenig, Christian wrote:

I agree this would be good in case of amdgpu_device_pre_asic_reset
because we can totally skip this function if guilty job already
signaled, but for amdgpu_device_post_asic_reset it crates complications
because drm_sched_start is right in the middle there after
drm_sched_resubmit_jobs but before forcing back set mode in display so I
prefer to keep passing 'job_signaled' to amdgpu_device_post_asic_reset.
Alternative is to get rid of this function and bring it's body into
amdgpu_device_gpu_recover which is already pretty cluttered and
confusing.  What do you think ?

Yeah, that's what I meant with that this still looks rather unorganized.

How about splitting up amdgpu_device_post_asic_reset into more
functions? Would that work out as well?

I looked into it and seems the simplest way is just to move this function body into the main reset function since it's pretty short function and there is internal loop across all rings inside.

I resent the patches and also amended your lost display  patch 'wait for fence without holding reservation lock'.

Andrey

I think what we should do is to keep amdgpu_device_gpu_recover() the top
level control logic, e.g. which step is called on which device in which
order.

We should not push that decision into the individual steps, because that
would make things even more confusing.

Regards,
Christian.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20190415/0fb27226/attachment-0001.html>