[PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.
Christian König
ckoenig.leichtzumerken at gmail.com
Mon Apr 29 19:03:52 UTC 2019
I would clean them up further, but that's only moving code around, so
feel free to add my rb to those.
Christian.
On 29.04.19 at 16:14, Grodzovsky, Andrey wrote:
>
> Thanks, David. With that, only patches 5 and 6 of the series are left
> to be reviewed.
>
> Christian, any more comments on those patches?
>
> Andrey
>
> On 4/27/19 10:56 PM, Zhou, David(ChunMing) wrote:
>>
>> Sorry, I can only put my Acked-by: Chunming Zhou
>> <david1.zhou at amd.com> on patch #3.
>>
>> I cannot fully judge patches #4, #5, and #6.
>>
>> -David
>>
>> *From:*amd-gfx <amd-gfx-bounces at lists.freedesktop.org> *On Behalf Of
>> *Grodzovsky, Andrey
>> *Sent:* Friday, April 26, 2019 10:09 PM
>> *To:* Koenig, Christian <Christian.Koenig at amd.com>; Zhou,
>> David(ChunMing) <David1.Zhou at amd.com>;
>> dri-devel at lists.freedesktop.org; amd-gfx at lists.freedesktop.org;
>> eric at anholt.net; etnaviv at lists.freedesktop.org
>> *Cc:* Kazlauskas, Nicholas <Nicholas.Kazlauskas at amd.com>; Liu, Monk
>> <Monk.Liu at amd.com>
>> *Subject:* Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty
>> job already signaled.
>>
>> Ping (mostly David and Monk).
>>
>> Andrey
>>
>> On 4/24/19 3:09 AM, Christian König wrote:
>>
>> On 24.04.19 at 05:02, Zhou, David(ChunMing) wrote:
>>
>> >> -        drm_sched_stop(&ring->sched, &job->base);
>> >> -
>> >>          /* after all hw jobs are reset, hw fence is meaningless, so force_completion */
>> >>          amdgpu_fence_driver_force_completion(ring);
>> >>      }
>>
>> The HW fences are already force-completed, so I think we can just
>> disable IRQ fence processing and ignore HW fence signals while we
>> are trying to do the GPU reset. Otherwise the logic will become
>> much more complex.
>>
>> If this situation happens because of long execution times, we can
>> increase the timeout used for reset detection.
>>
>>
>> You are not thinking widely enough: forcing the HW fence to
>> complete can trigger other parts of the system to start new activity.
>>
>> We first need to stop everything and make sure that we don't do
>> any processing any more, and only then start with our reset procedure,
>> including forcing all HW fences to complete.
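>>
>> A minimal sketch of that ordering, assuming it runs in the timeout
>> handler with adev and job in scope (it mirrors the loops in the patch
>> below and is illustrative only, not the final implementation):
>>
>>     int i;
>>
>>     /* 1) Quiesce: park every scheduler so nothing new is processed. */
>>     for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>         struct amdgpu_ring *ring = adev->rings[i];
>>
>>         if (!ring || !ring->sched.thread)
>>             continue;
>>
>>         drm_sched_stop(&ring->sched, &job->base);
>>     }
>>
>>     /* 2) Only afterwards force-complete the HW fences as part of the
>>      * actual reset procedure. */
>>     for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>         struct amdgpu_ring *ring = adev->rings[i];
>>
>>         if (!ring || !ring->sched.thread)
>>             continue;
>>
>>         amdgpu_fence_driver_force_completion(ring);
>>     }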
>>
>> Christian.
>>
>>
>> -David
>>
>> *From:*amd-gfx <amd-gfx-bounces at lists.freedesktop.org> *On Behalf Of
>> *Grodzovsky, Andrey
>> *Sent:* Wednesday, April 24, 2019 12:00 AM
>> *To:* Zhou, David(ChunMing) <David1.Zhou at amd.com>;
>> dri-devel at lists.freedesktop.org; amd-gfx at lists.freedesktop.org;
>> eric at anholt.net; etnaviv at lists.freedesktop.org;
>> ckoenig.leichtzumerken at gmail.com
>> *Cc:* Kazlauskas, Nicholas <Nicholas.Kazlauskas at amd.com>; Liu, Monk
>> <Monk.Liu at amd.com>
>> *Subject:* Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if
>> guilty job already signaled.
>>
>> No, I mean the actual HW fence, which signals when the job
>> finishes execution on the HW.
>>
>> Andrey
>>
>> On 4/23/19 11:19 AM, Zhou, David(ChunMing) wrote:
>>
>> Do you mean the fence timer? Why not stop it as well when
>> stopping the scheduler for the HW reset?
>>
>> -------- Original Message --------
>> Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if
>> guilty job already signaled.
>> From: "Grodzovsky, Andrey"
>> To: "Zhou, David(ChunMing)"
>> ,dri-devel at lists.freedesktop.org,amd-gfx at lists.freedesktop.org,eric at anholt.net,etnaviv at lists.freedesktop.org,ckoenig.leichtzumerken at gmail.com
>> <mailto:dri-devel at lists.freedesktop.org,amd-gfx at lists.freedesktop.org,eric at anholt.net,etnaviv at lists.freedesktop.org,ckoenig.leichtzumerken at gmail.com>
>> CC: "Kazlauskas, Nicholas" ,"Liu, Monk"
>>
>>
>> On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote:
>> > +Monk.
>> >
>> > GPU reset is used widely in SRIOV, so a virtualization guy needs to
>> > take a look.
>> >
>> > But out of curiosity, why can the guilty job still signal if it is
>> > already set to guilty? Was it set wrongly?
>> >
>> >
>> > -David
>>
>>
>> It's possible that the job completes at a later time than when its
>> timeout handler started processing, so in this patch we try to protect
>> against this by rechecking the HW fence after stopping all SW
>> schedulers. We do it BEFORE marking guilty on the job's sched_entity,
>> so at the point we check, the guilty flag is not set yet.
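>>
>> A minimal sketch of that check, assuming all rings were already parked
>> with drm_sched_stop() and a local bool job_signaled initialized to false
>> (the fence test itself matches the hunk below; the rest is illustrative
>> only):
>>
>>     /* All SW schedulers are stopped; recheck whether the HW actually
>>      * finished the job in the meantime. */
>>     if (job && job->base.s_fence->parent &&
>>         dma_fence_is_signaled(job->base.s_fence->parent))
>>         job_signaled = true;
>>
>>     if (job_signaled)
>>         goto skip_hw_reset;  /* no karma increase, no forced completion,
>>                               * no HW reset */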
>>
>> Andrey
>>
>>
>> >
>> > On 2019/4/18 23:00, Andrey Grodzovsky wrote:
>> >> Also reject TDRs if another one is already running.
>> >>
>> >> v2:
>> >> Stop all schedulers across device and entire XGMI hive before
>> >> force signaling HW fences.
>> >> Avoid passing job_signaled to helper functions to keep all the decision
>> >> making about skipping HW reset in one place.
>> >>
>> >> v3:
>> >> Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be
>> >> balanced against its decrement in drm_sched_stop in the non HW reset case.
>> >> v4: rebase
>> >> v5: Revert v3 as we do it now in scheduler code.
>> >>
>> >> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>> >> ---
>> >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 143 +++++++++++++++++++----------
>> >> 1 file changed, 95 insertions(+), 48 deletions(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> >> index a0e165c..85f8792 100644
>> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> >> @@ -3334,8 +3334,6 @@ static int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
>> >>          if (!ring || !ring->sched.thread)
>> >>              continue;
>> >>
>> >> -        drm_sched_stop(&ring->sched, &job->base);
>> >> -
>> >>          /* after all hw jobs are reset, hw fence is meaningless, so force_completion */
>> >>          amdgpu_fence_driver_force_completion(ring);
>> >>      }
>> >> @@ -3343,6 +3341,7 @@ static int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
>> >>      if(job)
>> >>          drm_sched_increase_karma(&job->base);
>> >>
>> >> +    /* Don't suspend on bare metal if we are not going to HW reset the ASIC */
>> >>      if (!amdgpu_sriov_vf(adev)) {
>> >>
>> >>          if (!need_full_reset)
>> >> @@ -3480,37 +3479,21 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,
>> >>      return r;
>> >>  }
>> >>
>> >> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev)
>> >> +static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, bool trylock)
>> >>  {
>> >> -    int i;
>> >> -
>> >> -    for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>> >> -        struct amdgpu_ring *ring = adev->rings[i];
>> >> -
>> >> -        if (!ring || !ring->sched.thread)
>> >> -            continue;
>> >> -
>> >> -        if (!adev->asic_reset_res)
>> >> -            drm_sched_resubmit_jobs(&ring->sched);
>> >> +    if (trylock) {
>> >> +        if (!mutex_trylock(&adev->lock_reset))
>> >> +            return false;
>> >> +    } else
>> >> +        mutex_lock(&adev->lock_reset);
>> >>
>> >> -        drm_sched_start(&ring->sched, !adev->asic_reset_res);
>> >> -    }
>> >> -
>> >> -    if (!amdgpu_device_has_dc_support(adev)) {
>> >> -        drm_helper_resume_force_mode(adev->ddev);
>> >> -    }
>> >> -
>> >> -    adev->asic_reset_res = 0;
>> >> -}
>> >> -
>> >> -static void amdgpu_device_lock_adev(struct amdgpu_device *adev)
>> >> -{
>> >> -    mutex_lock(&adev->lock_reset);
>> >>      atomic_inc(&adev->gpu_reset_counter);
>> >>      adev->in_gpu_reset = 1;
>> >>      /* Block kfd: SRIOV would do it separately */
>> >>      if (!amdgpu_sriov_vf(adev))
>> >>          amdgpu_amdkfd_pre_reset(adev);
>> >> +
>> >> +    return true;
>> >>  }
>> >>
>> >>  static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
>> >> @@ -3538,40 +3521,42 @@ static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
>> >>  int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>> >>                    struct amdgpu_job *job)
>> >>  {
>> >> -    int r;
>> >> +    struct list_head device_list, *device_list_handle = NULL;
>> >> +    bool need_full_reset, job_signaled;
>> >>      struct amdgpu_hive_info *hive = NULL;
>> >> -    bool need_full_reset = false;
>> >>      struct amdgpu_device *tmp_adev = NULL;
>> >> -    struct list_head device_list, *device_list_handle = NULL;
>> >> +    int i, r = 0;
>> >>
>> >> +    need_full_reset = job_signaled = false;
>> >>      INIT_LIST_HEAD(&device_list);
>> >>
>> >>      dev_info(adev->dev, "GPU reset begin!\n");
>> >>
>> >> +    hive = amdgpu_get_xgmi_hive(adev, false);
>> >> +
>> >>      /*
>> >> -     * In case of XGMI hive disallow concurrent resets to be triggered
>> >> -     * by different nodes. No point also since the one node already executing
>> >> -     * reset will also reset all the other nodes in the hive.
>> >> +     * Here we trylock to avoid chain of resets executing from
>> >> +     * either trigger by jobs on different adevs in XGMI hive or jobs on
>> >> +     * different schedulers for same device while this TO handler is running.
>> >> +     * We always reset all schedulers for device and all devices for XGMI
>> >> +     * hive so that should take care of them too.
>> >>       */
>> >> -    hive = amdgpu_get_xgmi_hive(adev, 0);
>> >> -    if (hive && adev->gmc.xgmi.num_physical_nodes > 1 &&
>> >> -        !mutex_trylock(&hive->reset_lock))
>> >> +
>> >> +    if (hive && !mutex_trylock(&hive->reset_lock)) {
>> >> +        DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
>> >> +            job->base.id, hive->hive_id);
>> >>          return 0;
>> >> +    }
>> >>
>> >>      /* Start with adev pre asic reset first for soft reset check.*/
>> >> -    amdgpu_device_lock_adev(adev);
>> >> -    r = amdgpu_device_pre_asic_reset(adev,
>> >> -                     job,
>> >> -                     &need_full_reset);
>> >> -    if (r) {
>> >> -        /*TODO Should we stop ?*/
>> >> -        DRM_ERROR("GPU pre asic reset failed with err, %d for drm dev, %s ",
>> >> -            r, adev->ddev->unique);
>> >> -        adev->asic_reset_res = r;
>> >> +    if (!amdgpu_device_lock_adev(adev, !hive)) {
>> >> +        DRM_INFO("Bailing on TDR for s_job:%llx, as another already in progress",
>> >> +            job->base.id);
>> >> +        return 0;
>> >>      }
>> >>
>> >>      /* Build list of devices to reset */
>> >> -    if (need_full_reset && adev->gmc.xgmi.num_physical_nodes > 1) {
>> >> +    if (adev->gmc.xgmi.num_physical_nodes > 1) {
>> >>          if (!hive) {
>> >>              amdgpu_device_unlock_adev(adev);
>> >>              return -ENODEV;
>> >> @@ -3588,13 +3573,56 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>> >>          device_list_handle = &device_list;
>> >>      }
>> >>
>> >> +    /* block all schedulers and reset given job's ring */
>> >> +    list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
>> >> +        for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>> >> +            struct amdgpu_ring *ring = tmp_adev->rings[i];
>> >> +
>> >> +            if (!ring || !ring->sched.thread)
>> >> +                continue;
>> >> +
>> >> +            drm_sched_stop(&ring->sched, &job->base);
>> >> +        }
>> >> +    }
>> >> +
>> >> +
>> >> +    /*
>> >> +     * Must check guilty signal here since after this point all old
>> >> +     * HW fences are force signaled.
>> >> +     *
>> >> +     * job->base holds a reference to parent fence
>> >> +     */
>> >> +    if (job && job->base.s_fence->parent &&
>> >> +        dma_fence_is_signaled(job->base.s_fence->parent))
>> >> +        job_signaled = true;
>> >> +
>> >> +    if (!amdgpu_device_ip_need_full_reset(adev))
>> >> +        device_list_handle = &device_list;
>> >> +
>> >> +    if (job_signaled) {
>> >> +        dev_info(adev->dev, "Guilty job already signaled, skipping HW reset");
>> >> +        goto skip_hw_reset;
>> >> +    }
>> >> +
>> >> +
>> >> +    /* Guilty job will be freed after this*/
>> >> +    r = amdgpu_device_pre_asic_reset(adev,
>> >> +                     job,
>> >> +                     &need_full_reset);
>> >> +    if (r) {
>> >> +        /*TODO Should we stop ?*/
>> >> +        DRM_ERROR("GPU pre asic reset failed with err, %d for drm dev, %s ",
>> >> +            r, adev->ddev->unique);
>> >> +        adev->asic_reset_res = r;
>> >> +    }
>> >> +
>> >> retry:    /* Rest of adevs pre asic reset from XGMI hive. */
>> >>      list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
>> >>
>> >>          if (tmp_adev == adev)
>> >>              continue;
>> >>
>> >> -        amdgpu_device_lock_adev(tmp_adev);
>> >> +        amdgpu_device_lock_adev(tmp_adev, false);
>> >>          r = amdgpu_device_pre_asic_reset(tmp_adev,
>> >>                           NULL,
>> >>                           &need_full_reset);
>> >> @@ -3618,9 +3646,28 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>> >>          goto retry;
>> >>      }
>> >>
>> >> +skip_hw_reset:
>> >> +
>> >>      /* Post ASIC reset for all devs .*/
>> >>      list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
>> >> -        amdgpu_device_post_asic_reset(tmp_adev);
>> >> +        for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>> >> +            struct amdgpu_ring *ring = tmp_adev->rings[i];
>> >> +
>> >> +            if (!ring || !ring->sched.thread)
>> >> +                continue;
>> >> +
>> >> +            /* No point to resubmit jobs if we didn't HW reset*/
>> >> +            if (!tmp_adev->asic_reset_res && !job_signaled)
>> >> +                drm_sched_resubmit_jobs(&ring->sched);
>> >> +
>> >> +            drm_sched_start(&ring->sched, !tmp_adev->asic_reset_res);
>> >> +        }
>> >> +
>> >> +        if (!amdgpu_device_has_dc_support(tmp_adev) && !job_signaled) {
>> >> +            drm_helper_resume_force_mode(tmp_adev->ddev);
>> >> +        }
>> >> +
>> >> +        tmp_adev->asic_reset_res = 0;
>> >>
>> >>          if (r) {
>> >>              /* bad news, how to tell it to userspace ? */
>> >> @@ -3633,7 +3680,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>> >>          amdgpu_device_unlock_adev(tmp_adev);
>> >>      }
>> >>
>> >> -    if (hive && adev->gmc.xgmi.num_physical_nodes > 1)
>> >> +    if (hive)
>> >>          mutex_unlock(&hive->reset_lock);
>> >>
>> >>      if (r)
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel