[PATCH 5/5] drm/amdgpu: Refactor GPU reset for XGMI hive case.
Grodzovsky, Andrey
Andrey.Grodzovsky at amd.com
Thu Nov 22 19:56:50 UTC 2018
On 11/22/2018 02:03 PM, Christian König wrote:
> Am 22.11.18 um 16:44 schrieb Grodzovsky, Andrey:
>>
>> On 11/22/2018 06:16 AM, Christian König wrote:
>>> How about using a lock per hive and then acquiring that with trylock()
>>> instead?
>>>
>>> This way you should at least catch cases where multiple causes try to
>>> reset the same hive at the same time. True that there is still some
>>> racing involved, but it's at least a good start.
>> What about using a per hive work_struct which, when scheduled, will
>> execute amdgpu_device_gpu_recover? Since the work queue rejects
>> duplicates we will get per hive serialization automatically from that,
>> without needing any new mutex. This might also be necessary anyway for
>> RAS, as I believe RAS will trigger an interrupt when errors are
>> detected and might then decide to reset the GPU, so we will have to
>> switch to bottom half context anyway. And I don't think it's a problem
>> to schedule this work from the job timeout handler if needed.
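>>
>> Something like this is what I have in mind (just a sketch, the work
>> item in amdgpu_hive_info and the function name are made up):
>>
>> struct amdgpu_hive_info {
>>         /* existing members ... */
>>         struct work_struct reset_work;
>> };
>>
>> static void amdgpu_xgmi_reset_work(struct work_struct *work)
>> {
>>         struct amdgpu_hive_info *hive =
>>                 container_of(work, struct amdgpu_hive_info, reset_work);
>>         struct amdgpu_device *tmp_adev;
>>
>>         /* Bottom half (process) context, at most one instance in
>>          * flight per hive. */
>>         list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head)
>>                 amdgpu_device_gpu_recover(tmp_adev, NULL);
>> }
>>
>> /* Job timeout, RAS interrupt and sysfs handlers would then just do: */
>> schedule_work(&hive->reset_work);
>> /* schedule_work() is a no-op while the work is still pending, so
>>  * duplicate reset requests for the same hive collapse into one. */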
>
> Mhm, what is the advantage over the trylock variant? The key problem
> with that is how to get the offending job into the background worker.
>
> E.g. we have trouble guaranteeing that the job isn't destroyed when
> this is in a background worker.
Yea, actually there isn't one.
>
>>>
>>>
>>> Additionally, I would try to improve the pre, middle, post handling
>>> towards checking if we made some progress in between.
>>>
>>> In other words we stop all schedulers in the pre handling and
>>> disconnect the scheduler fences from the hardware fence like I did in
>>> patch "drm/sched: fix timeout handling v2".
>>>
>>> Then before we do the actual reset in the middle handling we check if
>>> the offending job has completed or at least made some progress in the
>>> meantime.
>> I understand how to check if the job completed - if its fence has
>> already signaled - but how do I test whether the job made 'at least
>> some progress'?
>
> Good question. Maybe we can somehow query from the hardware the number
> of primitives or pixels processed so far and then compare after a moment?
I will check on this later. In the meantime I will update the code
with the proposed per hive locking, and I will add a check for whether
the guilty job completed before the ASIC reset, skipping the reset if it
did.
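
Roughly like this (untested sketch; hive->hive_lock is the per hive
mutex I would add, name TBD):

        /* Bail if another node in the hive already started a reset. */
        if (hive && !mutex_trylock(&hive->hive_lock)) {
                DRM_INFO("Dropping reset request, another reset in progress for hive\n");
                return 0;
        }

        /* ... pre asic reset for all nodes ... */

        /* Right before the actual ASIC reset, skip the HW reset if the
         * guilty job managed to complete in the meantime. */
        if (job && dma_fence_is_signaled(&job->base.s_fence->finished)) {
                DRM_INFO("Guilty job already signaled, skipping HW reset\n");
                need_full_reset = false;
        }
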
Andrey
>
>> Another question - what's the purpose of this progress check? If I've
>> already completed the pre handling sequence I can't bail out even if
>> the guilty job is signaled by the time I do the progress check; I
>> have to complete at least the post handling too. Do you mean I can at
>> least skip the ASIC reset phase in that case?
>
> Yes exactly. We have a rather big race problem in the current reset
> logic between completing the job and resetting the hardware.
>
> In other words, when the job completes at exactly the moment we reset
> the GPU, we try to signal it twice, etc. The whole handling here
> is not really thought through.
>
> Christian.
>
>>
>> Andrey
>>
>>> In the case of a manual reset we skip that because we don't have an
>>> offending job to check.
>>>
>>> In the post handling we stitch everything together again and start the
>>> scheduler to go on with job submission.
>>>
>>> Christian.
>>>
>>> Am 21.11.18 um 23:02 schrieb Grodzovsky, Andrey:
>>>> It depends on the reason for triggering the reset on that node - how
>>>> do we know?
>>>> If the reason was a RAS error it's probably not hard to check that
>>>> all errors are cleared, but
>>>> if the reason was a job timeout on that specific node I will need to
>>>> recheck that no jobs are left in an incomplete state.
>>>> And if the reason is a manual GPU reset trigger from sysfs, then
>>>> what's the policy?
>>>> Sounds to me it's just easier to go ahead and allow all the pending
>>>> resets to proceed - unless there is
>>>> a clear and quick criterion you can check after you grab the mutex,
>>>> then sure - but I don't know what it would be.
>>>>
>>>> Andrey
>>>>
>>>> On 11/21/2018 03:49 PM, Liu, Shaoyun wrote:
>>>>> I saw you use the global xgmi_mutex to prevent concurrent resets
>>>>> from being
>>>>> triggered by different nodes, but after the mutex has been released,
>>>>> the current node may grab the mutex and continue to do another reset.
>>>>> Maybe we should check the GPU status and skip the reset in this
>>>>> case,
>>>>> since the GPU may already be in a good state.
>>>>>
>>>>> Regards
>>>>>
>>>>> shaoyun.liu
>>>>>
>>>>> On 2018-11-21 1:10 p.m., Andrey Grodzovsky wrote:
>>>>>> For the XGMI hive case, do the reset in steps where each step
>>>>>> iterates over all devs in the hive. This is especially important
>>>>>> for ASIC reset, since all PSP FW in the hive must come up within a
>>>>>> limited time (around 1 sec) to properly negotiate the link.
>>>>>> Do this by refactoring amdgpu_device_gpu_recover and
>>>>>> amdgpu_device_reset
>>>>>> into pre_asic_reset, asic_reset and post_asic_reset functions,
>>>>>> where each part is executed for all the GPUs in the hive before
>>>>>> going to the next step.
>>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>>>>>> ---
>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 5 +-
>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 375 ++++++++++++++++++++---------
>>>>>> 2 files changed, 264 insertions(+), 116 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> index 4ef5f7a..bd06d45 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> @@ -1026,6 +1026,9 @@ struct amdgpu_device {
>>>>>> unsigned long last_mm_index;
>>>>>> bool in_gpu_reset;
>>>>>> struct mutex lock_reset;
>>>>>> +
>>>>>> + int asic_reset_res;
>>>>>> + int resched;
>>>>>> };
>>>>>> static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
>>>>>> @@ -1232,7 +1235,7 @@ struct amdgpu_hive_info;
>>>>>> struct list_head *amdgpu_xgmi_get_adev_list_handle(struct amdgpu_hive_info *hive);
>>>>>> struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev);
>>>>>> -int amdgpu_xgmi_update_topology(struct amdgpu_hive_info *hive);
>>>>>> +int amdgpu_xgmi_update_topology(struct amdgpu_hive_info *hive, struct amdgpu_device *adev);
>>>>>> int amdgpu_xgmi_add_device(struct amdgpu_device *adev);
>>>>>> /*
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index cb06e68..8e94d7f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -3157,86 +3157,6 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>>>>>> return 0;
>>>>>> }
>>>>>> -/**
>>>>>> - * amdgpu_device_reset - reset ASIC/GPU for bare-metal or passthrough
>>>>>> - *
>>>>>> - * @adev: amdgpu device pointer
>>>>>> - *
>>>>>> - * attempt to do soft-reset or full-reset and reinitialize Asic
>>>>>> - * return 0 means succeeded otherwise failed
>>>>>> - */
>>>>>> -static int amdgpu_device_reset(struct amdgpu_device *adev)
>>>>>> -{
>>>>>> - bool need_full_reset, vram_lost = 0;
>>>>>> - int r;
>>>>>> -
>>>>>> - need_full_reset = amdgpu_device_ip_need_full_reset(adev);
>>>>>> -
>>>>>> - if (!need_full_reset) {
>>>>>> - amdgpu_device_ip_pre_soft_reset(adev);
>>>>>> - r = amdgpu_device_ip_soft_reset(adev);
>>>>>> - amdgpu_device_ip_post_soft_reset(adev);
>>>>>> - if (r || amdgpu_device_ip_check_soft_reset(adev)) {
>>>>>> - DRM_INFO("soft reset failed, will fallback to full
>>>>>> reset!\n");
>>>>>> - need_full_reset = true;
>>>>>> - }
>>>>>> - }
>>>>>> -
>>>>>> - if (need_full_reset) {
>>>>>> - r = amdgpu_device_ip_suspend(adev);
>>>>>> -
>>>>>> -retry:
>>>>>> - r = amdgpu_asic_reset(adev);
>>>>>> - /* post card */
>>>>>> - amdgpu_atom_asic_init(adev->mode_info.atom_context);
>>>>>> -
>>>>>> - if (!r) {
>>>>>> - dev_info(adev->dev, "GPU reset succeeded, trying to
>>>>>> resume\n");
>>>>>> - r = amdgpu_device_ip_resume_phase1(adev);
>>>>>> - if (r)
>>>>>> - goto out;
>>>>>> -
>>>>>> - vram_lost = amdgpu_device_check_vram_lost(adev);
>>>>>> - if (vram_lost) {
>>>>>> - DRM_ERROR("VRAM is lost!\n");
>>>>>> - atomic_inc(&adev->vram_lost_counter);
>>>>>> - }
>>>>>> -
>>>>>> - r = amdgpu_gtt_mgr_recover(
>>>>>> - &adev->mman.bdev.man[TTM_PL_TT]);
>>>>>> - if (r)
>>>>>> - goto out;
>>>>>> -
>>>>>> - r = amdgpu_device_fw_loading(adev);
>>>>>> - if (r)
>>>>>> - return r;
>>>>>> -
>>>>>> - r = amdgpu_device_ip_resume_phase2(adev);
>>>>>> - if (r)
>>>>>> - goto out;
>>>>>> -
>>>>>> - if (vram_lost)
>>>>>> - amdgpu_device_fill_reset_magic(adev);
>>>>>> - }
>>>>>> - }
>>>>>> -
>>>>>> -out:
>>>>>> - if (!r) {
>>>>>> - amdgpu_irq_gpu_reset_resume_helper(adev);
>>>>>> - r = amdgpu_ib_ring_tests(adev);
>>>>>> - if (r) {
>>>>>> - dev_err(adev->dev, "ib ring test failed (%d).\n", r);
>>>>>> - r = amdgpu_device_ip_suspend(adev);
>>>>>> - need_full_reset = true;
>>>>>> - goto retry;
>>>>>> - }
>>>>>> - }
>>>>>> -
>>>>>> - if (!r)
>>>>>> - r = amdgpu_device_recover_vram(adev);
>>>>>> -
>>>>>> - return r;
>>>>>> -}
>>>>>> /**
>>>>>> * amdgpu_device_reset_sriov - reset ASIC for SR-IOV vf
>>>>>> @@ -3335,31 +3255,16 @@ bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev)
>>>>>> return false;
>>>>>> }
>>>>>> -/**
>>>>>> - * amdgpu_device_gpu_recover - reset the asic and recover scheduler
>>>>>> - *
>>>>>> - * @adev: amdgpu device pointer
>>>>>> - * @job: which job trigger hang
>>>>>> - *
>>>>>> - * Attempt to reset the GPU if it has hung (all asics).
>>>>>> - * Returns 0 for success or an error on failure.
>>>>>> - */
>>>>>> -int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>>> - struct amdgpu_job *job)
>>>>>> -{
>>>>>> - int i, r, resched;
>>>>>> -
>>>>>> - dev_info(adev->dev, "GPU reset begin!\n");
>>>>>> -
>>>>>> - mutex_lock(&adev->lock_reset);
>>>>>> - atomic_inc(&adev->gpu_reset_counter);
>>>>>> - adev->in_gpu_reset = 1;
>>>>>> - /* Block kfd */
>>>>>> - amdgpu_amdkfd_pre_reset(adev);
>>>>>> +static int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
>>>>>> + struct amdgpu_job *job,
>>>>>> + bool *need_full_reset_arg)
>>>>>> +{
>>>>>> + int i, r = 0;
>>>>>> + bool need_full_reset = *need_full_reset_arg;
>>>>>> /* block TTM */
>>>>>> - resched = ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
>>>>>> + adev->resched = ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
>>>>>> /* block all schedulers and reset given job's ring */
>>>>>> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>>>>> @@ -3379,10 +3284,121 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>>> amdgpu_fence_driver_force_completion(ring);
>>>>>> }
>>>>>> - if (amdgpu_sriov_vf(adev))
>>>>>> - r = amdgpu_device_reset_sriov(adev, job ? false : true);
>>>>>> - else
>>>>>> - r = amdgpu_device_reset(adev);
>>>>>> + if (!amdgpu_sriov_vf(adev)) {
>>>>>> +
>>>>>> + if (!need_full_reset)
>>>>>> + need_full_reset = amdgpu_device_ip_need_full_reset(adev);
>>>>>> +
>>>>>> + if (!need_full_reset) {
>>>>>> + amdgpu_device_ip_pre_soft_reset(adev);
>>>>>> + r = amdgpu_device_ip_soft_reset(adev);
>>>>>> + amdgpu_device_ip_post_soft_reset(adev);
>>>>>> + if (r || amdgpu_device_ip_check_soft_reset(adev)) {
>>>>>> + DRM_INFO("soft reset failed, will fallback to full
>>>>>> reset!\n");
>>>>>> + need_full_reset = true;
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + if (need_full_reset)
>>>>>> + r = amdgpu_device_ip_suspend(adev);
>>>>>> +
>>>>>> + *need_full_reset_arg = need_full_reset;
>>>>>> + }
>>>>>> +
>>>>>> + return r;
>>>>>> +}
>>>>>> +
>>>>>> +static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,
>>>>>> + struct list_head *device_list_handle,
>>>>>> + bool *need_full_reset_arg)
>>>>>> +{
>>>>>> + struct amdgpu_device *tmp_adev = NULL;
>>>>>> + bool need_full_reset = *need_full_reset_arg, vram_lost = false;
>>>>>> + int r = 0;
>>>>>> +
>>>>>> + /*
>>>>>> + * ASIC reset has to be done on all XGMI hive nodes ASAP
>>>>>> + * to allow proper links negotiation in FW (within 1 sec)
>>>>>> + */
>>>>>> + if (need_full_reset) {
>>>>>> + list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
>>>>>> + r = amdgpu_asic_reset(tmp_adev);
>>>>>> + if (r)
>>>>>> + DRM_WARN("ASIC reset failed with err r, %d for drm
>>>>>> dev, %s",
>>>>>> + r, tmp_adev->ddev->unique);
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> +
>>>>>> + list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
>>>>>> + if (need_full_reset) {
>>>>>> + /* post card */
>>>>>> + if (amdgpu_atom_asic_init(tmp_adev->mode_info.atom_context))
>>>>>> + DRM_WARN("asic atom init failed!");
>>>>>> +
>>>>>> + if (!r) {
>>>>>> + dev_info(tmp_adev->dev, "GPU reset succeeded,
>>>>>> trying to resume\n");
>>>>>> + r = amdgpu_device_ip_resume_phase1(tmp_adev);
>>>>>> + if (r)
>>>>>> + goto out;
>>>>>> +
>>>>>> + vram_lost = amdgpu_device_check_vram_lost(tmp_adev);
>>>>>> + if (vram_lost) {
>>>>>> + DRM_ERROR("VRAM is lost!\n");
>>>>>> + atomic_inc(&tmp_adev->vram_lost_counter);
>>>>>> + }
>>>>>> +
>>>>>> + r = amdgpu_gtt_mgr_recover(
>>>>>> + &tmp_adev->mman.bdev.man[TTM_PL_TT]);
>>>>>> + if (r)
>>>>>> + goto out;
>>>>>> +
>>>>>> + r = amdgpu_device_fw_loading(tmp_adev);
>>>>>> + if (r)
>>>>>> + return r;
>>>>>> +
>>>>>> + r = amdgpu_device_ip_resume_phase2(tmp_adev);
>>>>>> + if (r)
>>>>>> + goto out;
>>>>>> +
>>>>>> + if (vram_lost)
>>>>>> + amdgpu_device_fill_reset_magic(tmp_adev);
>>>>>> +
>>>>>> + /* Update PSP FW topology after reset */
>>>>>> + if (tmp_adev->gmc.xgmi.num_physical_nodes > 1)
>>>>>> + r = amdgpu_xgmi_update_topology(hive, tmp_adev);
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> +
>>>>>> +out:
>>>>>> + if (!r) {
>>>>>> + amdgpu_irq_gpu_reset_resume_helper(tmp_adev);
>>>>>> + r = amdgpu_ib_ring_tests(tmp_adev);
>>>>>> + if (r) {
>>>>>> + dev_err(tmp_adev->dev, "ib ring test failed
>>>>>> (%d).\n", r);
>>>>>> + r = amdgpu_device_ip_suspend(tmp_adev);
>>>>>> + need_full_reset = true;
>>>>>> + r = -EAGAIN;
>>>>>> + goto end;
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + if (!r)
>>>>>> + r = amdgpu_device_recover_vram(tmp_adev);
>>>>>> + else
>>>>>> + tmp_adev->asic_reset_res = r;
>>>>>> + }
>>>>>> +
>>>>>> +end:
>>>>>> + *need_full_reset_arg = need_full_reset;
>>>>>> + return r;
>>>>>> +}
>>>>>> +
>>>>>> +static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev,
>>>>>> + struct amdgpu_job *job)
>>>>>> +{
>>>>>> + int i;
>>>>>> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>>>>> struct amdgpu_ring *ring = adev->rings[i];
>>>>>> @@ -3394,7 +3410,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>>> * or all rings (in the case @job is NULL)
>>>>>> * after above amdgpu_reset accomplished
>>>>>> */
>>>>>> - if ((!job || job->base.sched == &ring->sched) && !r)
>>>>>> + if ((!job || job->base.sched == &ring->sched) && !adev->asic_reset_res)
>>>>>> drm_sched_job_recovery(&ring->sched);
>>>>>> kthread_unpark(ring->sched.thread);
>>>>>> @@ -3404,21 +3420,150 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>>> drm_helper_resume_force_mode(adev->ddev);
>>>>>> }
>>>>>> - ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, resched);
>>>>>> + ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, adev->resched);
>>>>>> - if (r) {
>>>>>> - /* bad news, how to tell it to userspace ? */
>>>>>> - dev_info(adev->dev, "GPU reset(%d) failed\n",
>>>>>> atomic_read(&adev->gpu_reset_counter));
>>>>>> - amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_GPU_RESET_FAIL, 0, r);
>>>>>> - } else {
>>>>>> - dev_info(adev->dev, "GPU reset(%d)
>>>>>> succeeded!\n",atomic_read(&adev->gpu_reset_counter));
>>>>>> - }
>>>>>> + adev->asic_reset_res = adev->resched = 0;
>>>>>> +
>>>>>> +}
>>>>>> +static void amdgpu_lock_adev(struct amdgpu_device *adev)
>>>>>> +{
>>>>>> + mutex_lock(&adev->lock_reset);
>>>>>> + atomic_inc(&adev->gpu_reset_counter);
>>>>>> + adev->in_gpu_reset = 1;
>>>>>> + /* Block kfd */
>>>>>> + amdgpu_amdkfd_pre_reset(adev);
>>>>>> +}
>>>>>> +
>>>>>> +static void amdgpu_unlock_adev(struct amdgpu_device *adev)
>>>>>> +{
>>>>>> /*unlock kfd */
>>>>>> amdgpu_amdkfd_post_reset(adev);
>>>>>> amdgpu_vf_error_trans_all(adev);
>>>>>> adev->in_gpu_reset = 0;
>>>>>> mutex_unlock(&adev->lock_reset);
>>>>>> +}
>>>>>> +
>>>>>> +
>>>>>> +/**
>>>>>> + * amdgpu_device_gpu_recover - reset the asic and recover scheduler
>>>>>> + *
>>>>>> + * @adev: amdgpu device pointer
>>>>>> + * @job: which job trigger hang
>>>>>> + *
>>>>>> + * Attempt to reset the GPU if it has hung (all asics).
>>>>>> + * Attempt to do soft-reset or full-reset and reinitialize Asic
>>>>>> + * Returns 0 for success or an error on failure.
>>>>>> + */
>>>>>> +
>>>>>> +int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>>> + struct amdgpu_job *job)
>>>>>> +{
>>>>>> + int r;
>>>>>> + struct amdgpu_hive_info *hive = NULL;
>>>>>> + bool need_full_reset = false;
>>>>>> + struct amdgpu_device *tmp_adev = NULL;
>>>>>> + struct list_head device_list, *device_list_handle = NULL;
>>>>>> +
>>>>>> + INIT_LIST_HEAD(&device_list);
>>>>>> +
>>>>>> + dev_info(adev->dev, "GPU reset begin!\n");
>>>>>> +
>>>>>> + /*
>>>>>> + * In case of XGMI hive disallow concurrent resets to be triggered
>>>>>> + * by different nodes.
>>>>>> + */
>>>>>> + if (adev->gmc.xgmi.num_physical_nodes > 1)
>>>>>> + mutex_lock(&xgmi_mutex);
>>>>>> +
>>>>>> + /* Start with adev pre asic reset first for soft reset check.*/
>>>>>> + amdgpu_lock_adev(adev);
>>>>>> + r = amdgpu_device_pre_asic_reset(adev,
>>>>>> + job,
>>>>>> + &need_full_reset);
>>>>>> + if (r) {
>>>>>> + /*TODO Should we stop ?*/
>>>>>> + DRM_ERROR("GPU pre asic reset failed with err, %d for drm
>>>>>> dev, %s ",
>>>>>> + r, adev->ddev->unique);
>>>>>> + adev->asic_reset_res = r;
>>>>>> + }
>>>>>> +
>>>>>> + /* Build list of devices to reset */
>>>>>> + if (need_full_reset && adev->gmc.xgmi.num_physical_nodes > 1) {
>>>>>> + hive = amdgpu_get_xgmi_hive(adev);
>>>>>> + if (!hive) {
>>>>>> + r = -ENODEV;
>>>>>> +
>>>>>> + amdgpu_unlock_adev(adev);
>>>>>> +
>>>>>> + if (adev->gmc.xgmi.num_physical_nodes > 1)
>>>>>> + mutex_unlock(&xgmi_mutex);
>>>>>> + return r;
>>>>>> + }
>>>>>> +
>>>>>> + /*
>>>>>> + * In case we are in XGMI hive mode device reset is done for all the
>>>>>> + * nodes in the hive to retrain all XGMI links and hence the reset
>>>>>> + * sequence is executed in loop on all nodes.
>>>>>> + */
>>>>>> + device_list_handle = amdgpu_xgmi_get_adev_list_handle(hive);
>>>>>> + } else {
>>>>>> + list_add_tail(&adev->gmc.xgmi.head, &device_list);
>>>>>> + device_list_handle = &device_list;
>>>>>> + }
>>>>>> +
>>>>>> +retry: /* Rest of adevs pre asic reset from XGMI hive. */
>>>>>> + list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
>>>>>> +
>>>>>> + if (tmp_adev == adev)
>>>>>> + continue;
>>>>>> +
>>>>>> + dev_info(tmp_adev->dev, "GPU reset begin for drm dev
>>>>>> %s!\n", adev->ddev->unique);
>>>>>> +
>>>>>> + amdgpu_lock_adev(tmp_adev);
>>>>>> + r = amdgpu_device_pre_asic_reset(tmp_adev,
>>>>>> + NULL,
>>>>>> + &need_full_reset);
>>>>>> + /*TODO Should we stop ?*/
>>>>>> + if (r) {
>>>>>> + DRM_ERROR("GPU pre asic reset failed with err, %d for
>>>>>> drm dev, %s ",
>>>>>> + r, tmp_adev->ddev->unique);
>>>>>> + tmp_adev->asic_reset_res = r;
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + /* Actual ASIC resets if needed.*/
>>>>>> + /* TODO Implement XGMI hive reset logic for SRIOV */
>>>>>> + if (amdgpu_sriov_vf(adev)) {
>>>>>> + r = amdgpu_device_reset_sriov(adev, job ? false : true);
>>>>>> + if (r)
>>>>>> + adev->asic_reset_res = r;
>>>>>> + } else {
>>>>>> + r = amdgpu_do_asic_reset(hive, device_list_handle, &need_full_reset);
>>>>>> + if (r && r == -EAGAIN)
>>>>>> + goto retry;
>>>>>> + }
>>>>>> +
>>>>>> + /* Post ASIC reset for all devs .*/
>>>>>> + list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
>>>>>> + amdgpu_device_post_asic_reset(tmp_adev, tmp_adev == adev ? job : NULL);
>>>>>> +
>>>>>> + if (r) {
>>>>>> + /* bad news, how to tell it to userspace ? */
>>>>>> + dev_info(tmp_adev->dev, "GPU reset(%d) failed\n",
>>>>>> atomic_read(&adev->gpu_reset_counter));
>>>>>> + amdgpu_vf_error_put(tmp_adev, AMDGIM_ERROR_VF_GPU_RESET_FAIL, 0, r);
>>>>>> + } else {
>>>>>> + dev_info(tmp_adev->dev, "GPU reset(%d) succeeded!\n",
>>>>>> atomic_read(&adev->gpu_reset_counter));
>>>>>> + }
>>>>>> +
>>>>>> + amdgpu_unlock_adev(tmp_adev);
>>>>>> + }
>>>>>> +
>>>>>> + if (adev->gmc.xgmi.num_physical_nodes > 1)
>>>>>> + mutex_unlock(&xgmi_mutex);
>>>>>> +
>>>>>> + if (r)
>>>>>> + dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
>>>>>> return r;
>>>>>> }
>