[PATCH v2] drm/amd/amdgpu implement tdr advanced mode
Andrey Grodzovsky
andrey.grodzovsky at amd.com
Tue Mar 9 16:48:01 UTC 2021
If we are talking about '[PATCH v3] drm/amd/amdgpu implement tdr advanced
mode', which was sent yesterday, then I already went over it and only had
two cosmetic comments.
Andrey
On 2021-03-09 6:16 a.m., Christian König wrote:
> Yeah, those are some really good points. I completely agree that we
> shouldn't do any larger cleanup right now.
>
> But I think we still need some more review on this. I most likely won't
> have enough time to look into this before the weekend.
>
> Andrey can you take a look as well?
>
> Thanks,
> Christian.
>
> Am 09.03.21 um 08:29 schrieb Liu, Monk:
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Christian
>>
>> What is feasible and practical now is:
>> 1) We implement the advanced TDR mode upstream first (so we can copy
>> the same scheme into our LTS kernel). If you want, we can avoid
>> changing the drm/scheduler code, but that variant was already
>> rejected by you as too complicated.
>> 2) Then we retire the mirror-list concept and rework drm/scheduler
>> around a KFIFO.
>> 3) Finally we remove the guilty/karma handling from the scheduler.
>>
>> So I basically agree with you on the spirit of the above changes: hide
>> the AMD-internal concepts and tricks in the vendor driver and keep the
>> scheduler simple and scalable.
>> But that definitely needs a longer design discussion, so why don't we
>> focus on our current problem now? As long as the new change doesn't
>> regress anything, it is still a good change on top of the current TDR
>> implementation.
>>
>> I would propose that we only change AMD code this time. Jack's first
>> version of the patch didn't touch the scheduler at all, but you stated
>> it was too complicated and rejected it.
>>
>> So the acceptable option that remains is what Jack did in v2, which
>> needs a new scheduler API, drm_sched_resubmit_jobs2().
>>
>> What do you think?
>>
>> Thanks
>>
>> ------------------------------------------
>> Monk Liu | Cloud-GPU Core team
>> ------------------------------------------
>>
>> -----Original Message-----
>> From: Koenig, Christian <Christian.Koenig at amd.com>
>> Sent: Monday, March 8, 2021 3:53 PM
>> To: Liu, Monk <Monk.Liu at amd.com>; Zhang, Jack (Jian)
>> <Jack.Zhang1 at amd.com>; amd-gfx at lists.freedesktop.org; Grodzovsky,
>> Andrey <Andrey.Grodzovsky at amd.com>; Deng, Emily <Emily.Deng at amd.com>
>> Subject: Re: [PATCH v2] drm/amd/amdgpu implement tdr advanced mode
>>
>>
>>
>> Am 08.03.21 um 05:06 schrieb Liu, Monk:
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>>>> Well, first of all please completely drop the affinity group stuff
>>>>> from this patch. We should concentrate on one feature at a time.
>>> We need it to expedite the process; we can introduce this change in
>>> another patch.
>>>
>>>
>>>>> Then the implementation is way too complicated. All you need to do is
>>>>> insert a dma_fence_wait after re-scheduling each job after a reset.
>>> No, that's not true. During "drm_sched_resubmit_jobs" all jobs on the
>>> mirror list are pushed into the hw ring, but we can only allow the
>>> first job into the ring in order to catch the real guilty one
>>> (otherwise a later job in the ring could also be buggy and affect our
>>> judgement). So we need to implement a new "drm_sched_resubmit_jobs2()",
>>> along these lines:
>> Something like this. But since waiting for the guilty job is AMD
>> specific we should rather rework the stuff from the beginning.
>>
>> What I have in mind is the following:
>> 1. Add a reference from the scheduler fence back to the job which is
>> cleared only when the scheduler fence finishes.
>> 2. Completely drop the ring_mirror_list and replace it with a kfifo of
>> pointers to the active scheduler fences.
>> 3. Replace drm_sched_resubmit_jobs with a drm_sched_for_each_active()
>> macro which allows drivers to iterate over all the active jobs and
>> resubmit/wait/mark them as guilty etc etc..
>> 4. Remove the guilty/karma handling from the scheduler. This is
>> something AMD specific and shouldn't leak into common code.
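>>
>> Very roughly something like this (untested sketch; the active_fences
>> fifo, the macro and the amdgpu helper below are made-up names, not
>> existing API):
>>
>> #include <linux/kfifo.h>
>>
>> /*
>>  * Assume drm_gpu_scheduler grows
>>  *     DECLARE_KFIFO_PTR(active_fences, struct drm_sched_fence *);
>>  * as the replacement for ring_mirror_list.
>>  *
>>  * Rotate through the fifo exactly once: pop each fence from the head
>>  * and push it back to the tail after the loop body ran.  Breaking out
>>  * of the body would drop the current element, so don't.
>>  */
>> #define drm_sched_for_each_active(sched, s_fence, n)                      \
>>     for ((n) = kfifo_len(&(sched)->active_fences);                        \
>>          (n)-- > 0 && kfifo_get(&(sched)->active_fences, &(s_fence));     \
>>          kfifo_put(&(sched)->active_fences, (s_fence)))
>>
>> /* amdgpu could then do its guilty handling without touching common code */
>> static void amdgpu_mark_guilty_fences(struct amdgpu_ring *ring)
>> {
>>     struct drm_sched_fence *s_fence;
>>     unsigned int n;
>>
>>     drm_sched_for_each_active(&ring->sched, s_fence, n) {
>>         if (s_fence->parent &&
>>             dma_fence_wait_timeout(s_fence->parent, false,
>>                                    ring->sched.timeout) == 0)
>>             dma_fence_set_error(&s_fence->finished, -ECANCELED);
>>     }
>> }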
>>
>> Regards,
>> Christian.
>>
>>> drm_sched_resubmit_jobs2():
>>>
>>> ~ void drm_sched_resubmit_jobs2(struct drm_gpu_scheduler *sched, int max)
>>>   {
>>>       struct drm_sched_job *s_job, *tmp;
>>>       uint64_t guilty_context;
>>>       bool found_guilty = false;
>>>       struct dma_fence *fence;
>>> +     int i = 0;
>>>
>>>       list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
>>>           struct drm_sched_fence *s_fence = s_job->s_fence;
>>>
>>> +         if (i >= max)
>>> +             break;
>>> +
>>>           if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
>>>               found_guilty = true;
>>>               guilty_context = s_job->s_fence->scheduled.context;
>>>           }
>>>
>>>           if (found_guilty && s_job->s_fence->scheduled.context == guilty_context)
>>>               dma_fence_set_error(&s_fence->finished, -ECANCELED);
>>>
>>>           dma_fence_put(s_job->s_fence->parent);
>>>           fence = sched->ops->run_job(s_job);
>>> +         i++;
>>>
>>>           if (IS_ERR_OR_NULL(fence)) {
>>>               if (IS_ERR(fence))
>>>                   dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
>>>
>>>               s_job->s_fence->parent = NULL;
>>>           } else {
>>>               s_job->s_fence->parent = fence;
>>>           }
>>>       }
>>>   }
>>>   EXPORT_SYMBOL(drm_sched_resubmit_jobs2);
>>>
>>>
>>>
>>>
>>> Thanks
>>>
>>> ------------------------------------------
>>> Monk Liu | Cloud-GPU Core team
>>> ------------------------------------------
>>>
>>> -----Original Message-----
>>> From: Koenig, Christian <Christian.Koenig at amd.com>
>>> Sent: Sunday, March 7, 2021 3:03 AM
>>> To: Zhang, Jack (Jian) <Jack.Zhang1 at amd.com>;
>>> amd-gfx at lists.freedesktop.org; Grodzovsky, Andrey
>>> <Andrey.Grodzovsky at amd.com>; Liu, Monk <Monk.Liu at amd.com>; Deng, Emily
>>> <Emily.Deng at amd.com>
>>> Subject: Re: [PATCH v2] drm/amd/amdgpu implement tdr advanced mode
>>>
>>> Hi Jack,
>>>
>>> Well, first of all please completely drop the affinity group stuff
>>> from this patch. We should concentrate on one feature at a time.
>>>
>>> Then the implementation is way too complicated. All you need to do is
>>> insert a dma_fence_wait after re-scheduling each job after a reset.
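>>>
>>> Roughly like this on the amdgpu side (untested sketch; the helper name
>>> is made up, and using a bounded dma_fence_wait_timeout() instead of an
>>> unbounded wait is my assumption):
>>>
>>> static void amdgpu_resubmit_and_wait(struct amdgpu_ring *ring)
>>> {
>>>     struct drm_sched_job *s_job;
>>>
>>>     /* push everything back to the hw ring first */
>>>     drm_sched_resubmit_jobs(&ring->sched);
>>>
>>>     /* then block on each recreated hw fence in submission order */
>>>     list_for_each_entry(s_job, &ring->sched.ring_mirror_list, node) {
>>>         if (!s_job->s_fence->parent)
>>>             continue; /* run_job() failed, nothing to wait for */
>>>
>>>         /* the first fence that never signals points at the bad job */
>>>         if (dma_fence_wait_timeout(s_job->s_fence->parent, false,
>>>                                    ring->sched.timeout) == 0)
>>>             drm_sched_increase_karma(s_job);
>>>     }
>>> }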
>>>
>>> Additional to that this feature is completely AMD specific and
>>> shouldn't affect the common scheduler in any way.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 06.03.21 um 18:25 schrieb Jack Zhang:
>>>> [Why]
>>>> Previous tdr design treats the first job in job_timeout as the bad job.
>>>> But sometimes a later bad compute job can block a good gfx job and
>>>> cause an unexpected gfx job timeout because the gfx and compute rings
>>>> share internal GC hardware.
>>>>
>>>> [How]
>>>> This patch implements an advanced tdr mode. It involves an additional
>>>> synchronous pre-resubmit step (Step0 Resubmit) before the normal
>>>> resubmit step in order to find the real bad job.
>>>>
>>>> 1. For a bailing TDR job, re-insert it into the mirror_list, don't
>>>> mark it guilty, and leave it to be handled by the main reset thread.
>>>>
>>>> 2. Don't mark the job guilty in pre_asic_reset; leave it to be
>>>> handled by the Step0 Resubmit stage.
>>>>
>>>> 3. At the Step0 Resubmit stage, first resubmit the jobs asynchronously,
>>>> then iterate each ring's mirror_list and synchronously wait for each hw
>>>> fence to be signaled. If a job's hw fence times out, we identify it as
>>>> guilty and do a hw reset to recover the hw. After that, we do the
>>>> normal resubmit step to resubmit the remaining jobs.
>>>>
>>>> 4. For a whole-gpu reset (vram lost), skip Step0 Resubmit since every
>>>> job after vram loss is considered a bad job.
>>>>
>>>> 5. Introduce the concept of an "affinity group".
>>>> Doing two hw resets is not necessary when only one ring among a set of
>>>> hw-related rings has jobs, so we introduce the "affinity group":
>>>> hw-related rings, such as the gfx and compute rings, can be added to a
>>>> common affinity group. When a tdr happens, we iterate over all rings in
>>>> the affinity group and skip the Step0 Resubmit stage if only one ring's
>>>> mirror_list has valid sched jobs.
>>>>
>>>> V2:
>>>> -fix a cherry-pick mistake for bailing TDR handling.
>>>>
>>>> -do the affinity_group check based on the bad job's sched rather
>>>> than the default "1", so that multiple affinity groups can be
>>>> pre-defined in the future.
>>>>
>>>> Signed-off-by: Jack Zhang <Jack.Zhang1 at amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 102 +++++++++++++++++++--
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |   2 +-
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  47 ++++++++++
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h    |   2 +-
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   |  27 ++++++
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |   1 +
>>>> include/drm/gpu_scheduler.h                |   1 +
>>>> 7 files changed, 173 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index e247c3a2ec08..8632d7071292 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -4188,6 +4188,37 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
>>>>      return false;
>>>>  }
>>>> +bool amdgpu_affinity_group_has_only_or_null_working_ring(struct amdgpu_device *adev, struct drm_sched_job *s_job) {
>>>> +    int i;
>>>> +    int working_ring_num = 0;
>>>> +
>>>> +    /*
>>>> +     * The job is considered as the real bad one
>>>> +     * if job's sched is not in affinity group
>>>> +     */
>>>> +    if (s_job->sched.affinity_group == 0)
>>>> +        return true;
>>>> +
>>>> +    for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>>> +        struct amdgpu_ring *ring = adev->rings[i];
>>>> +
>>>> +        if (!ring || !ring->sched.thread)
>>>> +            continue;
>>>> +
>>>> +        /* for non-empty affinity ring, increase working_ring_num */
>>>> +        if (ring->sched.affinity_group == s_job->sched.affinity_group) {
>>>> +            if (!list_empty(&ring->sched.ring_mirror_list))
>>>> +                working_ring_num++;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (working_ring_num > 1) {
>>>> +        return false;
>>>> +    }
>>>> +    return true;
>>>> +}
>>>> +
>>>>  /**
>>>>   * amdgpu_device_should_recover_gpu - check if we should try GPU recovery
>>>>   *
>>>> @@ -4310,8 +4341,10 @@ static int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
>>>>          amdgpu_fence_driver_force_completion(ring);
>>>>      }
>>>> -    if(job)
>>>> -        drm_sched_increase_karma(&job->base);
>>>> +    if (amdgpu_gpu_recovery != 2) {
>>>> +        if (job)
>>>> +            drm_sched_increase_karma(&job->base);
>>>> +    }
>>>>      /* Don't suspend on bare metal if we are not going to HW reset the ASIC */
>>>>      if (!amdgpu_sriov_vf(adev)) {
>>>> @@ -4639,7 +4672,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>      int i, r = 0;
>>>>      bool need_emergency_restart = false;
>>>>      bool audio_suspended = false;
>>>> -
>>>> +    int tmp_vram_lost_counter;
>>>>      /*
>>>>       * Special case: RAS triggered and full reset isn't supported
>>>>       */
>>>> @@ -4690,8 +4723,16 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>              job ? job->base.id : -1);
>>>>      /* even we skipped this reset, still need to set the job to guilty */
>>>> -    if (job)
>>>> -        drm_sched_increase_karma(&job->base);
>>>> +    if (job) {
>>>> +        if (amdgpu_gpu_recovery == 2) {
>>>> +            if (&job->base) {
>>>> +                spin_lock(&job->base.sched->job_list_lock);
>>>> +                list_add(&job->base.node, &job->base.sched->ring_mirror_list);
>>>> +                spin_unlock(&job->base.sched->job_list_lock);
>>>> +            }
>>>> +        } else
>>>> +            drm_sched_increase_karma(&job->base);
>>>> +    }
>>>>      goto skip_recovery;
>>>>  }
>>>> @@ -4788,6 +4829,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>          }
>>>>      }
>>>> +    tmp_vram_lost_counter = atomic_read(&((adev)->vram_lost_counter));
>>>>      /* Actual ASIC resets if needed.*/
>>>>      /* TODO Implement XGMI hive reset logic for SRIOV */
>>>>      if (amdgpu_sriov_vf(adev)) {
>>>> @@ -4804,18 +4846,64 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>      /* Post ASIC reset for all devs .*/
>>>>      list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head)
>>>>      {
>>>> +        int step = 1;
>>>> +        if (amdgpu_gpu_recovery == 2) {
>>>> +            if (amdgpu_affinity_group_has_only_or_null_working_ring(adev, &job->base)
>>>> +                || tmp_vram_lost_counter < atomic_read(&adev->vram_lost_counter)) {
>>>> +                DRM_INFO("Skip Stage0 Resubmit Stage\n");
>>>> +                /* set guilty */
>>>> +                drm_sched_increase_karma(&job->base);
>>>> +                step = 1;
>>>> +            } else {
>>>> +                DRM_INFO("Do Stage0 Resubmit Stage\n");
>>>> +                step = 0;
>>>> +            }
>>>> +        }
>>>> +
>>>> +retry_resubmit:
>>>>          for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>>>>              struct amdgpu_ring *ring = tmp_adev->rings[i];
>>>> +            int ret = 0;
>>>> +            struct drm_sched_job *s_bad_job = NULL;
>>>>              if (!ring || !ring->sched.thread)
>>>>                  continue;
>>>>              /* No point to resubmit jobs if we didn't HW reset*/
>>>> -            if (!tmp_adev->asic_reset_res && !job_signaled)
>>>> +            if (!tmp_adev->asic_reset_res && !job_signaled) {
>>>> +
>>>>                  drm_sched_resubmit_jobs(&ring->sched);
>>>> -            drm_sched_start(&ring->sched, !tmp_adev->asic_reset_res);
>>>> +                if (amdgpu_gpu_recovery == 2 && step == 0) {
>>>> +                    ret = amdgpu_wait_resubmitted_jobs_completion(&ring->sched, ring->sched.timeout, &s_bad_job);
>>>> +                    if (ret == -1) {
>>>> +                        DRM_ERROR("Found the real bad job! ring:%s, job_id:%llx\n", ring->sched.name, s_bad_job->id);
>>>> +                        /* set guilty */
>>>> +                        drm_sched_increase_karma(s_bad_job);
>>>> +
>>>> +                        /* do hw reset */
>>>> +                        if (amdgpu_sriov_vf(adev)) {
>>>> +                            amdgpu_virt_fini_data_exchange(adev);
>>>> +                            r = amdgpu_device_reset_sriov(adev, false);
>>>> +                            if (r)
>>>> +                                adev->asic_reset_res = r;
>>>> +                        } else {
>>>> +                            r = amdgpu_do_asic_reset(hive, device_list_handle, &need_full_reset, false);
>>>> +                            if (r && r == -EAGAIN)
>>>> +                                goto retry;
>>>> +                        }
>>>> +
>>>> +                        /* add reset counter so that the following resubmitted job could flush vmid */
>>>> +                        atomic_inc(&tmp_adev->gpu_reset_counter);
>>>> +                        step = 1;
>>>> +                        goto retry_resubmit;
>>>> +                    }
>>>> +                }
>>>> +            }
>>>> +
>>>> +            if (step == 1)
>>>> +                drm_sched_start(&ring->sched, !tmp_adev->asic_reset_res);
>>>>          }
>>>>          if (!amdgpu_device_has_dc_support(tmp_adev) && !job_signaled) {
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> index 865f924772b0..9c3f4edb7532 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> @@ -509,7 +509,7 @@ module_param_named(compute_multipipe, amdgpu_compute_multipipe, int, 0444);
>>>>   * DOC: gpu_recovery (int)
>>>>   * Set to enable GPU recovery mechanism (1 = enable, 0 = disable). The default is -1 (auto, disabled except SRIOV).
>>>>   */
>>>> -MODULE_PARM_DESC(gpu_recovery, "Enable GPU recovery mechanism, (1 = enable, 0 = disable, -1 = auto)");
>>>> +MODULE_PARM_DESC(gpu_recovery, "Enable GPU recovery mechanism, (2 = advanced tdr mode, 1 = enable, 0 = disable, -1 = auto)");
>>>>  module_param_named(gpu_recovery, amdgpu_gpu_recovery, int, 0444);
>>>>  /**
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>> index 759b34799221..28cda321157a 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>> @@ -281,6 +281,53 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
>>>>      }
>>>>  }
>>>> +int amdgpu_wait_resubmitted_jobs_completion(struct drm_gpu_scheduler *sched, long timeout, struct drm_sched_job **s_bad_job) {
>>>> +    struct drm_sched_job *s_job, *tmp;
>>>> +    int ret = 0;
>>>> +
>>>> +    list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
>>>> +        struct drm_sched_fence *s_fence = s_job->s_fence;
>>>> +
>>>> +        if (s_fence->parent == NULL) { /* fail to get a hw fence */
>>>> +            /* process a job */
>>>> +            atomic_dec(&sched->num_jobs);
>>>> +            dma_fence_get(&s_fence->finished);
>>>> +            dma_fence_signal(&s_fence->finished);
>>>> +            dma_fence_put(&s_fence->finished);
>>>> +
>>>> +            /* remove node from mirror_list and free the job */
>>>> +            spin_lock(&sched->job_list_lock);
>>>> +            list_del_init(&s_job->node);
>>>> +            spin_unlock(&sched->job_list_lock);
>>>> +            sched->ops->free_job(s_job);
>>>> +            continue;
>>>> +        }
>>>> +
>>>> +        ret = dma_fence_wait_timeout(s_fence->parent, false, timeout);
>>>> +
>>>> +        if (ret > 0) { /* succeed */
>>>> +            /* process a job */
>>>> +            atomic_dec(&sched->num_jobs);
>>>> +            dma_fence_get(&s_fence->finished);
>>>> +            dma_fence_signal(&s_fence->finished);
>>>> +            dma_fence_put(&s_fence->finished);
>>>> +
>>>> +            /* remove node from mirror_list and free the job */
>>>> +            spin_lock(&sched->job_list_lock);
>>>> +            list_del_init(&s_job->node);
>>>> +            spin_unlock(&sched->job_list_lock);
>>>> +            sched->ops->free_job(s_job);
>>>> +            continue;
>>>> +        } else if (ret == 0) {
>>>> +            *s_bad_job = s_job;
>>>> +            return -1; /* timeout */
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>>  const struct drm_sched_backend_ops amdgpu_sched_ops = {
>>>>      .dependency = amdgpu_job_dependency,
>>>>      .run_job = amdgpu_job_run,
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>>>> index 81caac9b958a..25292f4699fb 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>>>> @@ -76,5 +76,5 @@ int amdgpu_job_submit_direct(struct amdgpu_job *job, struct amdgpu_ring *ring,
>>>>                   struct dma_fence **fence);
>>>>  void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched);
>>>> -
>>>> +int amdgpu_wait_resubmitted_jobs_completion(struct drm_gpu_scheduler *sched, long timeout, struct drm_sched_job **s_bad_job);
>>>>  #endif
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> index b644c78475fd..cb50bfc80bc9 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>>> @@ -35,6 +35,11 @@
>>>>  #include "amdgpu.h"
>>>>  #include "atom.h"
>>>> +static char *amdgpu_affinity_group[] = {
>>>> +    "gfx", "comp"
>>>> +};
>>>> +
>>>>  /*
>>>>   * Rings
>>>>   * Most engines on the GPU are fed via ring buffers.  Ring
>>>> @@ -189,6 +194,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>>>      ring->adev = adev;
>>>>      ring->idx = adev->num_rings++;
>>>>      adev->rings[ring->idx] = ring;
>>>> +    amdgpu_ring_set_affinity_group(ring);
>>>>      r = amdgpu_fence_driver_init_ring(ring, sched_hw_submission);
>>>>      if (r)
>>>>          return r;
>>>> @@ -459,3 +465,24 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
>>>>      ring->sched.ready = !r;
>>>>      return r;
>>>>  }
>>>> +
>>>> +int amdgpu_ring_set_affinity_group(struct amdgpu_ring *ring) {
>>>> +    struct amdgpu_device *adev = ring->adev;
>>>> +    int i;
>>>> +
>>>> +    for (i = 0; i < ARRAY_SIZE(amdgpu_affinity_group); i++) {
>>>> +        char *temp_name = amdgpu_affinity_group[i];
>>>> +
>>>> +        /* set ring's affinity_group bit if find it in affinity_group list */
>>>> +        if (strncmp(ring->name, temp_name, strlen(temp_name)) == 0) {
>>>> +            DRM_DEV_INFO(adev->dev, "set ring:%s in affinity_group\n",
>>>> +                         ring->name);
>>>> +            ring->sched.affinity_group = 1;
>>>> +            return 0;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    ring->sched.affinity_group = 0;
>>>> +    return 0;
>>>> +}
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> index 56acec1075ac..6b0d217e6f5a 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>>> @@ -350,4 +350,5 @@ int amdgpu_debugfs_ring_init(struct amdgpu_device *adev,
>>>>                               struct amdgpu_ring *ring);
>>>>  void amdgpu_debugfs_ring_fini(struct amdgpu_ring *ring);
>>>> +int amdgpu_ring_set_affinity_group(struct amdgpu_ring *ring);
>>>>  #endif
>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>> index 1c815e0a14ed..589cbaea35dc 100644
>>>> --- a/include/drm/gpu_scheduler.h
>>>> +++ b/include/drm/gpu_scheduler.h
>>>> @@ -301,6 +301,7 @@ struct drm_gpu_scheduler {
>>>>      atomic_t            _score;
>>>>      bool                ready;
>>>>      bool                free_guilty;
>>>> +    int                 affinity_group;
>>>>  };
>>>>  int drm_sched_init(struct drm_gpu_scheduler *sched,
>