[PATCH 2/7] drm/amdgpu: use scheduler load balancing for SDMA CS

Fri Aug 3 04:53:06 UTC 2018

On 08/02/2018 06:09 PM, Christian König wrote:
> On 02.08.2018 at 07:50, Zhang, Jerry (Junwei) wrote:
>> On 08/01/2018 07:31 PM, Christian König wrote:
>>> Start to use the scheduler load balancing for userspace SDMA
>>> command submissions.
>>>
>>
>> In this case, each SDMA entity could be served by the rqs of all SDMA instances, and the UMD would not specify a ring id.
>> If so, we could abstract one set of rings per IP type, associated with the rqs of that IP's instances.
>
> That's what my follow-up patch set does.

I missed patch 7, which answers my concerns
(it ended up in another mail inbox...)

The command is sent to the expected entity, and the scheduler decides which ring is actually used.
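
As a rough illustration of that split between entity and ring, here is a small userspace model, not the kernel code. The names model_rq, model_entity, model_pick_rq and model_submit are hypothetical, and picking the queue with the fewest queued jobs is only an assumption about the balancing criterion; the real decision is made inside the DRM scheduler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler run queue (one per SDMA instance).
 * This is NOT the kernel's struct drm_sched_rq. */
struct model_rq {
    const char *name;
    unsigned int queued_jobs;
};

/* Hypothetical stand-in for a scheduler entity that was initialised with
 * a list of candidate run queues instead of exactly one. */
struct model_entity {
    struct model_rq **rq_list;
    size_t num_rqs;
};

/* Assumed policy: choose the candidate with the fewest queued jobs.
 * On a tie the earliest entry in the list wins. */
static struct model_rq *model_pick_rq(const struct model_entity *e)
{
    struct model_rq *best;
    size_t i;

    assert(e->num_rqs > 0);
    best = e->rq_list[0];
    for (i = 1; i < e->num_rqs; i++) {
        if (e->rq_list[i]->queued_jobs < best->queued_jobs)
            best = e->rq_list[i];
    }
    return best;
}

/* The caller only names the entity; the queue is chosen here. */
static struct model_rq *model_submit(struct model_entity *e)
{
    struct model_rq *rq = model_pick_rq(e);

    rq->queued_jobs++;
    return rq;
}
```

With an entity that lists both an SDMA0 and an SDMA1 queue, a submission made while SDMA0 has pending jobs and SDMA1 is idle lands on SDMA1 in this model, without the caller ever naming a ring.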

Feel free to add

Reviewed-by: Junwei Zhang <Jerry.Zhang at amd.com>

>
>> Accordingly, libdrm would need an update as well, but it might be more user-friendly not to care about the ring id when submitting a command.
>
> No, libdrm and userspace should and must stay as they are. Userspace should not notice that we move jobs to another ring in the kernel.

The libdrm test uses the instance as the ring id, which confused me a bit.

Regards,
Jerry

>
>>
>> And will it interfere with ctx->ring's settings, such as the sequence?
>
> No, take a look at the patch. There is no ctx->ring any more.
>
> Regards,
> Christian.
>
>> e.g. a command is submitted to SDMA0, but SDMA0 is busy and SDMA1 is idle,
>> so the command is pushed to the SDMA1 rq, while the sequence is actually updated for ctx->ring[SDMA0].
>>
>>
>> Regards,
>> Jerry
>>
>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 25 +++++++++++++++++++++----
>>>   1 file changed, 21 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> index df6965761046..59046f68975a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> @@ -48,7 +48,8 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
>>>                  struct drm_file *filp,
>>>                  struct amdgpu_ctx *ctx)
>>>   {
>>> -    unsigned i, j;
>>> +    struct drm_sched_rq *sdma_rqs[AMDGPU_MAX_RINGS];
>>> +    unsigned i, j, num_sdma_rqs;
>>>       int r;
>>>
>>>       if (priority < 0 || priority >= DRM_SCHED_PRIORITY_MAX)
>>> @@ -80,18 +81,34 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
>>>       ctx->init_priority = priority;
>>>       ctx->override_priority = DRM_SCHED_PRIORITY_UNSET;
>>>
>>> -    /* create context entity for each ring */
>>> +    num_sdma_rqs = 0;
>>>       for (i = 0; i < adev->num_rings; i++) {
>>>           struct amdgpu_ring *ring = adev->rings[i];
>>>           struct drm_sched_rq *rq;
>>>
>>>           rq = &ring->sched.sched_rq[priority];
>>> +        if (ring->funcs->type == AMDGPU_RING_TYPE_SDMA)
>>> +            sdma_rqs[num_sdma_rqs++] = rq;
>>> +    }
>>> +
>>> +    /* create context entity for each ring */
>>> +    for (i = 0; i < adev->num_rings; i++) {
>>> +        struct amdgpu_ring *ring = adev->rings[i];
>>>
>>>           if (ring == &adev->gfx.kiq.ring)
>>>               continue;
>>>
>>> -        r = drm_sched_entity_init(&ctx->rings[i].entity,
>>> -                      &rq, 1, &ctx->guilty);
>>> +        if (ring->funcs->type == AMDGPU_RING_TYPE_SDMA) {
>>> +            r = drm_sched_entity_init(&ctx->rings[i].entity,
>>> +                          sdma_rqs, num_sdma_rqs,
>>> +                          &ctx->guilty);
>>> +        } else {
>>> +            struct drm_sched_rq *rq;
>>> +
>>> +            rq = &ring->sched.sched_rq[priority];
>>> +            r = drm_sched_entity_init(&ctx->rings[i].entity,
>>> +                          &rq, 1, &ctx->guilty);
>>> +        }
>>>           if (r)
>>>               goto failed;
>>>       }
>>>
>