[PATCH v3 1/1] drm/amdgpu: rework sched_list generation
Luben Tuikov
luben.tuikov at amd.com
Wed Apr 1 15:02:38 UTC 2020
On 2020-03-31 08:46, Nirmoy wrote:
>
> On 3/31/20 3:01 AM, Luben Tuikov wrote:
>> This patch seems to be using DOS line-endings.
>
>
> Strange, I don't see that in my local patch file.
>
Not sure why "git am" complained about DOS line endings
the first time I downloaded it. The second time it was fine.
[snip]
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 29f0a410091b..27abbdc603dd 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -721,6 +721,11 @@ struct amd_powerplay {
>>> const struct amd_pm_funcs *pp_funcs;
>>> };
>>>
>>> +struct amdgpu_sched {
>>> + uint32_t num_scheds;
>>> + struct drm_gpu_scheduler *sched[HWIP_MAX_INSTANCE];
>>> +};
>>> +
>>> #define AMDGPU_RESET_MAGIC_NUM 64
>>> #define AMDGPU_MAX_DF_PERFMONS 4
>>> struct amdgpu_device {
>>> @@ -858,6 +863,8 @@ struct amdgpu_device {
>>> struct amdgpu_ring *rings[AMDGPU_MAX_RINGS];
>>> bool ib_pool_ready;
>>> struct amdgpu_sa_manager ring_tmp_bo[AMDGPU_IB_POOL_MAX];
>>> + /* drm scheduler list */
>>> + struct amdgpu_sched gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
>> That's a 2-dimensional array of "struct amdgpu_sched".
>> I think that the comment should be removed, or at least
>> not say "drm scheduler list". (I can see the structure
>> definition above.)
>
>
> Yes I should remove it.
>
>
>> If this is the path you want to go, consider removing
>> "num_scheds" and creating a three dimensional array,
>> which would really essentialize the direction you want
>> to go:
>>
>> struct drm_gpu_scheduler *gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX][HWIP_MAX_INSTANCE];
>>
>> Now that this architecture is stripped down to its essentials,
>> perhaps we can see some optimizations...?
>
>
> If you mean whether we should see any performance improvement then IMO
> we may not see much difference, as we are using pretty much the same
> number of memory accesses, plus now we have the extra cost of
> array_index_nospec().
>
> Also this is not a hot code path. We do only one
> amdgpu_ctx_init_entity() per HW_IP per context.
No, this has nothing to do with "performance".
It's all about architecture and design.
You seem to have array-array-struct, where the struct holds an int
and an array, and it would be much cleaner to just have
array-array-array. I don't think you need to break the chain with
a struct of an int and an array, just as I described
in my comment below, which you snipped without addressing.
If you're not going to address a comment, don't delete it; leave it
for others to see that it hasn't been addressed. Only
snip previously addressed and resolved comments and previously
seen patch info, like the diffstat, etc.
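To make the shape concrete, here is a rough sketch (not compile-tested,
only illustrating the layout, using the constants already in your patch):

    /* v3 of the patch: array-array-struct, where the struct carries
     * a count plus a fixed array of scheduler pointers. */
    struct amdgpu_sched gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];

    /* What I am suggesting: array-array-array, no count, and a NULL
     * pointer simply marks an unused slot. */
    struct drm_gpu_scheduler *
        gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX][HWIP_MAX_INSTANCE];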
>> Also consider that since you're creating an array of pointers,
>> you don't necessarily need to know their count. Your hot-path
>> algorithms should not need to know it. If you need to print
>> their count, say in sysfs, then you can count them up on
>> behalf of the user-space process cat-ing the sysfs entry.
>>
[snip]
>>> +
>>> + /* set to default prio if sched_list is NULL */
>>> + if (!adev->gpu_sched[hw_ip][hw_prio].num_scheds)
>>> + hw_prio = AMDGPU_RING_PRIO_DEFAULT;
>> That comment is a bit confusing--it talks about a list
>> not being NULL, while the conditional implicitly checks
>> against 0.
>
>
> Yes, this is wrong, will remove it.
>
> <snip>
>
I wish you hadn't snipped my comment here, but had addressed it
one way or the other. It is:
I'd much rather that an integer comparison be performed against
integers, as opposed to using the logical NOT operator (where the
comparison against 0 is implicit), i.e.,
    if (adev->gpu_sched[hw_ip][hw_prio].num_scheds == 0)
Also, architecturally, there seems to be informational
redundancy in keeping an integer count and a list of
objects at the same time, as the above if-conditional
exposes: the comment talks about a list and NULL, but
the if-conditional implicitly checks for 0.
Perhaps we don't need "num_scheds", and you can just
check whether the entry is NULL and assign PRIO_DEFAULT.
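To illustrate, a rough sketch (not compile-tested), assuming the flat
three-dimensional array of scheduler pointers suggested above, where
"sched_list" is just a local name for the selected row:

    /* No schedulers were registered for this priority level: its
     * first slot is still NULL, so fall back to the default priority
     * instead of consulting a separate count. */
    if (adev->gpu_sched[hw_ip][hw_prio][0] == NULL)
        hw_prio = AMDGPU_RING_PRIO_DEFAULT;

    sched_list = adev->gpu_sched[hw_ip][hw_prio];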
>>> @@ -258,6 +272,12 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>> ring->priority = DRM_SCHED_PRIORITY_NORMAL;
>>> mutex_init(&ring->priority_mutex);
>>>
>>> + if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
>>> + hw_ip = amdgpu_ring_type_to_drm_hw_ip[ring->funcs->type];
>>> + num_sched = &adev->gpu_sched[hw_ip][hw_prio].num_scheds;
>>> + adev->gpu_sched[hw_ip][hw_prio].sched[(*num_sched)++] = &ring->sched;
>>> + }
>> This seems unnecessarily complicated. Perhaps we can remove
>> "num_scheds", as recommended above, and keep a running pointer
>> while initialization is being done...?
>
>
> What do you mean by running pointer?
A "running pointer" is a local pointer you're using temporarily
to traverse memory. If you remove the "num_scheds", as noted in my
previous comment, then you can use a running pointer to contain
the next entry you'd initialize.
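For instance, a rough sketch (not compile-tested) of what ring init
could look like without "num_scheds", again assuming the 3-D array of
pointers with unused slots left NULL:

    if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
        struct drm_gpu_scheduler **slot, **end;

        hw_ip = amdgpu_ring_type_to_drm_hw_ip[ring->funcs->type];
        slot = adev->gpu_sched[hw_ip][hw_prio];
        end  = slot + HWIP_MAX_INSTANCE;

        /* advance past slots already taken by earlier rings */
        while (slot < end && *slot)
            slot++;
        if (slot < end)
            *slot = &ring->sched;
    }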
Regards,
Luben