[PATCH v3 1/1] drm/amdgpu: rework sched_list generation
Nirmoy
nirmodas at amd.com
Wed Apr 1 15:35:34 UTC 2020
On 4/1/20 5:02 PM, Luben Tuikov wrote:
> On 2020-03-31 08:46, Nirmoy wrote:
>> On 3/31/20 3:01 AM, Luben Tuikov wrote:
>>> This patch seems to be using DOS line-endings.
>>
>> Strange, I don't see that in my local patch file.
>>
> Not sure why "git am" complained about DOS endings
> the first time I downloaded it. Second time was fine.
>
> [snip]
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 29f0a410091b..27abbdc603dd 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -721,6 +721,11 @@ struct amd_powerplay {
>>>>  	const struct amd_pm_funcs *pp_funcs;
>>>>  };
>>>>
>>>> +struct amdgpu_sched {
>>>> +	uint32_t			num_scheds;
>>>> +	struct drm_gpu_scheduler	*sched[HWIP_MAX_INSTANCE];
>>>> +};
>>>> +
>>>>  #define AMDGPU_RESET_MAGIC_NUM 64
>>>>  #define AMDGPU_MAX_DF_PERFMONS 4
>>>>  struct amdgpu_device {
>>>> @@ -858,6 +863,8 @@ struct amdgpu_device {
>>>>  	struct amdgpu_ring		*rings[AMDGPU_MAX_RINGS];
>>>>  	bool				ib_pool_ready;
>>>>  	struct amdgpu_sa_manager	ring_tmp_bo[AMDGPU_IB_POOL_MAX];
>>>> +	/* drm scheduler list */
>>>> +	struct amdgpu_sched		gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
>>> That's a 2-dimensional array of "struct amdgpu_sched".
>>> I think that the comment should be removed, or at least
>>> not say "drm scheduler list". (I can see the structure
>>> definition above.)
>>
>> Yes, I should remove it.
>>
>>
>>> If this is the path you want to go, consider removing
>>> "num_scheds" and creating a three dimensional array,
>>> which would really essentialize the direction you want
>>> to go:
>>>
>>> struct drm_gpu_scheduler *gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX][HWIP_MAX_INSTANCE];
>>>
>>> Now that this architecture is stripped down to its essentials,
>>> perhaps we can see some optimizations...?
>>
>> If you mean whether we should see any performance improvement, then IMO
>> we may not see much difference, as we do pretty much the same number of
>> memory accesses, plus we now have the extra cost of array_index_nospec().
>>
>> Also, this is not a hot code path. We only do one
>> amdgpu_ctx_init_entity() per HW IP per context.
> No, this has nothing to do with "performance".
> It's all about architecture and design.
>
> You seem to have array-array-struct with array and int,
> and it seems much cleaner to just have array-array-array.
> I think you don't need to break the chain with
> a struct of an int and an array, just as I described
> in my comment below, which you snipped without addressing.
>
> If you're not going to address a comment, don't delete it, leave it
> for others to see that it hasn't been addressed. Only
> snip previously addressed and resolved comments and previously
> seen patch info, like diffstat/etc.
I wanted to understand "running pointer" before I could comment on it.
>
>>> Also consider that since you're creating an array of pointers,
>>> you don't necessarily need to know their count. Your hot-path
>>> algorithms should not need to know it. If you need to print
>>> their count, say in sysfs, then you can count them up on
>>> behalf of the user-space process cat-ing the sysfs entry.
>>>
> [snip]
>
>>>> +
>>>> +	/* set to default prio if sched_list is NULL */
>>>> +	if (!adev->gpu_sched[hw_ip][hw_prio].num_scheds)
>>>> +		hw_prio = AMDGPU_RING_PRIO_DEFAULT;
>>> That comment is a bit confusing--it talks about a list
>>> not being NULL, while the conditional implicitly checks
>>> against 0.
>>
>> Yes, this is wrong, will remove it.
>>
>> <snip>
>>
> I wish you hadn't snipped my comment here, but had addressed it
> one way or the other. It is:
>
> I'd much rather that integer comparison be performed against
> integers, as opposed to using logical NOT operator (which is
> implicit in comparing against 0), i.e.,
>
> if (adev->gpu_sched[hw_ip][hw_prio].num_scheds == 0)
>
> Also, architecturally, there seems to be informational
> redundancy, in keeping an integer count and list of
> objects at the same time, as the above if-conditional
> exposes: the comment talks about a list and NULL but
> the if-conditional implicitly checks for 0.
The number of valid drm schedulers in adev->gpu_sched[hw_ip][hw_prio].sched
will vary depending on the priority and hw_ip. We need to pass that
scheduler array and num_scheds to drm_sched_entity_init(). I think it is
common to keep an array and an integer count together when the number of
valid items in the array is dynamic.
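
As a rough illustration only (not the exact call site from the patch, and
assuming the drm_sched_entity_init() variant that takes a sched_list plus a
count; entity, priority and ctx->guilty are placeholder names here), this is
how the array and its count end up being consumed together:

    /* Sketch only: the scheduler array and its valid-entry count are
     * handed over as a pair.  Placeholder names, error handling trimmed.
     */
    struct drm_gpu_scheduler **scheds;
    unsigned int num_scheds;
    int r;

    scheds = adev->gpu_sched[hw_ip][hw_prio].sched;
    num_scheds = adev->gpu_sched[hw_ip][hw_prio].num_scheds;

    r = drm_sched_entity_init(&entity->entity, priority,
                              scheds, num_scheds, &ctx->guilty);
    if (r)
        return r;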
>
> Perhaps, we don't need "num_scheds" and you can just
> check if the index is NULL and assign PRIO_DEFAULT.
>
>>> @@ -258,6 +272,12 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>> 	ring->priority = DRM_SCHED_PRIORITY_NORMAL;
>>> 	mutex_init(&ring->priority_mutex);
>>>
>>> +	if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
>>> +		hw_ip = amdgpu_ring_type_to_drm_hw_ip[ring->funcs->type];
>>> +		num_sched = &adev->gpu_sched[hw_ip][hw_prio].num_scheds;
>>> +		adev->gpu_sched[hw_ip][hw_prio].sched[(*num_sched)++] = &ring->sched;
>>> +	}
>>> This seems unnecessarily complicated. Perhaps we can remove
>>> "num_scheds", as recommended above, and keep a running pointer
>>> while initialization is being done...?
>>
>> What do you mean by a running pointer?
> A "running pointer" is a local pointer you're using temporarily
> to traverse memory. If you remove the "num_scheds", as noted in my
> previous comment, then you can use a running pointer to contain
> the next entry you'd initialize.
Okay, I understand now. As I said, we need to know the number of valid
schedulers in a sched array so that we can pass that information to
drm_sched_entity_init().
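
To make the comparison concrete, here is a rough, untested sketch of both
variants (hw_ip and hw_prio assumed to be computed as in the patch, and
unused slots assumed to stay NULL because adev is zero-initialized):

    /* Variant A (the patch): append at ring init and keep the count. */
    struct amdgpu_sched *s = &adev->gpu_sched[hw_ip][hw_prio];

    s->sched[s->num_scheds++] = &ring->sched;

    /* Variant B (running pointer, no count kept): entries are appended at
     * a moving cursor during init, but since drm_sched_entity_init() still
     * wants a count, the consumer has to walk the array and count the
     * non-NULL entries later anyway.
     */
    unsigned int n = 0;

    while (n < HWIP_MAX_INSTANCE && adev->gpu_sched[hw_ip][hw_prio].sched[n])
        n++;

Either way the count has to exist by the time the entity is initialized,
which is why keeping num_scheds next to the array looks like the simpler
option to me.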
Regards,
Nirmoy
>
> Regards,
> Luben