[PATCH v3 1/1] drm/amdgpu: rework sched_list generation
Nirmoy
nirmodas at amd.com
Wed Apr 1 15:35:34 UTC 2020
On 4/1/20 5:02 PM, Luben Tuikov wrote:
> On 2020-03-31 08:46, Nirmoy wrote:
>> On 3/31/20 3:01 AM, Luben Tuikov wrote:
>>> This patch seems to be using DOS line-endings.
>>
>> Strange, I don't see that in my local patch file.
>>
> Not sure why "git am" complained about DOS endings
> the first time I downloaded it. Second time was fine.
>
> [snip]
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 29f0a410091b..27abbdc603dd 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -721,6 +721,11 @@ struct amd_powerplay {
>>>>  	const struct amd_pm_funcs *pp_funcs;
>>>>  };
>>>>
>>>> +struct amdgpu_sched {
>>>> +	uint32_t			num_scheds;
>>>> +	struct drm_gpu_scheduler	*sched[HWIP_MAX_INSTANCE];
>>>> +};
>>>> +
>>>>  #define AMDGPU_RESET_MAGIC_NUM 64
>>>>  #define AMDGPU_MAX_DF_PERFMONS 4
>>>>  struct amdgpu_device {
>>>> @@ -858,6 +863,8 @@ struct amdgpu_device {
>>>>  	struct amdgpu_ring		*rings[AMDGPU_MAX_RINGS];
>>>>  	bool				ib_pool_ready;
>>>>  	struct amdgpu_sa_manager	ring_tmp_bo[AMDGPU_IB_POOL_MAX];
>>>> +	/* drm scheduler list */
>>>> +	struct amdgpu_sched		gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
>>> That's a 2-dimensional array of "struct amdgpu_sched".
>>> I think that the comment should be removed, or at least
>>> not say "drm scheduler list". (I can see the structure
>>> definition above.)
>>
>> Yes, I should remove it.
>>
>>
>>> If this is the path you want to go, consider removing
>>> "num_scheds" and creating a three dimensional array,
>>> which would really essentialize the direction you want
>>> to go:
>>>
>>> struct drm_gpu_scheduler *gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX][HWIP_MAX_INSTANCE];
>>>
>>> Now that this architecture is stripped down to its essentials,
>>> perhaps we can see some optimizations...?
>>
>> If you mean whether we should see any performance improvement, then IMO
>> we may not see much difference, as we do pretty much the same number of
>> memory accesses, plus we now have the extra cost of array_index_nospec().
>>
>> Also, this is not a hot code path. We only do one
>> amdgpu_ctx_init_entity() per HW IP per context.
> No, this has nothing to do with "performance".
> It's all about architecture and design.
>
> You seem to have array-array-struct with array and int,
> and it seems much cleaner to just have array-array-array.
> I think you don't need to break the chain with
> a struct of an int and an array, just as I described
> in my comment below, which you snipped without addressing.
>
> If you're not going to address a comment, don't delete it, leave it
> for others to see that it hasn't been addressed. Only
> snip previously addressed and resolved comments and previously
> seen patch info, like diffstat/etc.
I wanted to understand "running pointer" before I could comment on it.
>
>>> Also consider that since you're creating an array of pointers,
>>> you don't necessarily need to know their count. Your hot-path
>>> algorithms should not need to know it. If you need to print
>>> their count, say in sysfs, then you can count them up on
>>> behalf of the user-space process cat-ing the sysfs entry.
>>>
> [snip]
>
>>>> +
>>>> +	/* set to default prio if sched_list is NULL */
>>>> +	if (!adev->gpu_sched[hw_ip][hw_prio].num_scheds)
>>>> +		hw_prio = AMDGPU_RING_PRIO_DEFAULT;
>>> That comment is a bit confusing--it talks about a list
>>> not being NULL, while the conditional implicitly checks
>>> against 0.
>>
>> Yes, this is wrong, will remove it.
>>
>> <snip>
>>
> I wish you hadn't snipped my comment here, but had addressed it
> one way or the other. It is:
>
> I'd much rather that integer comparison be performed against
> integers, as opposed to using logical NOT operator (which is
> implicit in comparing against 0), i.e.,
>
> if (adev->gpu_sched[hw_ip][hw_prio].num_scheds == 0)
>
> Also, architecturally, there seems to be informational
> redundancy, in keeping an integer count and list of
> objects at the same time, as the above if-conditional
> exposes: the comment talks about a list and NULL but
> the if-conditional implicitly checks for 0.
The number of valid drm schedulers in adev->gpu_sched[hw_ip][hw_prio].sched
will vary depending on the priority and hw_ip. We need to pass that
scheduler array and num_scheds to drm_sched_entity_init(). I think it is
common to keep an array and an integer count together when the number of
valid items in the array is dynamic.
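
As a rough illustration only (not the exact call site from the patch, and
assuming the drm_sched_entity_init() variant that takes a sched_list plus a
count; entity, priority and ctx->guilty are placeholder names here), this is
how the array and its count end up being consumed together:

    /* Sketch only: the scheduler array and its valid-entry count are
     * handed over as a pair.  Placeholder names, error handling trimmed.
     */
    struct drm_gpu_scheduler **scheds;
    unsigned int num_scheds;
    int r;

    scheds = adev->gpu_sched[hw_ip][hw_prio].sched;
    num_scheds = adev->gpu_sched[hw_ip][hw_prio].num_scheds;

    r = drm_sched_entity_init(&entity->entity, priority,
                              scheds, num_scheds, &ctx->guilty);
    if (r)
        return r;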
>
> Perhaps, we don't need "num_scheds" and you can just
> check if the index is NULL and assign PRIO_DEFAULT.
>
>>> @@ -258,6 +272,12 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>> 	ring->priority = DRM_SCHED_PRIORITY_NORMAL;
>>> 	mutex_init(&ring->priority_mutex);
>>>
>>> +	if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
>>> +		hw_ip = amdgpu_ring_type_to_drm_hw_ip[ring->funcs->type];
>>> +		num_sched = &adev->gpu_sched[hw_ip][hw_prio].num_scheds;
>>> +		adev->gpu_sched[hw_ip][hw_prio].sched[(*num_sched)++] = &ring->sched;
>>> +	}
>>> This seems unnecessarily complicated. Perhaps we can remove
>>> "num_scheds", as recommended above, and keep a running pointer
>>> while initialization is being done...?
>>
>> What do you mean by a running pointer?
> A "running pointer" is a local pointer you're using temporarily
> to traverse memory. If you remove the "num_scheds", as noted in my
> previous comment, then you can use a running pointer to contain
> the next entry you'd initialize.
Okay, I understand now. As I said, we need to know the number of valid
schedulers in a sched array so that we can pass that information to
drm_sched_entity_init().
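
To make the comparison concrete, here is a rough, untested sketch of both
variants (hw_ip and hw_prio assumed to be computed as in the patch, and
unused slots assumed to stay NULL because adev is zero-initialized):

    /* Variant A (the patch): append at ring init and keep the count. */
    struct amdgpu_sched *s = &adev->gpu_sched[hw_ip][hw_prio];

    s->sched[s->num_scheds++] = &ring->sched;

    /* Variant B (running pointer, no count kept): entries are appended at
     * a moving cursor during init, but since drm_sched_entity_init() still
     * wants a count, the consumer has to walk the array and count the
     * non-NULL entries later anyway.
     */
    unsigned int n = 0;

    while (n < HWIP_MAX_INSTANCE && adev->gpu_sched[hw_ip][hw_prio].sched[n])
        n++;

Either way the count has to exist by the time the entity is initialized,
which is why keeping num_scheds next to the array looks like the simpler
option to me.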
Regards,
Nirmoy
>
> Regards,
> Luben