[PATCH] drm/amdkfd: fix missing L2 cache info in topology
Eric Huang
jinhuieric.huang at amd.com
Fri Feb 7 15:29:17 UTC 2025
On 2025-02-06 22:41, Lazar, Lijo wrote:
>
> On 2/6/2025 10:18 PM, Eric Huang wrote:
>> I understand your concern. KFD currently reports only one L2 instance,
>> not every L2 instance. If customers want more detail on all available
>> L2 instances, we can probably change the logic in this function, but
>> that is not related to my change. My change is based on the current
>> kfd logic and fixes the missing L2 issue.
>>
> Even for that case, do you need to loop through all xccs? The expectation
> is that there is some set of active CUs in any XCC (in general, an XCC
> without an active CU is not expected to be part of a KFD node).
Good point. I will send out another patch accordingly.
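
For illustration only, a rough sketch of what such a follow-up might look
like, assuming the search can be restricted to the first XCC in the node's
xcc_mask (variable names follow the existing fill_in_l2_l3_pcache() code;
this is a sketch, not the actual follow-up patch):

	/* Search only the first XCC of the node for an active CU, since
	 * every XCC of a KFD node is expected to have active CUs.
	 */
	start = ffs(knode->xcc_mask) - 1;
	for (i = 0; i < gfx_info->max_shader_engines && !found; i++) {
		for (j = 0; j < gfx_info->max_sh_per_se && !found; j++) {
			if (cu_info->bitmap[start][i % 4][j % 4]) {
				cu_sibling_map_mask = cu_info->bitmap[start][i % 4][j % 4];
				found = true;
			}
		}
	}
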
Thanks,
Eric
>
> Thanks,
> Lijo
>
>> Thanks,
>> Eric
>>
>> On 2025-02-06 11:37, Lazar, Lijo wrote:
>>> [Public]
>>>
>>>
>>> Yes, that is the problem. If a node has 2 XCCs, it should report the
>>> L2 of each separately, with the number of CUs sharing each L2.
>>>
>>> Here, the code appears to loop through all XCCs of a node and pick the
>>> first non-zero bitmap overall, rather than the first non-zero bitmap
>>> per XCC. That makes a difference in the number of L2 instances reported.
>>>
>>>
>>> Thanks,
>>> Lijo
>>> ------------------------------------------------------------------------
>>> *From:* Huang, JinHuiEric <JinHuiEric.Huang at amd.com>
>>> *Sent:* Thursday, February 6, 2025 10:00:38 PM
>>> *To:* Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
>>> <amd-gfx at lists.freedesktop.org>
>>> *Subject:* Re: [PATCH] drm/amdkfd: fix missing L2 cache info in topology
>>>
>>>
>>> On 2025-02-06 10:14, Lazar, Lijo wrote:
>>>> On 1/29/2025 8:50 PM, Eric Huang wrote:
>>>>> On some ASICs the L2 cache info may be missing from the kfd
>>>>> topology, because the first bitmap may be empty, i.e. the first
>>>>> CU may be inactive. Searching for the first active CU instead
>>>>> solves the issue.
>>>>>
>>>>> Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 18 ++++++++++++++++--
>>>>> 1 file changed, 16 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>>> index 4936697e6fc2..73d95041a388 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>>> @@ -1665,17 +1665,31 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
>>>>> int cache_type, unsigned int cu_processor_id,
>>>>> struct kfd_node *knode)
>>>>> {
>>>>> - unsigned int cu_sibling_map_mask;
>>>>> + unsigned int cu_sibling_map_mask = 0;
>>>>> int first_active_cu;
>>>>> int i, j, k, xcc, start, end;
>>>>> int num_xcc = NUM_XCC(knode->xcc_mask);
>>>>> struct kfd_cache_properties *pcache = NULL;
>>>>> enum amdgpu_memory_partition mode;
>>>>> struct amdgpu_device *adev = knode->adev;
>>>>> + bool found = false;
>>>>>
>>>>> start = ffs(knode->xcc_mask) - 1;
>>>>> end = start + num_xcc;
>>>>> - cu_sibling_map_mask = cu_info->bitmap[start][0][0];
>>>>> +
>>>>> + /* Find the bitmap of the first active cu */
>>>>> + for (xcc = start; xcc < end && !found; xcc++) {
>>>> It seems there is an assumption made here that a CU in one XCC could
>>>> share this cache with a CU in another XCC. This is not true for GFX 9.4.3
>>>> SOCs. In those, a CU in XCC0 doesn't share L2 with a CU in XCC1.
>>> In the KFD topology we only report the L2 cache info of the first active
>>> CU found in an XCC, which could be XCC0 or XCC1. It is generic L2 info
>>> for a given XCP/kfd node, not specific to every XCC, so it doesn't mean
>>> the L2 cache found in XCC0 can be shared with XCC1; it only means there
>>> is an L2 cache in this kfd node.
>>>
>>> Regards,
>>> Eric
>>>> Thanks,
>>>> Lijo
>>>>
>>>>> + for (i = 0; i < gfx_info->max_shader_engines && !found; i++) {
>>>>> + for (j = 0; j < gfx_info->max_sh_per_se && !found; j++) {
>>>>> + if (cu_info->bitmap[xcc][i % 4][j % 4]) {
>>>>> + cu_sibling_map_mask =
>>>>> + cu_info->bitmap[xcc][i % 4][j % 4];
>>>>> + found = true;
>>>>> + }
>>>>> + }
>>>>> + }
>>>>> + }
>>>>> +
>>>>> cu_sibling_map_mask &=
>>>>> ((1 << pcache_info[cache_type].num_cu_shared) - 1);
>>>>> first_active_cu = ffs(cu_sibling_map_mask);
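
For reference, a small worked example of the masking and ffs() step above,
using hypothetical values (CUs 4..15 active in the found bitmap, 16 CUs
sharing the cache):

	cu_sibling_map_mask = 0xfff0;
	cu_sibling_map_mask &= (1 << 16) - 1;       /* 0xfff0 & 0xffff = 0xfff0 */
	first_active_cu = ffs(cu_sibling_map_mask); /* ffs() is 1-based, returns 5 */

With an empty first bitmap, as in the original code, cu_sibling_map_mask is 0
and ffs() returns 0, which is presumably why the L2 entry went missing;
searching for the first active CU yields a non-zero mask whenever the node has
any active CU.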