[PATCH] drm/amdkfd: fix missing L2 cache info in topology
Lazar, Lijo
lijo.lazar at amd.com
Fri Feb 7 03:41:45 UTC 2025
On 2/6/2025 10:18 PM, Eric Huang wrote:
> I understand your concern. KFD currently reports only one L2 instance,
> not every L2 instance. If customers want more detail on all of the
> available L2 instances, we can probably change the logic in this
> function, but that is not related to my change. My change is based on
> the current kfd logic and fixes the missing L2 issue.
>
Even for that case, do you need to loop through all XCCs? The expectation
is that there is some set of active CUs in any XCC (in general, an XCC
without an active CU is not expected to be part of a KFD node).
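
Something along these lines (just an untested sketch reusing the fields
from your patch, not a tested implementation) would pick the first
active CU per XCC instead and fill one L2 instance for each:

	for (xcc = start; xcc < end; xcc++) {
		unsigned int mask = 0;

		/* first non-zero bitmap word within this XCC only */
		for (i = 0; i < gfx_info->max_shader_engines && !mask; i++)
			for (j = 0; j < gfx_info->max_sh_per_se && !mask; j++)
				mask = cu_info->bitmap[xcc][i % 4][j % 4];

		if (mask) {
			/* fill one L2 cache instance for this XCC here */
		}
	}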
Thanks,
Lijo
> Thanks,
> Eric
>
> On 2025-02-06 11:37, Lazar, Lijo wrote:
>>
>> [Public]
>>
>>
>> Yes, the problem is exactly that. If a node has 2 XCCs, it should
>> report the L2 of each XCC separately, along with the number of CUs
>> sharing each L2.
>>
>> As written, the patch appears to loop through all XCCs of a node and
>> take the first non-zero bitmap overall, rather than the first non-zero
>> bitmap per XCC. That makes a difference in the number of L2 instances
>> reported.
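>>
>> For example (hypothetical numbers): on a node with 2 XCCs that both
>> have active CUs, the patched loop stops at the first non-zero bitmap
>> word it finds and fills a single L2 instance for the whole node, while
>> a per-XCC scan would fill one L2 instance per XCC, i.e. two instances
>> here.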
>>
>>
>> Thanks,
>> Lijo
>> ------------------------------------------------------------------------
>> *From:* Huang, JinHuiEric <JinHuiEric.Huang at amd.com>
>> *Sent:* Thursday, February 6, 2025 10:00:38 PM
>> *To:* Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
>> <amd-gfx at lists.freedesktop.org>
>> *Subject:* Re: [PATCH] drm/amdkfd: fix missing L2 cache info in topology
>>
>>
>> On 2025-02-06 10:14, Lazar, Lijo wrote:
>> >
>> > On 1/29/2025 8:50 PM, Eric Huang wrote:
>> >> On some ASICs the L2 cache info may be missing from the kfd
>> >> topology, because the first bitmap may be empty, meaning the
>> >> first CU is inactive. Finding the first active CU instead
>> >> solves the issue.
>> >>
>> >> Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
>> >> ---
>> >> drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 18 ++++++++++++++++--
>> >> 1 file changed, 16 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> >> index 4936697e6fc2..73d95041a388 100644
>> >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> >> @@ -1665,17 +1665,31 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
>> >>  				int cache_type, unsigned int cu_processor_id,
>> >>  				struct kfd_node *knode)
>> >>  {
>> >> -	unsigned int cu_sibling_map_mask;
>> >> +	unsigned int cu_sibling_map_mask = 0;
>> >>  	int first_active_cu;
>> >>  	int i, j, k, xcc, start, end;
>> >>  	int num_xcc = NUM_XCC(knode->xcc_mask);
>> >>  	struct kfd_cache_properties *pcache = NULL;
>> >>  	enum amdgpu_memory_partition mode;
>> >>  	struct amdgpu_device *adev = knode->adev;
>> >> +	bool found = false;
>> >>
>> >>  	start = ffs(knode->xcc_mask) - 1;
>> >>  	end = start + num_xcc;
>> >> -	cu_sibling_map_mask = cu_info->bitmap[start][0][0];
>> >> +
>> >> +	/* To find the bitmap in the first active cu */
>> >> +	for (xcc = start; xcc < end && !found; xcc++) {
>> > It seems there is an assumption made here that a CU in one XCC could
>> > share this cache with a CU in another XCC. That is not true for GFX
>> > 9.4.3 SOCs; in those, a CU in XCC0 doesn't share L2 with a CU in XCC1.
>> In the KFD topology we only report the L2 cache info of the first
>> active CU in an XCC, which could be XCC0 or XCC1. It is generic L2 info
>> for the given XCP/kfd node, not specific to every XCC, so it doesn't
>> mean the L2 cache found in XCC0 can be shared with XCC1; it only means
>> there is an L2 cache in this kfd node.
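>>
>> As a standalone illustration of how the discovered bitmap word feeds
>> the existing mask/ffs logic below (a userspace sketch with made-up
>> values; num_cu_shared stands in for
>> pcache_info[cache_type].num_cu_shared):
>>
>> #include <stdio.h>
>> #include <strings.h>	/* ffs() */
>>
>> int main(void)
>> {
>> 	/* hypothetical bitmap word: CUs 4-7 active */
>> 	unsigned int cu_sibling_map_mask = 0x000000f0;
>> 	unsigned int num_cu_shared = 8;	/* hypothetical sharing width */
>>
>> 	/* keep only the CUs that can share this cache */
>> 	cu_sibling_map_mask &= (1u << num_cu_shared) - 1;
>>
>> 	/* ffs() is 1-based and returns 0 when no CU is active */
>> 	printf("first_active_cu = %d\n", ffs(cu_sibling_map_mask));
>> 	return 0;
>> }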
>>
>> Regards,
>> Eric
>> >
>> > Thanks,
>> > Lijo
>> >
>> >> +		for (i = 0; i < gfx_info->max_shader_engines && !found; i++) {
>> >> +			for (j = 0; j < gfx_info->max_sh_per_se && !found; j++) {
>> >> +				if (cu_info->bitmap[xcc][i % 4][j % 4]) {
>> >> +					cu_sibling_map_mask =
>> >> +						cu_info->bitmap[xcc][i % 4][j % 4];
>> >> +					found = true;
>> >> +				}
>> >> +			}
>> >> +		}
>> >> +	}
>> >> +
>> >>  	cu_sibling_map_mask &=
>> >>  		((1 << pcache_info[cache_type].num_cu_shared) - 1);
>> >>  	first_active_cu = ffs(cu_sibling_map_mask);
>>
>