[PATCH v7 4/4] drm/xe/vf: Fixup CTB send buffer messages after migration
Lis, Tomasz
tomasz.lis at intel.com
Fri Apr 11 14:34:01 UTC 2025
On 10.04.2025 20:24, Michal Wajdeczko wrote:
> On 09.04.2025 23:09, Lis, Tomasz wrote:
>> On 08.04.2025 16:23, Michal Wajdeczko wrote:
>>> On 03.04.2025 20:40, Tomasz Lis wrote:
> ...
>
>>>> + u32 msg[1];
>>> u32 msg;
>>> or
>>> u32 msg[GUC_HXG_MSG_MIN_LEN];
>> I want an array, so will go with 2nd. Though this looks like unjustified
>> obfuscation to me.
> using magic "1" is an obfuscation to me
It's not magic, the array has one entry because only one entry is used
later.
Naming it something which one could theorize is any value, obfuscates
the real value of "1".
It makes no sense to think of this value as some "min length".
So maybe I used a wrong term - this is not just obfuscation, this is
misinformation.
> ....
>
>>>> + struct guc_ctb *h2g,
>>> nit: this is redundant
>> oh I see. by "this" you mean a spaced out separate line.
> no, I just repeat that all your fixup is about H2G and since we pass ct
> each helper function can easily find pointer to h2g
>
> passing struct guc_ctb * as separate parameter may suggest that
> something other than h2g could be handled (but we don't need this)
From the other comments, it would seem you think it is important for
the code to correspond to the idea, or mindset, behind it.
The idea behind this function is to fix a message in specific `guc_ctb`.
Why do we want to hide this idea by optimizing out a parameter in this
specific case, but we're adding incorrect parameters (meaning ones that
are not used directly, need to be converted to another object)
everywhere else? What is the difference?
> ...
>
>>>> + /* Read header */
>>> you should at least assert that avail > 0 before reading even single u32
>>>
>>> xe_gt_assert(gt, avail >= GUC_CTB_MSG_MIN_LEN);
>> This is a static function, and the caller is ensuring that. But sure, I
>> can add the assert.
>>> but since it is called in the loop, more appropriate would be:
>>>
>>> if (avail < GUC_CTB_MSG_MIN_LEN)
>>> goto broken;
>> I don't know what you mean by that. There are not that many
>> possibilities for an integer value greater than 0 but smaller than 1.
> caller just checks for avail > 0
> but we shall check for avail >= GUC_CTB_MSG_MIN_LEN
>
> and we should not make any assumptions about GUC_CTB_MSG_MIN_LEN value
In this round I finally agreed to the changes which removed any size
read (xe_map_memcpy_from) and replaced that with one dword read
(xe_map_rd_array_u32).
We are now making assumptions about GUC_CTB_MSG_MIN_LEN. The only
supported value is 1.
And even if it wasn't the case, the value is and always will be 1. We
are creating false narrative around the simple value of 1. It is just 1
and can't be anything else. Change to that would require a rewrite, but
it's not even worth considering as that will never happen.
"avail > 0" and "avail >= 1" are equivalent.
>> And by the changes I agreed to above and below, we've already ensured
>> that GUC_CTB_MSG_MIN_LEN can only be equal to 1.
> we are adding checks for 'avail'
> we are not making new assumptions about ABI values
>
> ...
>
>>> maybe it's time to introduce
>>>
>>> u32 move_head(u32 head, u32 step, u32 size)
>> We are overdefining a lot of things already. I don't think we should
>> press even harder in that direction.
> it's about avoiding duplicated code and wrap it into named helper
This is why we introduce helpers, true. But now we're starting to do
this for trivial stuff.
Every programmer understands these:
head = (head + GUC_CTB_MSG_MIN_LEN) % size;
*mhead = (head + msg_len_to_hxg_len(len)) % size;
This is trivial. The lines are self explanatory. Making a sub-function
to hide that isn't providing any benefit, It's harmful, as it requires
the reader to check the sub-function instead of seeing plainly and
directly what's being done. It also leads to more lines of code.
> ...
>
>>>> +
>>>> +/**
>>>> + * xe_guc_ct_fixup_messages_with_ggtt - Fixup any pending H2G CTB
>>>> messages
>>>> + * @ct: pointer to CT struct of the target GuC
>>>> + * @ggtt_shift: shift to be added to all GGTT addresses within the CTB
>>>> + *
>>>> + * Messages in guc-to-host CTB are owned by GuC and any fixups in them
>>>> + * are made by GuC. But content of the host-to-guc CTB is owned by the
>>>> + * KMD, so fixups to GGTT references in any pending messages need to be
>>>> + * applied here.
>>> s/guc-to-host/H2G
>>>
>>> like you have earlier and below
>> Earlier is a short description. Below I am using shortcut after defining
>> it in this line. This is how the long comments should look like,
>> abbreviations should be defined.
> but I guess you should be aware right now that your code is not the
> first that introduces the CTB feature, including G2H and H2G concept and
> naming,
ok, interesting argument. I didn't know we have the concept of "first"
introduction which defines abbreviations.
Where is it? I see two candidates:
1. ./drivers/gpu/drm/xe/xe_guc_ct_types.h: /** @ctbs.recv: GuC to
Host (G2H, receive) channel */
This one is a part of struct declaration, so it clearly isn't the first,
right?
2. /drivers/gpu/drm/xe/abi/guc_communication_mmio_abi.h: * Each MMIO
based message, both Host to GuC (H2G) and GuC to Host (G2H)
This one is in MMIO specific file. And we're discussing a comment to a
CTB-related function. So.. not this one?
Where is that "first" introduction which defined it? What rule do we
have here as a general concept, one expansion per driver is enough? Or
one expansion per file? One per feature?
Or.. is it all just arbitrary. We have no rules. In which case this
place is in no way worse than any other. Maybe every full description
should expand it.
> so really above kernel-doc is not a right place to suddenly use
> "guc-to-host" term (which is a bad notation anyway)
Right, in the two other places the notation is different.
To be honest I'm not really against using a shortcut here, I just do not
understand the argumentation in this review comment.
With the justification provided, I'd consider this as below the
threshold of things worth commenting in a review.
-Tomasz
>
More information about the Intel-xe
mailing list