[Intel-gfx] [PATCH 1/4] drm/i915/gt: Remove platform comments from workarounds
Lucas De Marchi
lucas.demarchi at intel.com
Fri Dec 23 18:28:07 UTC 2022
On Fri, Dec 23, 2022 at 09:02:35AM +0000, Tvrtko Ursulin wrote:
>
>On 22/12/2022 15:55, Lucas De Marchi wrote:
>>On Thu, Dec 22, 2022 at 10:27:00AM +0000, Tvrtko Ursulin wrote:
>>>
>>>On 22/12/2022 08:25, Lucas De Marchi wrote:
>>>>The comments are redundant to the checks being done to apply the
>>>>workarounds and very often get outdated as workarounds need to be
>>>>extended to new platforms or steppings. Remove them altogether with
>>>>the following matches (platforms extracted from intel_workarounds.c):
>>>>
>>>> find drivers/gpu/drm/i915/gt/ -name '*.c' | xargs sed -i -E \
>>>>'s/(Wa.*):(bdw|chv|bxt|glk|skl|kbl|cfl|cfl|whl|cml|aml|chv|cl|bw|ctg|elk|ilk|snb|dg|pvc|g4x|ilk|gen|glk|kbl|cml|glk|kbl|cml|hsw|icl|ehl|ivb|hsw|ivb|vlv|kbl|pvc|rkl|dg|adl|skl|skl|bxt|blk|cfl|cnl|glk|snb|tgl|vlv|xehpsdv).*/\1/'
>>>> find drivers/gpu/drm/i915/gt/ -name '*.c' | xargs sed -i -E \
>>>>'s/(Wa.*):(bdw|chv|bxt|glk|skl|kbl|cfl|cfl|whl|cml|aml|chv|cl|bw|ctg|elk|ilk|snb|dg|pvc|g4x|ilk|gen|glk|kbl|cml|glk|kbl|cml|hsw|icl|ehl|ivb|hsw|ivb|vlv|kbl|pvc|rkl|dg|adl|skl|skl|bxt|blk|cfl|cnl|glk|snb|tgl|vlv|xehpsdv).*\*\//\1
>>>>
>>>>Same things was executed in the gem directory, omitted here for brevity.
>>>
>>>>There were a few false positives that included the workaround
>>>>description. Those were manually patched.
>>>
>>>sed -E 's/(Wa[a-zA-Z0-9_]+)[:,]([a-zA-Z0-9,-_\+\[]{2,})/\1/'
>>
>>then there are false negatives. We have Was in the form
>>"Wa_xxx:tgl,dg2, mtl". False positives we can fixup, false negatives
>>we simply don't see. After running that in gt/:
>>
>>$ git grep ": mtl" -- drivers/gpu/drm/i915/
>>drivers/gpu/drm/i915/gt/intel_gt_pm.c: /* Wa_14017073508: mtl */
>>drivers/gpu/drm/i915/gt/intel_gt_pm.c: /* Wa_14017073508: mtl */
>>drivers/gpu/drm/i915/gt/intel_gt_pm.c: /* Wa_14017073508: mtl */
>>drivers/gpu/drm/i915/gt/intel_gt_pm.c: /* Wa_14017073508: mtl */
>>drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c: * Wa_14017073508: mtl
>>drivers/gpu/drm/i915/i915_reg.h:/* Wa_14017210380: mtl */
>>
>>I was going with the platform names to avoid the false
>>negatives and because I was entertaining the idea of only doing this for
>>latest platforms where we do have the "Wa_[[:number:]]" form
>>
>>>
>>>Maybe..
>>>
>>>Matt recently said he has this worked planned, but more
>>>importantly - I gather then that the WA lookup tool definitely
>>>does not output these strings?
>>
>>Whatever it does it's true only at the time it's called. It simply
>>tells what
>>are the platforms and steppings the Wa applies to. We can change the
>>output to whatever we want, but that is not the point.
>>Those comments get stale and bring no real value as they match 1:1
>>what the code is supposed to be doing. Several times a patch has to
>>update just that comment to "extend a workaround" to a next platform.
>>This is not always done, so we get a comment that doesn't match what is
>>supposed to be there.
>
>Tl;dr; version - lets park this until January and discuss once
>everyone is back.
I'll leave my comment here since I will be out until mid January.
>
>Longer version. I've been trying to get us talking about this a couple
>times before and I'd really like to close with an explicit consensus,
>discussion points addressed instead of skipped and just moving ahead
>with patches.
>
>References:
> 3fcf71b9-337f-6186-7b00-27cbfd116743 at linux.intel.com
> Y5j0b/bykbitCa4Q at mdroper-desk1.amr.corp.intel.com
>
>So point I wanted to discuss is whether these comments are truly
>useless or maybe they can help during review. If the tool can actually
>output them then I am leaning towards that they can be.
I consider "can the tool output xyz?" asking the wrong question.
"The tool", which is our own small python script querying a database can
output anything like that if we want to. The database has information of
what are the platforms/steppings for each the WA is known to be applied
*today*. And that can change and do change often, particularly for early
steppings and recent platforms.
>Thought is, when a patch comes for review adding a new platform,
>stepping, whatever, to an existing if condition, if it contains the
>comments reviewer can more easily spot a hyphotetical logic inversion
>error or similar. It is also trivial to check that both condition and
>comment have been updated. (So lets not be rash and remove something
>maybe useful just because it can go stale *only if* reviewers are not
>giving sufficient attention that changes are made in tandem.)
I can argue to the other side too. We don't have comments in the kernel
like
/* Add 1 to i */
i += 1;
This is exactly what these comments are doing. And they are misleading
and may introduce bugs rather than helping reviewing:
Wa_12345:tgl[a0,c0)
if (IS_TGL_GRAPHICS_STEP(STEP_A0, STEP_B0)
One might read the comment, skipping over the condition and thinking
"ok, we already extended this WA to B* steppings, which doesn't match
the code.
>From a slightly different angle - do we expect anyone reviewing
>workaround patches will cross-check against the tool? Would it be
>simpler and more efficient that they could just cross-check against
>the comment output from the tool and put into the patch by the author?
see above. Someone cross-checking the comment is cross-checking the
wrong thing. As I said, it happens more on early enabling of a platform.
>And point here to stress out is that accidental logic errors (missed
>workarounds) can be super expensive to debug in the field. Sometimes
>it can literally take _months_ for sporadic and hard to reproduce
>issues to get debugged, handed over between the teams, etc. So any way
>in which we can influence the likelyhood of that happening is
>something to weigh carefully.
yes, that's why I want to remove the comments: from my experience they
are more a source of bugs rather than helping.
>Secondary but also important - if i915 is end of line then an extra
>why we want to rip out this for ancient platforms. Is the cost/benefit
>positive there?
yep, here I agree and was my argument about using the platform names
rather than a more "catch all" regex. I think doing this only for tgl+
platforms or even dg2+ would be ok.
>As a side note, and going back to the question of what the tool can
>output. Long time ago I had an idea where we could improve all this by
>making it completely data-driven. Have the WA database inspecting tool
>output a table which could be directly pasted into code and
>interpreted by i915.
>
>For reference look at intel_workarounds_table.h in
>https://patchwork.freedesktop.org/patch/399377/?series=83580&rev=3 and
>see what you thing. That was just a sketch of the idea, not complete,
>and yes, i915 end of life point makes it moot.
now that xe is announced I can talk about this part... this was more
or less what I implemented in xe: it's a table with
"register + condition + action". There are the most common condition
checks builtin + a function hook for the more advanced ones. During
binding the driver walks the table and coalesces the entries creating
a per-register value that can be used at the proper times, depending if
they are gt, engine, context workarounds.
Lucas De Marchi
>
>Regards,
>
>Tvrtko
More information about the Intel-gfx
mailing list