[PATCH v3 08/11] drm/xe/xe2: Add workaround 18034896535

Gustavo Sousa gustavo.sousa at intel.com
Mon Apr 29 13:51:36 UTC 2024


Quoting Lucas De Marchi (2024-04-26 18:02:49-03:00)
>On Mon, Apr 08, 2024 at 10:35:42PM GMT, Balasubramani Vivekanandan wrote:
>>From: Bommu Krishnaiah <krishnaiah.bommu at intel.com>
>>
>>Add 18034896535 as driver permanent workaround.
>>
>>v2: 18034896535 and 16021540221 are two independent workarounds
>>that just happen to have the same implementation, hence keeping it.
>
>that doesn't work though. Now we get an error in LNL:
>
>        xe 0000:00:02.0: [drm] *ERROR* GT0: [GT OTHER] discarding save-restore reg e48c (clear: 00000200, set: 00000200, masked: yes, mcr: yes): ret=-22
>
>That's only visible on stepping A* in LNL. I'm seeing it on a LNL I
>have:
>
>        xe 0000:00:02.0: [drm:xe_pci_probe [xe]] XE_LUNARLAKE  64a0:0001 dgfx:0 gfx:Xe2_LPG / Xe2_HPG (20.04) media:Xe2_LPM / Xe2_HPM (20.00) display:no dma_m_s:46 tc:1 gscfi:0 cscfi:0
>        xe 0000:00:02.0: [drm:xe_pci_probe [xe]] Stepping = (G:A1, M:A0, D:**, B:**)
>
>It looks like in CI we have B0, so we don't see the issue.
>
>        xe 0000:00:02.0: [drm:xe_pci_probe [xe]] XE_LUNARLAKE  64a0:0004 dgfx:0 gfx:Xe2_LPG / Xe2_HPG (20.04) media:Xe2_LPM / Xe2_HPM (20.00) display:yes dma_m_s:46 tc:1 gscfi:0
>        xe 0000:00:02.0: [drm:xe_pci_probe [xe]] Stepping = (G:B0, M:B0, D:**, B:**)
>
>We may need to revisit 2 things in the RTP matching:
>
>        1) The stepping is non-inclusive, while the version is
>           inclusive.... this is confusing

I believe that behavior comes from the way we did range checks in i915
and is intentional.

For temporary workarounds, we usually have the stepping that fixes the
related hardware bug, so the workaround usually applies to steps before
that one. An exclusive ending value is useful here, since we can just
use the stepping with the fix and "forget" about it. If the end of the
range was inclusive, we would need to keep track of latest applicable
stepping.

Now, even if the workaround is temporary, we usually do have a defined
range of IP releases for which it is applicable and it makes sense to
have a closed end for version ranges.

--
Gustavo Sousa

>        2) Maybe we should just log rather than giving an error if the
>           value is exactly the same?
>
>           Currently we do:
>
>           /*
>            * Don't allow overwriting values: clr_bits/set_bits should be disjoint
>            * when operating in the same register
>            */
>           if (e1->clr_bits & e2->clr_bits || e1->set_bits & e2->set_bits ||
>               e1->clr_bits & e2->set_bits || e1->set_bits & e2->clr_bits)
>                   return false;
>
>        I remember this was useful to find duplicate workarounds though
>        (or some that we were implementing both as WA and tuning)
>
>For now I'm thinking to just merge the entries and add a comment.
>
>$ git diff 
>diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c
>index 9d9b7fa7a8f0..db7c7c7875c5 100644
>--- a/drivers/gpu/drm/xe/xe_wa.c
>+++ b/drivers/gpu/drm/xe/xe_wa.c
>@@ -449,12 +449,7 @@ static const struct xe_rtp_entry_sr engine_was[] = {
>           XE_RTP_RULES(GRAPHICS_VERSION(2004), FUNC(xe_rtp_match_first_render_or_compute)),
>           XE_RTP_ACTIONS(SET(ROW_CHICKEN3, XE2_EUPEND_CHK_FLUSH_DIS))
>         },
>-       { XE_RTP_NAME("16021540221"),
>-         XE_RTP_RULES(GRAPHICS_VERSION(2004), GRAPHICS_STEP(A0, B0),
>-                      FUNC(xe_rtp_match_first_render_or_compute)),
>-         XE_RTP_ACTIONS(SET(ROW_CHICKEN4, DISABLE_TDL_PUSH))
>-       },
>-       { XE_RTP_NAME("18034896535"),
>+       { XE_RTP_NAME("18034896535, 16021540221"), /* 16021540221: GRAPHICS_STEP(A0, B0) */
>           XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2004),
>                        FUNC(xe_rtp_match_first_render_or_compute)),
>           XE_RTP_ACTIONS(SET(ROW_CHICKEN4, DISABLE_TDL_PUSH))
>
>
>Lucas De Marchi
>
>>
>>Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu at intel.com>
>>Reviewed-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
>>Cc: Tejas Upadhyay <tejas.upadhyay at intel.com>
>>Cc: Matt Roper <matthew.d.roper at intel.com>
>>Reviewed-by: Matt Roper <matthew.d.roper at intel.com>
>>---
>> drivers/gpu/drm/xe/xe_wa.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c
>>index c904e55ced9c..43fac92e5d20 100644
>>--- a/drivers/gpu/drm/xe/xe_wa.c
>>+++ b/drivers/gpu/drm/xe/xe_wa.c
>>@@ -428,6 +428,11 @@ static const struct xe_rtp_entry_sr engine_was[] = {
>>                        FUNC(xe_rtp_match_first_render_or_compute)),
>>           XE_RTP_ACTIONS(SET(ROW_CHICKEN4, DISABLE_TDL_PUSH))
>>         },
>>+        { XE_RTP_NAME("18034896535"),
>>+          XE_RTP_RULES(GRAPHICS_VERSION(2004),
>>+                       FUNC(xe_rtp_match_first_render_or_compute)),
>>+          XE_RTP_ACTIONS(SET(ROW_CHICKEN4, DISABLE_TDL_PUSH))
>>+        },
>>         { XE_RTP_NAME("14019322943"),
>>           XE_RTP_RULES(GRAPHICS_VERSION(2004), GRAPHICS_STEP(A0, B0),
>>                        FUNC(xe_rtp_match_first_render_or_compute)),
>>-- 
>>2.25.1
>>


More information about the Intel-xe mailing list