[PATCH v3 1/4] drm/xe/bmg: Update Wa_14022085890

Lucas De Marchi lucas.demarchi at intel.com
Thu Jun 12 06:49:18 UTC 2025


On Wed, Jun 11, 2025 at 11:31:35PM +0000, Stuart Summers wrote:
>On Wed, 2025-06-11 at 18:02 -0500, Lucas De Marchi wrote:
>> On Fri, Jun 06, 2025 at 07:09:39PM -0500, Lucas De Marchi wrote:
>> > On Wed, Jun 04, 2025 at 12:06:52AM +0000, Stuart Summers wrote:
>> > > On Tue, 2025-06-03 at 13:31 -0700, Belgaumkar, Vinay wrote:
>> > > >
>> > > > On 6/3/2025 12:31 PM, Summers, Stuart wrote:
>> > > > > On Tue, 2025-06-03 at 11:41 -0700, Belgaumkar, Vinay wrote:
>> > > > > > On 6/3/2025 10:36 AM, Summers, Stuart wrote:
>> > > > > > > On Mon, 2025-06-02 at 16:44 -0700, Vinay Belgaumkar wrote:
>> > > > > > > > Set GT min frequency to 1200 MHz once driver load is complete.
>> > > > > > > >
>> > > > > > > > v2: Review comments (Rodrigo)
>> > > > > > > > v3: Apply Wa earlier so user_req_min is not clobbered.
>> > > > > > > >
>> > > > > > > > Cc: Matt Roper <matthew.d.roper at intel.com>
>> > > > > > > > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> > > > > > > > Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar at intel.com>
>> > > > > > > > ---
>> > > > > > > >    drivers/gpu/drm/xe/xe_guc_pc.c     | 5 +++++
>> > > > > > > >    drivers/gpu/drm/xe/xe_wa_oob.rules | 1 +
>> > > > > > > >    2 files changed, 6 insertions(+)
>> > > > > > > >
>> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_guc_pc.c b/drivers/gpu/drm/xe/xe_guc_pc.c
>> > > > > > > > index 18c623992035..cb0563494fcc 100644
>> > > > > > > > --- a/drivers/gpu/drm/xe/xe_guc_pc.c
>> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_guc_pc.c
>> > > > > > > > @@ -51,6 +51,7 @@
>> > > > > > > >   
>> > > > > > > >    #define LNL_MERT_FREQ_CAP      800
>> > > > > > > >    #define BMG_MERT_FREQ_CAP      2133
>> > > > > > > > +#define BMG_MIN_FREQ           1200
>> > > > > > > >   
>> > > > > > > >    #define SLPC_RESET_TIMEOUT_MS 5 /* roughly 5ms, but no need for precision */
>> > > > > > > >    #define SLPC_RESET_EXTENDED_TIMEOUT_MS 1000 /* To be used only at pc_start */
>> > > > > > > > @@ -843,6 +844,9 @@ static int pc_adjust_freq_bounds(struct xe_guc_pc *pc)
>> > > > > > > >           if (pc_get_min_freq(pc) > pc->rp0_freq)
>> > > > > > > >                   ret = pc_set_min_freq(pc, pc->rp0_freq);
>> > > > > > > >   
>> > > > > > > > +       if (XE_WA(pc_to_gt(pc), 14022085890))
>> > > > > > > > +               ret = pc_set_min_freq(pc, max(BMG_MIN_FREQ, pc_get_min_freq(pc)));
>> > > > > > > Do we want this above the get_min_freq() call above? Seems like
>> > > > > > > that would further limit which is what we'd want in that case
>> > > > > > > right?
>> > > > > > No, this is during SLPC init, we don't need to read the min freq
>> > > > > > at this point. We will adjust it later in the flow if there was a
>> > > > > > user request.
>> > > >
>> > > > This ensures min frequency is set to 1200. That is required by the
>> > > > WA. If there was no previous user request (like before a GT reset),
>> > > > the min will be set to 1200 instead of the usual RPe (which is
>> > > > around 500).
>> > >
>> > > Ok makes sense to me. I did confirm the workaround frequency
>> > > called out
>> > > in bspec matches here:
>> > > Reviewed-by: Stuart Summers <stuart.summers at intel.com>
>> >
>> > but why do we do 2 rmw instead of 1? Something like this
>> >
>> >         u32 min_freq;
>> >         ...
>> >
>> >         /*
>> >          * Same thing happens for Server platforms where min is listed as
>> >          * RPMax
>> >          */
>> >         min_freq = min(pc_get_min_freq(pc), pc->rp0_freq);
>> >
>> >         /* But shouldn't go to less than BMG_MIN_FREQ on affected platforms */
>> >         if (XE_WA(pc_to_gt(pc), 14022085890))
>> >                 min_freq = max(BMG_MIN_FREQ, min_freq);
>> >
>> >         ret = pc_set_min_freq(pc, min_freq);
>>
>> humn... but that would mean we would always have a call to
>> pc_set_min_freq(), even if not needed, i.e.
>> pc_get_min_freq() <= pc->rp0_freq and for platforms affected by
>> 14022085890, pc_get_min_freq() >= BMG_MIN_FREQ.
>>
>> still trying to figure out one thing though: in BMG this will apply only
>> to the main gt. media gt will not set it and since pc is per gt, it's
>> not the correct thing to do.
>>
>> Matt Roper, what about adding the diff below?  Untested, will do later
>> today.
>>
>> > diff --git a/drivers/gpu/drm/xe/xe_guc_pc.c b/drivers/gpu/drm/xe/xe_guc_pc.c
>> > index 3861e947387fd..04e112c751096 100644
>> > --- a/drivers/gpu/drm/xe/xe_guc_pc.c
>> > +++ b/drivers/gpu/drm/xe/xe_guc_pc.c
>> > @@ -818,6 +818,7 @@ void xe_guc_pc_init_early(struct xe_guc_pc *pc)
>> >  
>> >  static int pc_adjust_freq_bounds(struct xe_guc_pc *pc)
>> >  {
>> > +       struct xe_tile *tile = gt_to_tile(pc_to_gt(pc));
>> >         int ret;
>> >  
>> >         lockdep_assert_held(&pc->freq_lock);
>> > @@ -844,7 +845,7 @@ static int pc_adjust_freq_bounds(struct xe_guc_pc *pc)
>> >         if (pc_get_min_freq(pc) > pc->rp0_freq)
>> >                 ret = pc_set_min_freq(pc, pc->rp0_freq);
>> >  
>> > -       if (XE_WA(pc_to_gt(pc), 14022085890))
>> > +       if (XE_WA(tile->primary_gt, 14022085890))
>
>I didn't see anything in bspec that explicitly called out media vs non-
>media. Where are you getting that here?

Because there isn't any... the issue is in the soc and not really tied to
the gt. We are using an OOB WA tied to GRAPHICS_VERSION(2001) because we
don't have a way to tie it to the soc. However, that means it doesn't
apply to the media gt in BMG, since that gt only has a media version to
match against.

One alternative I thought of was to tie it to the platform. However, there
are skus of the same platform that don't need this WA. So... for now
I think we can check for "is this workaround needed" in the primary_gt
and then enable it in all GTs.
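
To make that concrete, here is a rough, self-contained sketch of the
decision, written outside the driver. The types and helpers (fake_gt,
fake_tile, wa_min_freq_needed, target_min_freq, adjust_min_freq) are made up
for illustration and are not the xe API; only the 1200 floor and the
"match on primary_gt, apply to every GT" rule come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMG_MIN_FREQ 1200 /* MHz, floor required by Wa_14022085890 */

/* Hypothetical stand-in for a GT and its power-control state. */
struct fake_gt {
	bool wa_14022085890; /* what XE_WA(gt, 14022085890) would report */
	uint32_t min_freq;   /* current min frequency, MHz */
	uint32_t rp0_freq;   /* max frequency, MHz */
};

/* Hypothetical stand-in for a tile holding a primary and a media GT. */
struct fake_tile {
	struct fake_gt *primary_gt;
	struct fake_gt *media_gt;
};

/* The WA only matches on the primary GT, so always ask that one. */
static bool wa_min_freq_needed(const struct fake_tile *tile)
{
	return tile->primary_gt && tile->primary_gt->wa_14022085890;
}

/* Cap min at RP0 first, then raise it to the floor if the WA is needed. */
static uint32_t target_min_freq(const struct fake_tile *tile,
				const struct fake_gt *gt)
{
	uint32_t freq = gt->min_freq < gt->rp0_freq ? gt->min_freq : gt->rp0_freq;

	if (wa_min_freq_needed(tile) && freq < BMG_MIN_FREQ)
		freq = BMG_MIN_FREQ;

	return freq;
}

/* Update the GT only when the value changes; returns true if it did. */
static bool adjust_min_freq(const struct fake_tile *tile, struct fake_gt *gt)
{
	uint32_t target = target_min_freq(tile, gt);

	if (target == gt->min_freq)
		return false;

	gt->min_freq = target;
	return true;
}
```

Calling adjust_min_freq() for both GTs of a tile would raise a media gt
sitting at 400 to 1200 whenever the primary gt reports the WA, and leave a
gt that is already within bounds untouched.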

with this patch alone:

	# tail /sys/bus/pci/devices/0000\:03\:00.0/tile0/gt*/freq0/{min,cur}*
	==> /sys/bus/pci/devices/0000:03:00.0/tile0/gt0/freq0/min_freq <==
	1200

	==> /sys/bus/pci/devices/0000:03:00.0/tile0/gt1/freq0/min_freq <==
	400

	==> /sys/bus/pci/devices/0000:03:00.0/tile0/gt0/freq0/cur_freq <==
	1200

	==> /sys/bus/pci/devices/0000:03:00.0/tile0/gt1/freq0/cur_freq <==
	400

And now with the fix proposed:

	# tail /sys/bus/pci/devices/0000\:03\:00.0/tile0/gt*/freq0/{min,cur}*
	==> /sys/bus/pci/devices/0000:03:00.0/tile0/gt0/freq0/min_freq <==
	1200

	==> /sys/bus/pci/devices/0000:03:00.0/tile0/gt1/freq0/min_freq <==
	1200

	==> /sys/bus/pci/devices/0000:03:00.0/tile0/gt0/freq0/cur_freq <==
	1200

	==> /sys/bus/pci/devices/0000:03:00.0/tile0/gt1/freq0/cur_freq <==
	1200
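
If someone wants to script the same check rather than eyeball the tail
output, a small helper along these lines would do. This is only a sketch:
read_freq_mhz and min_freq_meets_floor are hypothetical names, and the only
assumption about the sysfs files is what the output above shows, i.e. a
single integer per file.

```c
#include <stdio.h>

/*
 * Read one frequency value (in MHz) from a sysfs-style text file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_freq_mhz(const char *path, unsigned int *mhz)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = fscanf(f, "%u", mhz) == 1;
	fclose(f);

	return ok ? 0 : -1;
}

/* Returns 1 if the value at path is at least floor_mhz, 0 if lower, -1 on error. */
static int min_freq_meets_floor(const char *path, unsigned int floor_mhz)
{
	unsigned int mhz;

	if (read_freq_mhz(path, &mhz))
		return -1;

	return mhz >= floor_mhz;
}
```

Pointing min_freq_meets_floor() at each gt*/freq0/min_freq path with a floor
of 1200 gives a pass/fail per GT.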

I think Matt Atwood is working on allowing OOB WA to be per-soc too, so
this may be coming soon. Leaving the comment in the rules file will be
good.

Let me send this patch with additional changes for a CI check.

Lucas De Marchi

