[PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
Cavitt, Jonathan
jonathan.cavitt at intel.com
Fri Nov 1 22:22:00 UTC 2024
-----Original Message-----
From: Harrison, John C <john.c.harrison at intel.com>
Sent: Friday, November 1, 2024 3:15 PM
To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
>
> On 11/1/2024 15:00, Cavitt, Jonathan wrote:
> > -----Original Message-----
> > From: Harrison, John C <john.c.harrison at intel.com>
> > Sent: Friday, November 1, 2024 2:53 PM
> > To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
> > Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
> > Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
> >> On 11/1/2024 13:47, Cavitt, Jonathan wrote:
> >>> -----Original Message-----
> >>> From: Harrison, John C <john.c.harrison at intel.com>
> >>> Sent: Friday, November 1, 2024 1:14 PM
> >>> To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
> >>> Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
> >>> Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
> >>>> On 11/1/2024 12:40, Cavitt, Jonathan wrote:
> >>>>> -----Original Message-----
> >>>>> From: Harrison, John C <john.c.harrison at intel.com>
> >>>>> Sent: Friday, November 1, 2024 11:46 AM
> >>>>> To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
> >>>>> Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
> >>>>> Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
> >>>>>> On 11/1/2024 11:04, Jonathan Cavitt wrote:
> >>>>>>> When performing a guc_mmio_regset_write, we add all the registers in the
> >>>>>>> reg_sr list to the save/restore list, but do not do the same for the
> >>>>>>> whitelist registers. Add them in.
> >>>>>>>
> >>>>>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2249
> >>>>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> >>>>>>> CC: Lucas de Marchi <lucas.demarchi at intel.com>
> >>>>>>> CC: Matt Roper <matthew.d.roper at intel.com>
> >>>>>>> CC: John Harrison <john.c.harrison at intel.com>
> >>>>>>> CC: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
> >>>>>>> CC: Ashutosh Dixit <ashutosh.dixit at intel.com>
> >>>>>>> ---
> >>>>>>> drivers/gpu/drm/xe/xe_guc_ads.c | 11 ++++++++++-
> >>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
> >>>>>>> index 943146e5b460..2fc6b1ccc8fc 100644
> >>>>>>> --- a/drivers/gpu/drm/xe/xe_guc_ads.c
> >>>>>>> +++ b/drivers/gpu/drm/xe/xe_guc_ads.c
> >>>>>>> @@ -239,9 +239,12 @@ static size_t calculate_regset_size(struct xe_gt *gt)
> >>>>>>> enum xe_hw_engine_id id;
> >>>>>>> unsigned int count = 0;
> >>>>>>>
> >>>>>>> - for_each_hw_engine(hwe, gt, id)
> >>>>>>> + for_each_hw_engine(hwe, gt, id) {
> >>>>>>> xa_for_each(&hwe->reg_sr.xa, sr_idx, sr_entry)
> >>>>>>> count++;
> >>>>>>> + xa_for_each(&hwe->reg_whitelist.xa, sr_idx, sr_entry)
> >>>>>>> + count++;
> >>>>>>> + }
> >>>>>>>
> >>>>>>> count += ADS_REGSET_EXTRA_MAX * XE_NUM_HW_ENGINES;
> >>>>>>>
> >>>>>>> @@ -727,6 +730,12 @@ static unsigned int guc_mmio_regset_write(struct xe_guc_ads *ads,
> >>>>>>> xa_for_each(&hwe->reg_sr.xa, idx, entry)
> >>>>>>> guc_mmio_regset_write_one(ads, regset_map, entry->reg, count++);
> >>>>>>>
> >>>>>>> + i = 0;
> >>>>>>> + xa_for_each(&hwe->reg_whitelist.xa, idx, entry)
> >>>>>>> + guc_mmio_regset_write_one(ads, regset_map,
> >>>>>>> + RING_FORCE_TO_NONPRIV(hwe->mmio_base, i++),
> >>>>>>> + count++);
> >>>>>>> +
> >>>>>> The code that actually writes to the NONPRIV registers
> >>>>>> (xe_reg_sr_apply_whitelist() in xe_reg_sr.c) explicitly clears all the
> >>>>>> unused registers with a comment of "clear the rest in case of garbage".
> >>>>> The code in xe_reg_sr_apply_whitelist calls xe_mmio_write32 to write the
> >>>>> registers, whereas the code in guc_mmio_regset_write uses xe_map_memcpy_to
> >>>>> internally. While the former seems to be writing to the
> >>>>> xe_mmio_adjusted_addr(mmio, reg.addr) + mmio->regs, the latter appears to be
> >>>>> writing to IOSYS_MAP_INIT_OFFSET(ads_to_map(ads), guc_ads_regset_offset(ads)).
> >>>>>
> >>>>> I'm not particularly well-versed in these functions, but it looks to me that these
> >>>>> two functions write to different locations and thus would not impact each other.
> >>>>> Or, in other words, I don't think the garbage we're clearing in xe_reg_sr_apply_whitelist
> >>>>> is the same as the data we're writing in guc_mmio_regset_write.
> >>>> No.
> >>>>
> >>>> The apply function is writing the list of whitelisted registers into the
> >>>> whitelist registers themselves. The GuC ADS code is adding lists of
> >>>> registers to the save/restore list for an engine reset.
> >>>>
> >>>> Specifically with regards to the NONPRIV registers, these are a set of
> >>>> registers which hold the addresses of other registers. When set, they
> >>>> allow untrusted users to access those 'other' registers which otherwise
> >>>> would be off limits. The whitelist code is setting up that list. E.g.
> >>>> adding the OA registers to the whitelist to allow applications to use
> >>>> the OA mechanisms. So it does "NONPRIV_REG(x) = OA_REG". It also does
> >>>> "NONPRIV(x+1 .. max) = NO_OP". That is to ensure all the NONPRIV
> >>>> registers are set to something valid and not uninitialised. Otherwise we
> >>>> potentially have unintended registers being whitelisted and users are
> >>>> able to access things they shouldn't. Whereas, setting them all to NO_OP
> >>>> means we are granting all users access to the NO_OP register which they
> >>>> already had access to anyway.
> >>>>
> >>>> Completely separate to that, the GuC ADS code is creating a list of
> >>>> registers which GuC will save and restore across an engine reset. These
> >>>> are all the registers which get trashed by the reset but which are not
> >>>> saved and restored as part of a running context. The NONPRIV registers
> >>>> apparently fall into this category. So we need to tell GuC to preserve
> >>>> their content across a reset. Otherwise, after the reset, the whitelist
> >>>> will be lost. But, the reset state of those registers is 'undefined' as
> >>>> opposed to 'NO-OP' as suggested by the whitelist code. That means that
> >>>> any NONPRIV register which is not part of the reset save/restore list
> >>>> will no longer be set to NO-OP after a reset. Instead, it will be
> >>>> giving users access to some random register again. And we do not want to
> >>>> do that.
> >>>>
> >>> Okay. It sounds to me that we aren't performing xe_reg_sr_apply_whitelist
> >>> during an engine reset, because that function should be setting the registers
> >>> to a defined reset state, rather than "undefined". Should we be calling that
> >>> function during guc_mmio_regset_write?
> >>>
> >>> I tested it and it didn't work on my end, but I might be missing something.
> >>> -Jonathan Cavitt
> >> No. The KMD does not execute any code on an engine reset. The reset is
> >> handled by the GuC. The KMD merely gets notified that it has happened
> >> after the fact. That is the point of giving a save/restore register list
> >> to GuC. GuC will save the values of all the registers in the list before
> >> it does a reset and restore them again after the reset is complete.
> >> Therefore, any register whose value we want to be manually preserved
> >> across an engine reset must be added to the GuC's save/restore list.
> > AFAICT, that's what I'm doing in this patch, so could you please
> > clarify what it is I need to do differently from what is currently
> > present in the patch?
> > -Jonathan Cavitt
> Your patch is only adding the NONPRIV registers which have been used as
> opposed to adding all of them. The foreach loop is iterating over the
> list of whitelisted registers (i.e. the target registers that are
> written into the NON_PRIV(x) registers) and adds a NONPRIV register to
> the save/restore list for each entry found in the whitelist. So if there
> were three entries in the whitelist, it would add NONPRIV(0..2).
> Therefore NONPRIV(3..11) are not added to the save/restore list and will
> be trashed on an engine reset.
>
> The original patch was simply looping over 0..MAX_NONPRIV and would have
> added all the NONPRIV registers to the save/restore list.
>
> The current patch is an optimisation to only add the in-use registers.
> My concern is that optimisation is not valid and the original version
> was actually necessary.
I gave it some thought after sending my prior reply and had a feeling
that would be your response. Okay, I'll send an update that iterates
over all of MAX_NONPRIV.
-Jonathan Cavitt
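
For illustration only, the concern discussed in this thread can be modelled
in a few lines of standalone C. This is a rough sketch, not driver code:
every name below is hypothetical, the slot count of 12 is inferred from the
"NONPRIV(3..11)" remark in the thread, and the NO-OP and garbage values are
placeholders. It models a whitelist with three entries, an engine reset that
trashes every slot, and a save/restore list that covers only the first
n_saved slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 12 slots, inferred from "NONPRIV(3..11)" in the thread. */
#define NUM_NONPRIV_SLOTS 12
/* Placeholder values, not real hardware offsets. */
#define NOP_TARGET 0x00002094u
#define GARBAGE    0xdeadbeefu

/* Hypothetical model of one engine's NONPRIV slots. */
struct engine_model {
	uint32_t nonpriv[NUM_NONPRIV_SLOTS];
};

/* Model of the apply step: program the used slots, set the rest to NO-OP. */
void apply_whitelist(struct engine_model *e, const uint32_t *targets, size_t n)
{
	size_t i;

	assert(n <= NUM_NONPRIV_SLOTS);
	for (i = 0; i < n; i++)
		e->nonpriv[i] = targets[i];
	for (; i < NUM_NONPRIV_SLOTS; i++)
		e->nonpriv[i] = NOP_TARGET;
}

/*
 * Model of a firmware-handled reset: save the first n_saved slots,
 * let the reset trash every slot, then restore only the saved ones.
 */
void reset_with_save_restore(struct engine_model *e, size_t n_saved)
{
	uint32_t saved[NUM_NONPRIV_SLOTS];
	size_t i;

	assert(n_saved <= NUM_NONPRIV_SLOTS);
	for (i = 0; i < n_saved; i++)
		saved[i] = e->nonpriv[i];
	for (i = 0; i < NUM_NONPRIV_SLOTS; i++)
		e->nonpriv[i] = GARBAGE;
	for (i = 0; i < n_saved; i++)
		e->nonpriv[i] = saved[i];
}

/* Count the slots that no longer hold a programmed value. */
size_t count_garbage_slots(const struct engine_model *e)
{
	size_t i, n = 0;

	for (i = 0; i < NUM_NONPRIV_SLOTS; i++)
		if (e->nonpriv[i] == GARBAGE)
			n++;
	return n;
}

/* Run the scenario with three whitelist entries and n_saved saved slots. */
size_t garbage_after_reset(size_t n_saved)
{
	/* Hypothetical whitelisted target registers. */
	static const uint32_t targets[3] = { 0x1000u, 0x1004u, 0x1008u };
	struct engine_model e;

	apply_whitelist(&e, targets, 3);
	reset_with_save_restore(&e, n_saved);
	return count_garbage_slots(&e);
}
```

In this model, garbage_after_reset(3) corresponds to saving only the in-use
slots and leaves slots 3..11 holding garbage, while
garbage_after_reset(NUM_NONPRIV_SLOTS) corresponds to iterating over all of
the slots and leaves none.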
>
> John.
>
> >
> >> John.
> >>
> >>>> John.
> >>>>
> >>>>
> >>>>> -Jonathan Cavitt
> >>>>>
> >>>>>> If we don't trust the reset state to be valid then we need to ensure all
> >>>>>> of them are saved/restored across a reset. Otherwise, that garbage can
> >>>>>> come back and cause problems.
> >>>>>>
> >>>>>> John.
> >>>>>>
> >>>>>>
> >>>>>>> for (e = extra_regs; e < extra_regs + ARRAY_SIZE(extra_regs); e++) {
> >>>>>>> if (e->skip)
> >>>>>>> continue;
> >>
>
>