[PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
Cavitt, Jonathan
jonathan.cavitt at intel.com
Fri Nov 1 20:47:38 UTC 2024
-----Original Message-----
From: Harrison, John C <john.c.harrison at intel.com>
Sent: Friday, November 1, 2024 1:14 PM
To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
>
> On 11/1/2024 12:40, Cavitt, Jonathan wrote:
> > -----Original Message-----
> > From: Harrison, John C <john.c.harrison at intel.com>
> > Sent: Friday, November 1, 2024 11:46 AM
> > To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
> > Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
> > Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
> >> On 11/1/2024 11:04, Jonathan Cavitt wrote:
> >>> When performing a guc_mmio_regset_write, we add all the registers in the
> >>> reg_sr list to the save/restore list, but do not do the same for the
> >>> whitelist registers. Add them in.
> >>>
> >>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2249
> >>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> >>> CC: Lucas de Marchi <lucas.demarchi at intel.com>
> >>> CC: Matt Roper <matthew.d.roper at intel.com>
> >>> CC: John Harrison <john.c.harrison at intel.com>
> >>> CC: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
> >>> CC: Ashutosh Dixit <ashutosh.dixit at intel.com>
> >>> ---
> >>> drivers/gpu/drm/xe/xe_guc_ads.c | 11 ++++++++++-
> >>> 1 file changed, 10 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
> >>> index 943146e5b460..2fc6b1ccc8fc 100644
> >>> --- a/drivers/gpu/drm/xe/xe_guc_ads.c
> >>> +++ b/drivers/gpu/drm/xe/xe_guc_ads.c
> >>> @@ -239,9 +239,12 @@ static size_t calculate_regset_size(struct xe_gt *gt)
> >>>  	enum xe_hw_engine_id id;
> >>>  	unsigned int count = 0;
> >>>
> >>> -	for_each_hw_engine(hwe, gt, id)
> >>> +	for_each_hw_engine(hwe, gt, id) {
> >>>  		xa_for_each(&hwe->reg_sr.xa, sr_idx, sr_entry)
> >>>  			count++;
> >>> +		xa_for_each(&hwe->reg_whitelist.xa, sr_idx, sr_entry)
> >>> +			count++;
> >>> +	}
> >>>
> >>>  	count += ADS_REGSET_EXTRA_MAX * XE_NUM_HW_ENGINES;
> >>>
> >>> @@ -727,6 +730,12 @@ static unsigned int guc_mmio_regset_write(struct xe_guc_ads *ads,
> >>>  	xa_for_each(&hwe->reg_sr.xa, idx, entry)
> >>>  		guc_mmio_regset_write_one(ads, regset_map, entry->reg, count++);
> >>>
> >>> +	i = 0;
> >>> +	xa_for_each(&hwe->reg_whitelist.xa, idx, entry)
> >>> +		guc_mmio_regset_write_one(ads, regset_map,
> >>> +					  RING_FORCE_TO_NONPRIV(hwe->mmio_base, i++),
> >>> +					  count++);
> >>> +
> >> The code that actually writes to the NONPRIV registers
> >> (xe_reg_sr_apply_whitelist() in xe_reg_sr.c) explicitly clears all the
> >> unused registers with a comment of "clear the rest in case of garbage".
> > The code in xe_reg_sr_apply_whitelist calls xe_mmio_write32 to write the
> > registers, whereas the code in guc_mmio_regset_write uses xe_map_memcpy_to
> > internally. While the former seems to write to
> > xe_mmio_adjusted_addr(mmio, reg.addr) + mmio->regs, the latter appears to
> > write to IOSYS_MAP_INIT_OFFSET(ads_to_map(ads), guc_ads_regset_offset(ads)).
> >
> > I'm not particularly well-versed in these functions, but it looks to me that these
> > two functions write to different locations and thus would not impact each other.
> > Or, in other words, I don't think the garbage we're clearing in xe_reg_sr_apply_whitelist
> > is the same as the data we're writing in guc_mmio_regset_write.
> No.
>
> The apply function is writing the list of whitelisted registers into the
> whitelist registers themselves. The GuC ADS code is adding lists of
> registers to the save/restore list for an engine reset.
>
> Specifically with regards to the NONPRIV registers, these are a set of
> registers which hold the addresses of other registers. When set, they
> allow untrusted users to access those 'other' registers which otherwise
> would be off limits. The whitelist code is setting up that list. E.g.
> adding the OA registers to the whitelist to allow applications to use
> the OA mechanisms. So it does "NONPRIV(x) = OA_REG". It also does
> "NONPRIV(x+1 .. max) = NO_OP". That is to ensure all the NONPRIV
> registers are set to something valid and not uninitialised. Otherwise we
> potentially have unintended registers being whitelisted and users are
> able to access things they shouldn't. Whereas, setting them all to NO_OP
> means we are granting all users access to the NO_OP register which they
> already had access to anyway.
>
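The whitelist behaviour described above can be sketched as a small, self-contained C model. This is only an illustration under stated assumptions, not the driver code: the slot count, the NO_OP address, and every model_* name are hypothetical stand-ins rather than values or functions taken from xe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not taken from the xe driver. */
#define MODEL_NONPRIV_SLOTS 12u     /* assumed number of NONPRIV slots per engine */
#define MODEL_NOOP_REG      0x2094u /* stand-in address of a harmless NO_OP register */

/* Toy model of one engine's bank of NONPRIV slots. */
struct model_engine {
	uint32_t nonpriv[MODEL_NONPRIV_SLOTS];
};

/*
 * Program the first n slots with the whitelisted register addresses and
 * point every remaining slot at the NO_OP register, so that no slot is
 * left holding whatever value it had before.
 */
static void model_apply_whitelist(struct model_engine *e,
				  const uint32_t *regs, size_t n)
{
	size_t slot;

	assert(n <= MODEL_NONPRIV_SLOTS);

	for (slot = 0; slot < n; slot++)
		e->nonpriv[slot] = regs[slot];
	for (; slot < MODEL_NONPRIV_SLOTS; slot++)
		e->nonpriv[slot] = MODEL_NOOP_REG;
}

/* Return 1 if any slot currently exposes addr to unprivileged users. */
static int model_is_whitelisted(const struct model_engine *e, uint32_t addr)
{
	size_t slot;

	for (slot = 0; slot < MODEL_NONPRIV_SLOTS; slot++)
		if (e->nonpriv[slot] == addr)
			return 1;
	return 0;
}
```

In this model, a slot that previously held a stale address stops exposing it once model_apply_whitelist() has run, because every unused slot ends up pointing at the NO_OP stand-in.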
> Completely separate to that, the GuC ADS code is creating a list of
> registers which GuC will save and restore across an engine reset. These
> are all the registers which get trashed by the reset but which are not
> saved and restored as part of a running context. The NONPRIV registers
> apparently fall into this category. So we need to tell GuC to preserve
> their content across a reset. Otherwise, after the reset, the whitelist
> will be lost. But, the reset state of those registers is 'undefined' as
> opposed to 'NO_OP' as suggested by the whitelist code. That means that
> any NONPRIV register which is not part of the reset save/restore list
> will no longer be set to NO_OP after a reset. Instead, it will be
> giving users access to some random register again. And we do not want to
> do that.
>
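The reset concern above can be modelled the same way. The sketch below is a toy simulation under stated assumptions, not GuC or xe behaviour: MODEL_GARBAGE stands in for the "undefined" reset state, the slot count is assumed, and all model_* names are hypothetical. It compares saving only the slots in use against saving every slot.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not taken from the xe driver. */
#define MODEL_SLOTS   12u         /* assumed number of NONPRIV slots per engine */
#define MODEL_NOOP    0x2094u     /* stand-in address of a harmless NO_OP register */
#define MODEL_GARBAGE 0xdeadbeefu /* stand-in for the "undefined" reset state */

/*
 * Simulate an engine reset in which only the first `saved` slots are on
 * the save/restore list. Those slots keep their value; every other slot
 * comes back as MODEL_GARBAGE.
 */
static void model_engine_reset(uint32_t slots[MODEL_SLOTS], size_t saved)
{
	uint32_t snapshot[MODEL_SLOTS];
	size_t i;

	if (saved > MODEL_SLOTS)
		saved = MODEL_SLOTS;

	for (i = 0; i < saved; i++)		/* save */
		snapshot[i] = slots[i];
	for (i = 0; i < MODEL_SLOTS; i++)	/* reset trashes everything */
		slots[i] = MODEL_GARBAGE;
	for (i = 0; i < saved; i++)		/* restore */
		slots[i] = snapshot[i];
}

/* Count the slots that were left in the "undefined" state. */
static size_t model_count_garbage(const uint32_t slots[MODEL_SLOTS])
{
	size_t i, n = 0;

	for (i = 0; i < MODEL_SLOTS; i++)
		if (slots[i] == MODEL_GARBAGE)
			n++;
	return n;
}
```

With two slots in use and only those two on the list, the remaining ten come back as garbage in this model; listing all twelve leaves none.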
Okay. It sounds to me like we aren't calling xe_reg_sr_apply_whitelist
during an engine reset; if we were, that function would put the registers
into a defined state rather than leaving them "undefined". Should we be
calling that function during guc_mmio_regset_write?
I tested it and it didn't work on my end, but I might be missing something.
-Jonathan Cavitt
> John.
>
>
> >
> > -Jonathan Cavitt
> >
> >> If we don't trust the reset state to be valid then we need to ensure all
> >> of them are saved/restored across a reset. Otherwise, that garbage can
> >> come back and cause problems.
> >>
> >> John.
> >>
> >>
> >>>  	for (e = extra_regs; e < extra_regs + ARRAY_SIZE(extra_regs); e++) {
> >>>  		if (e->skip)
> >>>  			continue;
> >>
>
>