[PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list

Cavitt, Jonathan jonathan.cavitt at intel.com
Fri Nov 1 20:47:38 UTC 2024


-----Original Message-----
From: Harrison, John C <john.c.harrison at intel.com> 
Sent: Friday, November 1, 2024 1:14 PM
To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
> 
> On 11/1/2024 12:40, Cavitt, Jonathan wrote:
> > -----Original Message-----
> > From: Harrison, John C <john.c.harrison at intel.com>
> > Sent: Friday, November 1, 2024 11:46 AM
> > To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
> > Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
> > Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
> >> On 11/1/2024 11:04, Jonathan Cavitt wrote:
> >>> When performing a guc_mmio_regset_write, we add all the registers in the
> >>> reg_sr list to the save/restore list, but do not do the same for the
> >>> whitelist registers.  Add them in.
> >>>
> >>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2249
> >>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> >>> CC: Lucas de Marchi <lucas.demarchi at intel.com>
> >>> CC: Matt Roper <matthew.d.roper at intel.com>
> >>> CC: John Harrison <john.c.harrison at intel.com>
> >>> CC: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
> >>> CC: Ashutosh Dixit <ashutosh.dixit at intel.com>
> >>> ---
> >>>    drivers/gpu/drm/xe/xe_guc_ads.c | 11 ++++++++++-
> >>>    1 file changed, 10 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
> >>> index 943146e5b460..2fc6b1ccc8fc 100644
> >>> --- a/drivers/gpu/drm/xe/xe_guc_ads.c
> >>> +++ b/drivers/gpu/drm/xe/xe_guc_ads.c
> >>> @@ -239,9 +239,12 @@ static size_t calculate_regset_size(struct xe_gt *gt)
> >>>    	enum xe_hw_engine_id id;
> >>>    	unsigned int count = 0;
> >>>    
> >>> -	for_each_hw_engine(hwe, gt, id)
> >>> +	for_each_hw_engine(hwe, gt, id) {
> >>>    		xa_for_each(&hwe->reg_sr.xa, sr_idx, sr_entry)
> >>>    			count++;
> >>> +		xa_for_each(&hwe->reg_whitelist.xa, sr_idx, sr_entry)
> >>> +			count++;
> >>> +	}
> >>>    
> >>>    	count += ADS_REGSET_EXTRA_MAX * XE_NUM_HW_ENGINES;
> >>>    
> >>> @@ -727,6 +730,12 @@ static unsigned int guc_mmio_regset_write(struct xe_guc_ads *ads,
> >>>    	xa_for_each(&hwe->reg_sr.xa, idx, entry)
> >>>    		guc_mmio_regset_write_one(ads, regset_map, entry->reg, count++);
> >>>    
> >>> +	i = 0;
> >>> +	xa_for_each(&hwe->reg_whitelist.xa, idx, entry)
> >>> +		guc_mmio_regset_write_one(ads, regset_map,
> >>> +					  RING_FORCE_TO_NONPRIV(hwe->mmio_base, i++),
> >>> +					  count++);
> >>> +
> >> The code that actually writes to the NONPRIV registers
> >> (xe_reg_sr_apply_whitelist() in xe_reg_sr.c) explicitly clears all the
> >> unused registers with a comment of "clear the rest in case of garbage".
> > The code in xe_reg_sr_apply_whitelist calls xe_mmio_write32 to write the
> > registers, whereas the code in guc_mmio_regset_write uses xe_map_memcpy_to
> > internally.  While the former seems to be writing to the
> > xe_mmio_adjusted_addr(mmio, reg.addr) + mmio->regs, the latter appears to be
> > writing to IOSYS_MAP_INIT_OFFSET(ads_to_map(ads), guc_ads_regset_offset(ads)).
> >
> > I'm not particularly well-versed in these functions, but it looks to me that these
> > two functions write to different locations and thus would not impact each other.
> > Or, in other words, I don't think the garbage we're clearing in xe_reg_sr_apply_whitelist
> > is the same as the data we're writing in guc_mmio_regset_write.
> No.
> 
> The apply function is writing the list of whitelisted registers into the 
> whitelist registers themselves. The GuC ADS code is adding lists of 
> registers to the save/restore list for an engine reset.
> 
> Specifically with regard to the NONPRIV registers, these are a set of 
> registers which hold the addresses of other registers. When set, they 
> allow untrusted users to access those 'other' registers which otherwise 
> would be off limits. The whitelist code is setting up that list. E.g. 
> adding the OA registers to the whitelist to allow applications to use 
> the OA mechanisms. So it does "NONPRIV_REG(x) = OA_REG". It also does 
> "NONPRIV(x+1 .. max) = NO_OP". That is to ensure all the NONPRIV 
> registers are set to something valid and not uninitialised. Otherwise we 
> potentially have unintended registers being whitelisted and users are 
> able to access things they shouldn't. Whereas, setting them all to NO_OP 
> means we are granting all users access to the NO_OP register which they 
> already had access to anyway.
> 
> Completely separate to that, the GuC ADS code is creating a list of 
> registers which GuC will save and restore across an engine reset. These 
> are all the registers which get trashed by the reset but which are not 
> saved and restored as part of a running context. The NONPRIV registers 
> apparently fall into this category. So we need to tell GuC to preserve 
> their content across a reset. Otherwise, after the reset, the whitelist 
> will be lost. But, the reset state of those registers is 'undefined' as 
> opposed to 'NO-OP' as suggested by the whitelist code. That means that 
> any NONPRIV register which is not part of the reset save/restore list 
> will no longer be set to NO-OP after a reset. Instead, it will be 
> giving users access to some random register again. And we do not want to 
> do that.
> 
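
For what it's worth, the failure mode described above can be modelled in a
few lines of standalone C.  This is only a rough sketch, not driver code: the
slot count, the NO_OP address, the garbage pattern and every function name
below are made up for illustration, and the "reset" is simply assumed to
scramble every slot that is not on the save/restore list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration only, not the real hardware layout. */
#define NUM_NONPRIV_SLOTS 12
#define NOOP_REG_ADDR     0x2094u      /* stand-in for the NO_OP register */
#define GARBAGE           0xdeadbeefu  /* stand-in for "undefined" reset state */

/* Each NONPRIV slot holds the address of the register it whitelists. */
struct engine_model {
	uint32_t nonpriv[NUM_NONPRIV_SLOTS];
};

/* Model of the whitelist apply step: program the used slots, then point
 * every remaining slot at NO_OP so that nothing unintended is exposed. */
static void apply_whitelist(struct engine_model *e,
			    const uint32_t *regs, size_t n)
{
	size_t i;

	assert(n <= NUM_NONPRIV_SLOTS);
	for (i = 0; i < n; i++)
		e->nonpriv[i] = regs[i];
	for (; i < NUM_NONPRIV_SLOTS; i++)
		e->nonpriv[i] = NOOP_REG_ADDR;
}

/* Model of an engine reset: the first n_saved slots are on the save/restore
 * list and keep their value, every other slot comes back as garbage. */
static void reset_engine(struct engine_model *e, size_t n_saved)
{
	uint32_t saved[NUM_NONPRIV_SLOTS];
	size_t i;

	assert(n_saved <= NUM_NONPRIV_SLOTS);
	for (i = 0; i < n_saved; i++)
		saved[i] = e->nonpriv[i];
	for (i = 0; i < NUM_NONPRIV_SLOTS; i++)
		e->nonpriv[i] = GARBAGE;
	for (i = 0; i < n_saved; i++)
		e->nonpriv[i] = saved[i];
}

/* Count the slots that no longer hold a deliberately programmed value. */
static size_t count_garbage_slots(const struct engine_model *e)
{
	size_t i, n = 0;

	for (i = 0; i < NUM_NONPRIV_SLOTS; i++)
		if (e->nonpriv[i] == GARBAGE)
			n++;
	return n;
}
```

In this model, with two whitelist entries and only those two slots on the
save/restore list, a reset leaves NUM_NONPRIV_SLOTS - 2 slots holding garbage.
Putting every slot on the list leaves none.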

Okay.  It sounds to me that we aren't performing xe_reg_sr_apply_whitelist
during an engine reset, because that function should be setting the registers
to a defined reset state, rather than "undefined".  Should we be calling that
function during guc_mmio_regset_write?

I tested it and it didn't work on my end, but I might be missing something.
-Jonathan Cavitt

> John.
> 
> 
> >
> > -Jonathan Cavitt
> >
> >> If we don't trust the reset state to be valid then we need to ensure all
> >> of them are saved/restored across a reset. Otherwise, that garbage can
> >> come back and cause problems.
> >>
> >> John.
> >>
> >>
> >>>    	for (e = extra_regs; e < extra_regs + ARRAY_SIZE(extra_regs); e++) {
> >>>    		if (e->skip)
> >>>    			continue;
> >>
> 
> 