[PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list

John Harrison john.c.harrison at intel.com
Fri Nov 1 20:13:49 UTC 2024


On 11/1/2024 12:40, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Harrison, John C <john.c.harrison at intel.com>
> Sent: Friday, November 1, 2024 11:46 AM
> To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
> Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
> Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
>> On 11/1/2024 11:04, Jonathan Cavitt wrote:
>>> When performing a guc_mmio_regset_write, we add all the registers in the
>>> reg_sr list to the save/restore list, but do not do the same for the
>>> whitelist registers.  Add them in.
>>>
>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2249
>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>>> CC: Lucas de Marchi <lucas.demarchi at intel.com>
>>> CC: Matt Roper <matthew.d.roper at intel.com>
>>> CC: John Harrison <john.c.harrison at intel.com>
>>> CC: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>> CC: Ashutosh Dixit <ashutosh.dixit at intel.com>
>>> ---
>>>    drivers/gpu/drm/xe/xe_guc_ads.c | 11 ++++++++++-
>>>    1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
>>> index 943146e5b460..2fc6b1ccc8fc 100644
>>> --- a/drivers/gpu/drm/xe/xe_guc_ads.c
>>> +++ b/drivers/gpu/drm/xe/xe_guc_ads.c
>>> @@ -239,9 +239,12 @@ static size_t calculate_regset_size(struct xe_gt *gt)
>>>    	enum xe_hw_engine_id id;
>>>    	unsigned int count = 0;
>>>    
>>> -	for_each_hw_engine(hwe, gt, id)
>>> +	for_each_hw_engine(hwe, gt, id) {
>>>    		xa_for_each(&hwe->reg_sr.xa, sr_idx, sr_entry)
>>>    			count++;
>>> +		xa_for_each(&hwe->reg_whitelist.xa, sr_idx, sr_entry)
>>> +			count++;
>>> +	}
>>>    
>>>    	count += ADS_REGSET_EXTRA_MAX * XE_NUM_HW_ENGINES;
>>>    
>>> @@ -727,6 +730,12 @@ static unsigned int guc_mmio_regset_write(struct xe_guc_ads *ads,
>>>    	xa_for_each(&hwe->reg_sr.xa, idx, entry)
>>>    		guc_mmio_regset_write_one(ads, regset_map, entry->reg, count++);
>>>    
>>> +	i = 0;
>>> +	xa_for_each(&hwe->reg_whitelist.xa, idx, entry)
>>> +		guc_mmio_regset_write_one(ads, regset_map,
>>> +					  RING_FORCE_TO_NONPRIV(hwe->mmio_base, i++),
>>> +					  count++);
>>> +
>> The code that actually writes to the NONPRIV registers
>> (xe_reg_sr_apply_whitelist() in xe_reg_sr.c) explicitly clears all the
>> unused registers with a comment of "clear the rest in case of garbage".
> The code in xe_reg_sr_apply_whitelist calls xe_mmio_write32 to write the
> registers, whereas the code in guc_mmio_regset_write uses xe_map_memcpy_to
> internally.  While the former seems to be writing to the
> xe_mmio_adjusted_addr(mmio, reg.addr) + mmio->regs, the latter appears to be
> writing to IOSYS_MAP_INIT_OFFSET(ads_to_map(ads), guc_ads_regset_offset(ads)).
>
> I'm not particularly well-versed in these functions, but it looks to me that these
> two functions write to different locations and thus would not impact each other.
> Or, in other words, I don't think the garbage we're clearing in xe_reg_sr_apply_whitelist
> is the same as the data we're writing in guc_mmio_regset_write.
No.

The apply function is writing the list of whitelisted registers into the 
whitelist registers themselves. The GuC ADS code is adding lists of 
registers to the save/restore list for an engine reset.

Specifically with regard to the NONPRIV registers, these are a set of 
registers which hold the addresses of other registers. When set, they 
allow untrusted users to access those 'other' registers which otherwise 
would be off limits. The whitelist code is setting up that list. E.g. 
adding the OA registers to the whitelist to allow applications to use 
the OA mechanisms. So it does "NONPRIV(x) = OA_REG". It also does 
"NONPRIV(x+1 .. max) = NO_OP". That is to ensure all the NONPRIV 
registers are set to something valid and not uninitialised. Otherwise we 
potentially have unintended registers being whitelisted and users are 
able to access things they shouldn't. Whereas, setting them all to NO_OP 
means we are granting all users access to the NO_OP register which they 
already had access to anyway.
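
To make that concrete, here is a minimal user-space sketch of the idea, not
the driver code itself. The names model_apply_whitelist, MODEL_NONPRIV_SLOTS
and MODEL_NOOP_REG are hypothetical, and the slot count and NO_OP address are
illustrative values only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not taken from the hardware spec. */
#define MODEL_NONPRIV_SLOTS 12u
#define MODEL_NOOP_REG      0x2094u

/*
 * Fill the first `count` slots with the registers to whitelist, then point
 * every remaining slot at the harmless NO_OP register so that no slot is
 * left holding an uninitialised (and possibly sensitive) register address.
 */
static void model_apply_whitelist(uint32_t slots[MODEL_NONPRIV_SLOTS],
				  const uint32_t *allowed, size_t count)
{
	size_t i;

	assert(count <= MODEL_NONPRIV_SLOTS);

	/* NONPRIV(0 .. count-1) = registers that users may access */
	for (i = 0; i < count; i++)
		slots[i] = allowed[i];

	/* NONPRIV(count .. max) = NO_OP, i.e. "clear the rest" */
	for (; i < MODEL_NONPRIV_SLOTS; i++)
		slots[i] = MODEL_NOOP_REG;
}
```

Calling this with two allowed registers leaves slots 0 and 1 holding those
addresses and every other slot holding MODEL_NOOP_REG, whatever the array
contained beforehand.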

Completely separate to that, the GuC ADS code is creating a list of 
registers which GuC will save and restore across an engine reset. These 
are all the registers which get trashed by the reset but which are not 
saved and restored as part of a running context. The NONPRIV registers 
apparently fall into this category. So we need to tell GuC to preserve 
their content across a reset. Otherwise, after the reset, the whitelist 
will be lost. But the reset state of those registers is 'undefined' 
rather than the 'NO_OP' that the whitelist code programs. That means that 
any NONPRIV register which is not part of the reset save/restore list 
will no longer be set to NO_OP after a reset. Instead, it will be 
giving users access to some random register again. And we do not want to 
do that.
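
A rough user-space model of the problem, under the assumption that a reset
leaves every slot with an arbitrary value and that only listed slots are
restored. All names here (model_engine_reset, model_count_stale,
MODEL_GARBAGE and so on) are hypothetical and the values are illustrative;
this is not how GuC itself implements save/restore:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not taken from the hardware spec. */
#define MODEL_NONPRIV_SLOTS 12u
#define MODEL_NOOP_REG      0x2094u
#define MODEL_GARBAGE       0xdeadbeefu

/*
 * Model an engine reset: every slot is trashed, then only the first
 * `saved` slots (those on the save/restore list) get their old value back.
 */
static void model_engine_reset(uint32_t slots[MODEL_NONPRIV_SLOTS],
			       size_t saved)
{
	uint32_t copy[MODEL_NONPRIV_SLOTS];
	size_t i;

	assert(saved <= MODEL_NONPRIV_SLOTS);

	for (i = 0; i < MODEL_NONPRIV_SLOTS; i++) {
		copy[i] = slots[i];		/* save */
		slots[i] = MODEL_GARBAGE;	/* reset state is undefined */
	}

	for (i = 0; i < saved; i++)
		slots[i] = copy[i];		/* restore listed slots only */
}

/* Count slots left pointing at an unintended register after the reset. */
static size_t model_count_stale(const uint32_t slots[MODEL_NONPRIV_SLOTS])
{
	size_t i, stale = 0;

	for (i = 0; i < MODEL_NONPRIV_SLOTS; i++)
		if (slots[i] == MODEL_GARBAGE)
			stale++;

	return stale;
}
```

If only the slots in use are on the list, every unused slot comes back stale.
Listing all MODEL_NONPRIV_SLOTS slots brings the stale count to zero, which
is the behaviour being asked for here.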

John.


>
> -Jonathan Cavitt
>
>> If we don't trust the reset state to be valid then we need to ensure all
>> of them are saved/restored across a reset. Otherwise, that garbage can
>> come back and cause problems.
>>
>> John.
>>
>>
>>>    	for (e = extra_regs; e < extra_regs + ARRAY_SIZE(extra_regs); e++) {
>>>    		if (e->skip)
>>>    			continue;
>>


