[PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
John Harrison
john.c.harrison at intel.com
Fri Nov 1 20:13:49 UTC 2024
On 11/1/2024 12:40, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Harrison, John C <john.c.harrison at intel.com>
> Sent: Friday, November 1, 2024 11:46 AM
> To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
> Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
> Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
>> On 11/1/2024 11:04, Jonathan Cavitt wrote:
>>> When performing a guc_mmio_regset_write, we add all the registers in the
>>> reg_sr list to the save/restore list, but do not do the same for the
>>> whitelist registers. Add them in.
>>>
>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2249
>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>>> CC: Lucas de Marchi <lucas.demarchi at intel.com>
>>> CC: Matt Roper <matthew.d.roper at intel.com>
>>> CC: John Harrison <john.c.harrison at intel.com>
>>> CC: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>> CC: Ashutosh Dixit <ashutosh.dixit at intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_guc_ads.c | 11 ++++++++++-
>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
>>> index 943146e5b460..2fc6b1ccc8fc 100644
>>> --- a/drivers/gpu/drm/xe/xe_guc_ads.c
>>> +++ b/drivers/gpu/drm/xe/xe_guc_ads.c
>>> @@ -239,9 +239,12 @@ static size_t calculate_regset_size(struct xe_gt *gt)
>>>  	enum xe_hw_engine_id id;
>>>  	unsigned int count = 0;
>>>
>>> -	for_each_hw_engine(hwe, gt, id)
>>> +	for_each_hw_engine(hwe, gt, id) {
>>>  		xa_for_each(&hwe->reg_sr.xa, sr_idx, sr_entry)
>>>  			count++;
>>> +		xa_for_each(&hwe->reg_whitelist.xa, sr_idx, sr_entry)
>>> +			count++;
>>> +	}
>>>
>>>  	count += ADS_REGSET_EXTRA_MAX * XE_NUM_HW_ENGINES;
>>>
>>> @@ -727,6 +730,12 @@ static unsigned int guc_mmio_regset_write(struct xe_guc_ads *ads,
>>>  	xa_for_each(&hwe->reg_sr.xa, idx, entry)
>>>  		guc_mmio_regset_write_one(ads, regset_map, entry->reg, count++);
>>>
>>> +	i = 0;
>>> +	xa_for_each(&hwe->reg_whitelist.xa, idx, entry)
>>> +		guc_mmio_regset_write_one(ads, regset_map,
>>> +					  RING_FORCE_TO_NONPRIV(hwe->mmio_base, i++),
>>> +					  count++);
>>> +
>> The code that actually writes to the NONPRIV registers
>> (xe_reg_sr_apply_whitelist() in xe_reg_sr.c) explicitly clears all the
>> unused registers with a comment of "clear the rest in case of garbage".
> The code in xe_reg_sr_apply_whitelist calls xe_mmio_write32 to write the
> registers, whereas the code in guc_mmio_regset_write uses xe_map_memcpy_to
> internally. While the former seems to be writing to
> xe_mmio_adjusted_addr(mmio, reg.addr) + mmio->regs, the latter appears to be
> writing to IOSYS_MAP_INIT_OFFSET(ads_to_map(ads), guc_ads_regset_offset(ads)).
>
> I'm not particularly well-versed in these functions, but it looks to me that these
> two functions write to different locations and thus would not impact each other.
> Or, in other words, I don't think the garbage we're clearing in xe_reg_sr_apply_whitelist
> is the same as the data we're writing in guc_mmio_regset_write.
No.

The apply function is writing the list of whitelisted registers into the
whitelist registers themselves. The GuC ADS code is adding lists of
registers to the save/restore list for an engine reset.

Specifically with regard to the NONPRIV registers, these are a set of
registers which hold the addresses of other registers. When set, they
allow untrusted users to access those 'other' registers which otherwise
would be off limits. The whitelist code is setting up that list, e.g.
adding the OA registers to the whitelist to allow applications to use
the OA mechanisms. So it does "NONPRIV(x) = OA_REG". It also does
"NONPRIV(x+1 .. max) = NO_OP". That is to ensure all the NONPRIV
registers are set to something valid and not left uninitialised. Otherwise
we potentially have unintended registers being whitelisted and users are
able to access things they shouldn't. Whereas setting them all to NO_OP
means we are granting all users access to the NO_OP register, which they
already had access to anyway.
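
To make the "NONPRIV(x) = OA_REG" / "NONPRIV(x+1 .. max) = NO_OP" idea
concrete, here is a rough standalone model of it. This is only a sketch,
not the driver code: the slot count, the NO_OP address and the names
nonpriv_bank, apply_whitelist() and user_may_access() are all made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the real hardware layout. */
#define NUM_NONPRIV_SLOTS 12u
#define NO_OP_REG_ADDR    0x1000u /* placeholder address of a harmless register */

/*
 * Model of one engine's NONPRIV slots: each slot holds the address of a
 * register that untrusted users are allowed to access.
 */
struct nonpriv_bank {
	uint32_t slot[NUM_NONPRIV_SLOTS];
};

/*
 * Program the whitelist: the first 'count' slots get the requested
 * addresses, every remaining slot is pointed at the no-op register so
 * that no slot is left holding an arbitrary value.
 * Returns the number of slots written, or 0 if the list does not fit.
 */
size_t apply_whitelist(struct nonpriv_bank *bank,
		       const uint32_t *allowed, size_t count)
{
	size_t i;

	if (count > NUM_NONPRIV_SLOTS)
		return 0;

	for (i = 0; i < count; i++)
		bank->slot[i] = allowed[i];

	/* Clear the rest in case of garbage. */
	for (; i < NUM_NONPRIV_SLOTS; i++)
		bank->slot[i] = NO_OP_REG_ADDR;

	return NUM_NONPRIV_SLOTS;
}

/* Returns 1 if an untrusted access to 'addr' would be permitted. */
int user_may_access(const struct nonpriv_bank *bank, uint32_t addr)
{
	size_t i;

	for (i = 0; i < NUM_NONPRIV_SLOTS; i++)
		if (bank->slot[i] == addr)
			return 1;

	return 0;
}
```

With a bank that starts out full of junk, calling apply_whitelist() with a
single address leaves user_may_access() true only for that address and for
the NO_OP placeholder; the junk values are no longer reachable.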

Completely separate from that, the GuC ADS code is creating a list of
registers which GuC will save and restore across an engine reset. These
are all the registers which get trashed by the reset but which are not
saved and restored as part of a running context. The NONPRIV registers
apparently fall into this category. So we need to tell GuC to preserve
their content across a reset. Otherwise, after the reset, the whitelist
will be lost. But the reset state of those registers is 'undefined', as
opposed to 'NO_OP' as set up by the whitelist code. That means that
any NONPRIV register which is not part of the reset save/restore list
will no longer be set to NO_OP after a reset. Instead, it will be
giving users access to some random register again. And we do not want to
do that.
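
The reset problem can be modelled the same way. Again this is only a
sketch under assumptions: the slot count, the NO_OP address, the garbage
pattern and the helpers simulate_reset() and count_stray_slots() are
hypothetical, and 'undefined' reset state is modelled as a fixed junk
value per slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SLOTS  4u                          /* hypothetical slot count */
#define NO_OP_ADDR 0x1000u                     /* placeholder harmless register */
#define GARBAGE(i) (0xdead0000u + (uint32_t)(i)) /* stand-in for 'undefined' */

/*
 * Simulate an engine reset. Slots with an index below 'saved' are on the
 * save/restore list and keep their value; all other slots come back with
 * undefined content, modelled here as a garbage pattern.
 */
void simulate_reset(uint32_t *slots, size_t saved)
{
	uint32_t copy[NUM_SLOTS];
	size_t i;

	assert(saved <= NUM_SLOTS);

	for (i = 0; i < saved; i++)       /* save */
		copy[i] = slots[i];
	for (i = 0; i < NUM_SLOTS; i++)   /* the reset trashes every slot */
		slots[i] = GARBAGE(i);
	for (i = 0; i < saved; i++)       /* restore */
		slots[i] = copy[i];
}

/*
 * Count the slots that hold neither their intended whitelist entry nor
 * NO_OP, i.e. slots that now expose some random register.
 */
size_t count_stray_slots(const uint32_t *slots,
			 const uint32_t *intended, size_t used)
{
	size_t i, stray = 0;

	for (i = 0; i < NUM_SLOTS; i++) {
		uint32_t expect = i < used ? intended[i] : NO_OP_ADDR;

		if (slots[i] != expect)
			stray++;
	}

	return stray;
}
```

If only the one used slot is on the save/restore list, the other three
slots come back stray after simulate_reset(). If all NUM_SLOTS slots are
on the list, none do, which is the behaviour being asked for here.
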
John.
>
> -Jonathan Cavitt
>
>> If we don't trust the reset state to be valid then we need to ensure all
>> of them are saved/restored across a reset. Otherwise, that garbage can
>> come back and cause problems.
>>
>> John.
>>
>>
>>>  	for (e = extra_regs; e < extra_regs + ARRAY_SIZE(extra_regs); e++) {
>>>  		if (e->skip)
>>>  			continue;
>>