[PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
John Harrison
john.c.harrison at intel.com
Fri Nov 1 21:52:30 UTC 2024
On 11/1/2024 13:47, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Harrison, John C <john.c.harrison at intel.com>
> Sent: Friday, November 1, 2024 1:14 PM
> To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
> Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
> Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
>> On 11/1/2024 12:40, Cavitt, Jonathan wrote:
>>> -----Original Message-----
>>> From: Harrison, John C <john.c.harrison at intel.com>
>>> Sent: Friday, November 1, 2024 11:46 AM
>>> To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; intel-xe at lists.freedesktop.org
>>> Cc: Gupta, saurabhg <saurabhg.gupta at intel.com>; Zuo, Alex <alex.zuo at intel.com>; Nerlige Ramappa, Umesh <umesh.nerlige.ramappa at intel.com>; Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas <lucas.demarchi at intel.com>; Dixit, Ashutosh <ashutosh.dixit at intel.com>
>>> Subject: Re: [PATCH] drm/xe/xe_guc_ads: Add whitelist registers to write list
>>>> On 11/1/2024 11:04, Jonathan Cavitt wrote:
>>>>> When performing a guc_mmio_regset_write, we add all the registers in the
>>>>> reg_sr list to the save/restore list, but do not do the same for the
>>>>> whitelist registers. Add them in.
>>>>>
>>>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2249
>>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>>>>> CC: Lucas de Marchi <lucas.demarchi at intel.com>
>>>>> CC: Matt Roper <matthew.d.roper at intel.com>
>>>>> CC: John Harrison <john.c.harrison at intel.com>
>>>>> CC: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>>>> CC: Ashutosh Dixit <ashutosh.dixit at intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/xe/xe_guc_ads.c | 11 ++++++++++-
>>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
>>>>> index 943146e5b460..2fc6b1ccc8fc 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_guc_ads.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_guc_ads.c
>>>>> @@ -239,9 +239,12 @@ static size_t calculate_regset_size(struct xe_gt *gt)
>>>>>  	enum xe_hw_engine_id id;
>>>>>  	unsigned int count = 0;
>>>>>
>>>>> -	for_each_hw_engine(hwe, gt, id)
>>>>> +	for_each_hw_engine(hwe, gt, id) {
>>>>>  		xa_for_each(&hwe->reg_sr.xa, sr_idx, sr_entry)
>>>>>  			count++;
>>>>> +		xa_for_each(&hwe->reg_whitelist.xa, sr_idx, sr_entry)
>>>>> +			count++;
>>>>> +	}
>>>>>
>>>>>  	count += ADS_REGSET_EXTRA_MAX * XE_NUM_HW_ENGINES;
>>>>>
>>>>> @@ -727,6 +730,12 @@ static unsigned int guc_mmio_regset_write(struct xe_guc_ads *ads,
>>>>>  	xa_for_each(&hwe->reg_sr.xa, idx, entry)
>>>>>  		guc_mmio_regset_write_one(ads, regset_map, entry->reg, count++);
>>>>>
>>>>> +	i = 0;
>>>>> +	xa_for_each(&hwe->reg_whitelist.xa, idx, entry)
>>>>> +		guc_mmio_regset_write_one(ads, regset_map,
>>>>> +					  RING_FORCE_TO_NONPRIV(hwe->mmio_base, i++),
>>>>> +					  count++);
>>>>> +
>>>> The code that actually writes to the NONPRIV registers
>>>> (xe_reg_sr_apply_whitelist() in xe_reg_sr.c) explicitly clears all the
>>>> unused registers with a comment of "clear the rest in case of garbage".
>>> The code in xe_reg_sr_apply_whitelist calls xe_mmio_write32 to write the
>>> registers, whereas the code in guc_mmio_regset_write uses xe_map_memcpy_to
>>> internally. While the former seems to be writing to the
>>> xe_mmio_adjusted_addr(mmio, reg.addr) + mmio->regs, the latter appears to be
>>> writing to IOSYS_MAP_INIT_OFFSET(ads_to_map(ads), guc_ads_regset_offset(ads)).
>>>
>>> I'm not particularly well-versed in these functions, but it looks to me that these
>>> two functions write to different locations and thus would not impact each other.
>>> Or, in other words, I don't think the garbage we're clearing in xe_reg_sr_apply_whitelist
>>> is the same as the data we're writing in guc_mmio_regset_write.
>> No.
>>
>> The apply function is writing the list of whitelisted registers into the
>> whitelist registers themselves. The GuC ADS code is adding lists of
>> registers to the save/restore list for an engine reset.
>>
>> Specifically with regards to the NONPRIV registers, these are a set of
>> registers which hold the addresses of other registers. When set, they
>> allow untrusted users to access those 'other' registers which otherwise
>> would be off limits. The whitelist code is setting up that list. E.g.
>> adding the OA registers to the whitelist to allow applications to use
>> the OA mechanisms. So it does "NONPRIV_REG(x) = OA_REG". It also does
>> "NONPRIV(x+1 .. max) = NO_OP". That is to ensure all the NONPRIV
>> registers are set to something valid and not uninitialised. Otherwise we
>> potentially have unintended registers being whitelisted and users are
>> able to access things they shouldn't. Whereas, setting them all to NO_OP
>> means we are granting all users access to the NO_OP register which they
>> already had access to anyway.
>>
>> Completely separate to that, the GuC ADS code is creating a list of
>> registers which GuC will save and restore across an engine reset. These
>> are all the registers which get trashed by the reset but which are not
>> saved and restored as part of a running context. The NONPRIV registers
>> apparently fall into this category. So we need to tell GuC to preserve
>> their content across a reset. Otherwise, after the reset, the whitelist
>> will be lost. But, the reset state of those registers is 'undefined' as
>> opposed to 'NO-OP' as suggested by the whitelist code. That means that
>> any NONPRIV register which is not part of the reset save/restore list
>> will no longer be set to NO-OP after a reset. Instead, it will be
>> giving users access to some random register again. And we do not want to
>> do that.
>>
> Okay. It sounds to me that we aren't performing xe_reg_sr_apply_whitelist
> during an engine reset, because that function should be setting the registers
> to a defined reset state, rather than "undefined". Should we be calling that
> function during guc_mmio_regset_write?
>
> I tested it and it didn't work on my end, but I might be missing something.
> -Jonathan Cavitt
No. The KMD does not execute any code on an engine reset. The reset is
handled by the GuC. The KMD merely gets notified that it has happened
after the fact. That is the point of giving a save/restore register list
to GuC. GuC will save the values of all the registers in the list before
it does a reset and restore them again after the reset is complete.
Therefore, any register whose value we want to be manually preserved
across an engine reset must be added to the GuC's save/restore list.
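
To make the argument concrete, here is a minimal user-space sketch of the
behaviour described above. It is a toy model only, not driver code: the
slot count, the NO-OP placeholder value, the garbage pattern and both
function names are hypothetical and chosen purely for illustration. It
models the whitelist apply step (used slots get a register address, the
rest get NO-OP) and a reset that trashes every slot and then restores only
the slots that were on the save/restore list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All values below are hypothetical placeholders for the model. */
#define NUM_NONPRIV_SLOTS 12          /* assumed number of NONPRIV slots */
#define MODEL_NO_OP       0x0u        /* stands in for the NO-OP register */
#define MODEL_GARBAGE     0xdeadbeefu /* stands in for undefined reset state */

struct engine_model {
	uint32_t nonpriv[NUM_NONPRIV_SLOTS];
};

/* Program the used slots, then set every remaining slot to NO-OP. */
void apply_whitelist(struct engine_model *e, const uint32_t *regs, size_t n)
{
	size_t i;

	for (i = 0; i < n && i < NUM_NONPRIV_SLOTS; i++)
		e->nonpriv[i] = regs[i];
	for (; i < NUM_NONPRIV_SLOTS; i++)
		e->nonpriv[i] = MODEL_NO_OP;
}

/*
 * Model an engine reset where only the first n_saved slots are on the
 * save/restore list. Returns how many slots hold garbage afterwards.
 */
size_t reset_with_save_restore(struct engine_model *e, size_t n_saved)
{
	uint32_t saved[NUM_NONPRIV_SLOTS];
	size_t i, garbage = 0;

	assert(n_saved <= NUM_NONPRIV_SLOTS);

	for (i = 0; i < n_saved; i++)           /* save listed slots */
		saved[i] = e->nonpriv[i];
	for (i = 0; i < NUM_NONPRIV_SLOTS; i++) /* reset trashes everything */
		e->nonpriv[i] = MODEL_GARBAGE;
	for (i = 0; i < n_saved; i++)           /* restore listed slots */
		e->nonpriv[i] = saved[i];

	for (i = 0; i < NUM_NONPRIV_SLOTS; i++)
		if (e->nonpriv[i] == MODEL_GARBAGE)
			garbage++;
	return garbage;
}
```

With two whitelisted registers, passing n_saved = 2 leaves the other ten
slots holding garbage after the modelled reset, while passing
n_saved = NUM_NONPRIV_SLOTS leaves none. That is the difference between
listing only the used NONPRIV slots and listing all of them.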
John.
>
>> John.
>>
>>
>>> -Jonathan Cavitt
>>>
>>>> If we don't trust the reset state to be valid then we need to ensure all
>>>> of them are saved/restored across a reset. Otherwise, that garbage can
>>>> come back and cause problems.
>>>>
>>>> John.
>>>>
>>>>
>>>>>  	for (e = extra_regs; e < extra_regs + ARRAY_SIZE(extra_regs); e++) {
>>>>>  		if (e->skip)
>>>>>  			continue;
>>