[Intel-xe] [PATCH 3/3] drm/xe: Fix LRC workarounds
Lucas De Marchi
lucas.demarchi at intel.com
Tue Sep 5 18:09:42 UTC 2023
On Tue, Sep 05, 2023 at 11:01:51AM -0700, Matt Roper wrote:
>On Tue, Sep 05, 2023 at 07:31:44AM -0700, Lucas De Marchi wrote:
>> Fix 2 issues when writing LRC workarounds by copying the same handling
>> done when processing other RTP entries:
>>
>> For masked registers, it was not correctly setting the upper 16bits.
>> Differently than i915, the entry itself doesn't set the upper bits
>> for masked registers: this is done when applying them. Testing on ADL-P:
>>
>> Before:
>> [drm:xe_gt_record_default_lrcs [xe]] LRC WA rcs0 save-restore MMIOs
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x2580] = 0x00000002
>> ...
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x7018] = 0x00002000
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x7300] = 0x00000040
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x7304] = 0x00000200
>>
>> After:
>> [drm:xe_gt_record_default_lrcs [xe]] LRC WA rcs0 save-restore MMIOs
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x2580] = 0x00060002
>> ...
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x7018] = 0x20002000
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x7300] = 0x00400040
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x7304] = 0x02000200
>>
>> All of these registers are masked registers, so writing to them without
>> the relevant bits in the upper 16b doesn't have any effect.
>>
>> Also, this adds support to regular registers; previously it was assumed
>> that LRC entries would only contain masked registers. However this is
>> not true. 0x6604 is not a masked register, but used in workarounds for
>> e.g. ADL-P. See commit 28cf243a341a ("drm/i915/gt: Fix context
>> workarounds with non-masked regs"). In the same test with ADL-P as
>> above:
>>
>> Before:
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x6604] = 0xe0000000
>> After:
>> [drm:xe_gt_record_default_lrcs [xe]] REG[0x6604] = 0xe0efef6f
>>
>> As can be seen, now it will read what was in the register rather than
>> completely overwrite the other bits.
>>
>> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_gt.c | 42 +++++++++++++++++++++++++++++++-------
>> 1 file changed, 35 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>> index bd307770a620..6dd3cb5d8246 100644
>> --- a/drivers/gpu/drm/xe/xe_gt.c
>> +++ b/drivers/gpu/drm/xe/xe_gt.c
>> @@ -114,12 +114,21 @@ static int emit_nop_job(struct xe_gt *gt, struct xe_exec_queue *q)
>> return 0;
>> }
>>
>> +/*
>> + * Convert back from encoded value to type-safe, only to be used when reg.mcr
>> + * is true
>> + */
>> +static struct xe_reg_mcr to_xe_reg_mcr(const struct xe_reg reg)
>> +{
>> + return (const struct xe_reg_mcr){.__reg.raw = reg.raw };
>> +}
>> +
>> static int emit_wa_job(struct xe_gt *gt, struct xe_exec_queue *q)
>> {
>> struct xe_device *xe = gt_to_xe(gt);
>> struct xe_reg_sr *sr = &q->hwe->reg_lrc;
>> struct xe_reg_sr_entry *entry;
>> - unsigned long reg;
>> + unsigned long idx;
>> struct xe_sched_job *job;
>> struct xe_bb *bb;
>> struct dma_fence *fence;
>> @@ -130,18 +139,37 @@ static int emit_wa_job(struct xe_gt *gt, struct xe_exec_queue *q)
>> if (IS_ERR(bb))
>> return PTR_ERR(bb);
>>
>> - xa_for_each(&sr->xa, reg, entry)
>> + xa_for_each(&sr->xa, idx, entry)
>> ++count;
>>
>> if (count) {
>> drm_dbg(&xe->drm, "LRC WA %s save-restore MMIOs\n", sr->name);
>>
>> bb->cs[bb->len++] = MI_LOAD_REGISTER_IMM(count);
>> - xa_for_each(&sr->xa, reg, entry) {
>> - bb->cs[bb->len++] = reg;
>> - bb->cs[bb->len++] = entry->set_bits;
>> - drm_dbg(&xe->drm, "REG[0x%lx] = 0x%08x", reg,
>> - entry->set_bits);
>> +
>> + xa_for_each(&sr->xa, idx, entry) {
>> + struct xe_reg reg = entry->reg;
>> + struct xe_reg_mcr reg_mcr = to_xe_reg_mcr(reg);
>> + u32 val;
>> +
>> + bb->cs[bb->len++] = reg.addr;
>> +
>> + /*
>> + * Skip reading the register if it's not really needed
>> + */
>> + if (reg.masked)
>> + val = entry->clr_bits << 16;
>> + else if (entry->clr_bits + 1)
>> + val = (reg.mcr ?
>> + xe_gt_mcr_unicast_read_any(gt, reg_mcr) :
>> + xe_mmio_read32(gt, reg)) & (~entry->clr_bits);
>
>I'm a little bit worried about whether this will continue to work in the
>long term. There are definitely some registers out there that we can't
>read properly from the CPU. Maybe none of our current LRC workarounds
>target such registers, but they do exist (I have a vague memory of
>hitting this on some old pre-production PVC LRC workarounds I think).
but were them also non-masked?
>We can probably go with this CPU read for now, but maybe we should leave
>a comment here that we might need to switch to a GPU-based approach at
>some point in the future if we run into CPU inaccessible registers.
>
>If we wind up needing to change the approach down the road, I think
>there are probably a handful of possible approaches:
>
> * Save the default_lrc without applying the LRC workarounds initially.
> Read the in-memory contents of the LRC to find the original register
> value. Then submit the LRC batch and re-save the LRC with the
> workarounds applied.
> * Submit a preliminary batchbuffer with MI_SRM instructions to save the
> hw-default register values to a hwsp location that we can then pick
> up and use when crafting the MI_LRI batch. Or update in memory and
> use MI_LRM instead.
> * Use MI_MATH to perform the desired bitwise operations on register
> bits directly.
yeah, we've had the same concern when doing similar fix in i915. From
all of the options, I think the one doing MI_MATH for the very few
registers in the category [ non-masked, can't-be-read-from-cpu ] would
be appropriate. I can take a look on doing that later, but I think
it's more urgent to fix the outstanding issue we have right now of
basically not really applying most of the LRC WAs.
>
>Anyway, we can deal with that later if/when it becomes necessary. For
>now,
>
>Reviewed-by: Matt Roper <matthew.d.roper at intel.com>
thanks
Lucas De Marchi
>
>
>Matt
>
>> + else
>> + val = 0;
>> +
>> + val |= entry->set_bits;
>> + bb->cs[bb->len++] = val;
>> +
>> + drm_dbg(&xe->drm, "REG[0x%x] = 0x%08x", reg.addr, val);
>> }
>> }
>>
>> --
>> 2.40.1
>>
>
>--
>Matt Roper
>Graphics Software Engineer
>Linux GPU Platform Enablement
>Intel Corporation
More information about the Intel-xe
mailing list