[Intel-gfx] [PATCH] drm/i915: Update workaround documentation

Lucas De Marchi lucas.demarchi at intel.com
Tue Nov 15 19:14:15 UTC 2022


On Mon, Nov 14, 2022 at 01:04:30PM -0800, Matt Roper wrote:
>On Mon, Nov 07, 2022 at 04:30:28PM -0800, Lucas De Marchi wrote:
>> There were several updates in the driver on how the workarounds are
>> handled since its documentation was written. Update the documentation to
>> reflect the current reality.
>>
>> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
>> ---
>>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 87 +++++++++++++--------
>>  1 file changed, 56 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> index 3cdf5c24dbc5..0db3713c1beb 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> @@ -17,43 +17,68 @@
>>  /**
>>   * DOC: Hardware workarounds
>>   *
>> - * This file is intended as a central place to implement most [1]_ of the
>> - * required workarounds for hardware to work as originally intended. They fall
>> - * in five basic categories depending on how/when they are applied:
>> + * This is intended as a central place to implement most [1]_ of the
>
>Your footnotes don't hook up properly anymore.  The original code had
>[1] and [2], but the new code hooks [1] to what used to be [2].

rewording this for next version

>
>Since we moved this file under gt/ a while back, I wonder if we should
>note somehow that sgunit/soc workarounds and display workarounds aren't
>expected to be part of this file?

I'm trying to make this agnostic to "this file" since it doesn't look
correct when reading the html documentation. Next version I'm rewording
some of this to just reference "central place".

>
>> + * required workarounds for hardware to work as originally intended. Hardware
>> + * workarounds are register programming documented to be executed in the driver
>> + * that fall outside of the normal programming sequences for a platform. There
>> + * are some basic categories of workarounds, depending on how/when they are
>> + * applied:
>>   *
>> - * - Workarounds that touch registers that are saved/restored to/from the HW
>> - *   context image. The list is emitted (via Load Register Immediate commands)
>> - *   everytime a new context is created.
>> - * - GT workarounds. The list of these WAs is applied whenever these registers
>> - *   revert to default values (on GPU reset, suspend/resume [2]_, etc..).
>> - * - Display workarounds. The list is applied during display clock-gating
>> - *   initialization.
>> - * - Workarounds that whitelist a privileged register, so that UMDs can manage
>> - *   them directly. This is just a special case of a MMMIO workaround (as we
>> - *   write the list of these to/be-whitelisted registers to some special HW
>> - *   registers).
>> - * - Workaround batchbuffers, that get executed automatically by the hardware
>> - *   on every HW context restore.
>> + * - Context workarounds: workarounds that touch registers that are
>> + *   saved/restored to/from the HW context image. The list is emitted (via Load
>> + *   Register Immediate commands) once when initializing the device and saved in
>> + *   the default context. That default context is then used on every context
>> + *   creation to have a "primed golden context", i.e. a context image that
>> + *   already contains the changes needed to all the registers.
>>   *
>> - * .. [1] Please notice that there are other WAs that, due to their nature,
>> - *    cannot be applied from a central place. Those are peppered around the rest
>> - *    of the code, as needed.
>> + * - Engine workarounds: the list of these WAs is applied whenever the specific
>> + *   engine is reset. It's also possible that a set of engine classes share a
>> + *   common power domain and they are reset together. This happens on some
>> + *   platforms with render and compute engines. In this case (at least) one of
>> + *   them need to keeep the workaround programming: the approach taken in the
>> + *   driver is to tie those workarounds to the first compute/render engine that
>> + *   is registered.  When executing with GuC submission, engine resets are
>> + *   outside of kernel driver control, hence the list of registers involved in
>> + *   written once, on engine initialization, and then passed to GuC, that
>> + *   saves/restores their values before/after the reset takes place. See
>> + *   ``drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c`` for reference.
>>   *
>> - * .. [2] Technically, some registers are powercontext saved & restored, so they
>> - *    survive a suspend/resume. In practice, writing them again is not too
>> - *    costly and simplifies things. We can revisit this in the future.
>> + * - GT workarounds: the list of these WAs is applied whenever these registers
>> + *   revert to their default values: on GPU reset, suspend/resume, etc.
>
>This is where the new [1] used to be referenced from.
>
>>   *
>> - * Layout
>> - * ~~~~~~
>> + * - Register whitelist: some workarounds need to be implemented in userspace,
>> + *   but need to touch privileged registers. The whitelist in the kernel
>> + *   instructs the hardware to allow the access to happen. From the kernel side,
>> + *   this is just a special case of a MMIO workaround (as we write the list of
>> + *   these to/be-whitelisted registers to some special HW registers).
>>   *
>> - * Keep things in this file ordered by WA type, as per the above (context, GT,
>> - * display, register whitelist, batchbuffer). Then, inside each type, keep the
>> - * following order:
>> + * - Workaround batchbuffers: buffers that get executed automatically by the
>> + *   hardware on every HW context restore. These buffers are created and
>> + *   programmed in the default context so the hardware always go through those
>> + *   programming sequences when switching contexts. The support for workaround
>> + *   batchbuffers is enabled these hardware mechanisms:
>>   *
>> - * - Infrastructure functions and macros
>> - * - WAs per platform in standard gen/chrono order
>> - * - Public functions to init or apply the given workaround type.
>> - */
>> + *   #. INDIRECT_CTX: A batchbuffer and an offset are provided in the default
>> + *      context, pointing the hardware to jump to that location when that offset
>> + *      is reached in the context restore. Workaround batchbuffer in the driver
>> + *      currently uses this mechanism for all platforms.
>> + *
>> + *   #. BB_PER_CTX_PTR: A batchbuffer is provided in the default context,
>> + *      pointing the hardware to a buffer to continue executing after the
>> + *      engine registers are restored in a context restore sequence. This is
>> + *      currently not used in the driver.
>> + *
>> + * - Display workarounds. The list is applied during display clock-gating
>> + *   initialization. However most of the display workarounds may be considered
>> + *   to fall under the "Others" category below.
>
>We don't have any such list today.  And if we do add one, I'm not sure
>it would happen here in gt/.  Maybe we should just add this as an extra
>"or" in the "Other" description below for now?

yeah, sounds good. I'm moving display workarounds under "Other" in the
next version.


thanks
Lucas De Marchi


More information about the Intel-gfx mailing list