[PATCH] drm/xe/dg2: Fix number of mocs entries

Matt Roper matthew.d.roper at intel.com
Thu Feb 22 21:16:39 UTC 2024


On Thu, Feb 22, 2024 at 12:42:03PM -0800, Lucas De Marchi wrote:
> While investigating why xe_mocs kunit was failing for DG2, Matt Roper
> noticed that LNCFCMOCS[31] is actually RO for DG2. It would be a spec
> bug to document a register to be programmed that is actually RO. However
> upon further inspection, it seems like we should not be programming the
> other values. Early DG2 SKUs documented that the entire set of
> LNCFCMOCS registers needed to be programmed, but this is not true
> anymore.

Wait...it's not the bspec that required us to program every index, but
rather the Linux-specific need to ensure we don't break ABI in the
future.  Although rare, it's consisdered legal for the hardware team to
add additional documented MOCS index values to the table as long as they
aren't changing/removing any of the pre-existing values.  If we have
(buggy) userspace today that's already using an undocumented MOCS index
and operating properly (due to luck), adding the new "official" settings
based on a future bspec update could regress userspace if those new
settings do something like reduce the coherency.  So to guard against
this, we always need to program every undocumented index to maximum
caching, minimum coherency.  That way, if [buggy] userspace can operate
properly with those settings, it should continue to operate in a
functionally correct manner if the index suddenly becomes more coherent
in a future spec update.

> 
> Do like is done in MTL: the additional registers are there, but we are
> not supposed to program them, not even with the unused_index value.

It looks like this was actually a mistake on MTL that was overlooked;
the bspec documents only 16 entries, but the hardware supports 32, so we
should be explicitly programming all the unused ones as mentioned above
to protect against future table updates.


At this point I think we're far enough along that it's extremely
unlikely these MOCS tables for Xe1 platforms like DG2 and MTL are going
to have any further spec additions, so this MTL mistake wound up not
being a big problem for us.  But replacing XELP_NUM_MOCS_ENTRIES with
XEHP_NUM_MOCS_ENTRIES on DG2 is going to extend that mistake farther and
make it more likely for someone to copy/paste the same logic in future
platforms.

The fact that DG2's entry #63 is RO and can't be reprogrammed by the KMD
also means that it's 100% impossible for a future bspec update to
request different values for it.  So whether or not we try to write the
"unused entry" value into it doesn't really matter (it's easiest to just
keep doing so, especially due to the way two MOCS indices are packed
into the same register), but the test should be updated to not try to
verify that last entry.


Matt

> 
> While at it, fix MTL define to follow the same alignment as others, with
> tabs.
> 
> Bspec: 44988, 55267
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1253
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1233
> Cc: Matt Roper <matthew.d.roper at intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_mocs.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_mocs.c b/drivers/gpu/drm/xe/xe_mocs.c
> index 609d997b3e9b0..bb28c26a46d5b 100644
> --- a/drivers/gpu/drm/xe/xe_mocs.c
> +++ b/drivers/gpu/drm/xe/xe_mocs.c
> @@ -71,8 +71,9 @@ struct xe_mocs_info {
>  
>  /* Helper defines */
>  #define XELP_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
> +#define XEHP_NUM_MOCS_ENTRIES	4
>  #define PVC_NUM_MOCS_ENTRIES	3
> -#define MTL_NUM_MOCS_ENTRIES    16
> +#define MTL_NUM_MOCS_ENTRIES	16
>  #define XE2_NUM_MOCS_ENTRIES	16
>  
>  /* (e)LLC caching options */
> @@ -401,7 +402,7 @@ static unsigned int get_mocs_settings(struct xe_device *xe,
>  		info->size = ARRAY_SIZE(dg2_mocs_desc);
>  		info->table = dg2_mocs_desc;
>  		info->uc_index = 1;
> -		info->n_entries = XELP_NUM_MOCS_ENTRIES;
> +		info->n_entries = XEHP_NUM_MOCS_ENTRIES;
>  		info->unused_entries_index = 3;
>  		break;
>  	case XE_DG1:
> -- 
> 2.43.0
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


More information about the Intel-xe mailing list