[Intel-xe] [PATCH v2 6/7] drm/xe/pat: Add debugfs node to dump PAT
Lucas De Marchi
lucas.demarchi at intel.com
Fri Oct 6 18:15:49 UTC 2023
On Wed, Oct 04, 2023 at 11:07:55PM -0500, Lucas De Marchi wrote:
>On Wed, Oct 04, 2023 at 05:21:48PM -0700, Matt Roper wrote:
>>On Tue, Oct 03, 2023 at 05:52:16AM -0700, Lucas De Marchi wrote:
>>>This is useful to debug cache issues, to double check if the PAT
>>>indexes match what they were supposed to be set to from spec.
>>>
>>>v2: Add separate functions for xehp, xehpc and xelpg so it correctly
>>> reads the index based on MCR/REG registers and also decodes the
>>> fields (Matt Roper)
>>>
>>>Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
>>>---
>>> drivers/gpu/drm/xe/xe_gt_debugfs.c | 12 ++
>>> drivers/gpu/drm/xe/xe_pat.c | 204 ++++++++++++++++++++++++++++-
>>> drivers/gpu/drm/xe/xe_pat.h | 8 ++
>>> 3 files changed, 223 insertions(+), 1 deletion(-)
>>>
>>>diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
>>>index ec1ae00f6bfc..cd6d28c7b923 100644
>>>--- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
>>>+++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
>>>@@ -16,6 +16,7 @@
>>> #include "xe_gt_topology.h"
>>> #include "xe_hw_engine.h"
>>> #include "xe_macros.h"
>>>+#include "xe_pat.h"
>>> #include "xe_reg_sr.h"
>>> #include "xe_reg_whitelist.h"
>>> #include "xe_uc_debugfs.h"
>>>@@ -138,6 +139,16 @@ static int workarounds(struct seq_file *m, void *data)
>>> return 0;
>>> }
>>>
>>>+static int pat(struct seq_file *m, void *data)
>>>+{
>>>+ struct xe_gt *gt = node_to_gt(m->private);
>>>+ struct drm_printer p = drm_seq_file_printer(m);
>>>+
>>>+ xe_pat_dump(gt, &p);
>>>+
>>>+ return 0;
>>>+}
>>>+
>>> static const struct drm_info_list debugfs_list[] = {
>>> {"hw_engines", hw_engines, 0},
>>> {"force_reset", force_reset, 0},
>>>@@ -147,6 +158,7 @@ static const struct drm_info_list debugfs_list[] = {
>>> {"ggtt", ggtt, 0},
>>> {"register-save-restore", register_save_restore, 0},
>>> {"workarounds", workarounds, 0},
>>>+ {"pat", pat, 0},
>>> };
>>>
>>> void xe_gt_debugfs_register(struct xe_gt *gt)
>>>diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
>>>index 296763594370..1c89137fc66b 100644
>>>--- a/drivers/gpu/drm/xe/xe_pat.c
>>>+++ b/drivers/gpu/drm/xe/xe_pat.c
>>>@@ -6,6 +6,8 @@
>>> #include "xe_pat.h"
>>>
>>> #include "regs/xe_reg_defs.h"
>>>+#include "xe_assert.h"
>>>+#include "xe_device.h"
>>> #include "xe_gt.h"
>>> #include "xe_gt_mcr.h"
>>> #include "xe_mmio.h"
>>>@@ -14,6 +16,7 @@
>>> #define _PAT_INDEX(index) _PICK_EVEN_2RANGES(index, 8, \
>>> 0x4800, 0x4804, \
>>> 0x4848, 0x484c)
>>>+#define _PAT_PTA 0x4820
>>>
>>> #define XE2_NO_PROMOTE REG_BIT(10)
>>> #define XE2_COMP_EN REG_BIT(9)
>>>@@ -26,10 +29,12 @@
>>> #define XELPG_PAT_3_UC REG_FIELD_PREP(XELPG_L4_POLICY_MASK, 3)
>>> #define XELPG_PAT_1_WT REG_FIELD_PREP(XELPG_L4_POLICY_MASK, 1)
>>> #define XELPG_PAT_0_WB REG_FIELD_PREP(XELPG_L4_POLICY_MASK, 0)
>>>+#define XELPG_L4_POLICY_STR_MAP { "WB", "WT", "(N/A)", "UC" }
>>> #define XELPG_INDEX_COH_MODE_MASK REG_GENMASK(1, 0)
>>> #define XELPG_3_COH_2W REG_FIELD_PREP(XELPG_INDEX_COH_MODE_MASK, 3)
>>> #define XELPG_2_COH_1W REG_FIELD_PREP(XELPG_INDEX_COH_MODE_MASK, 2)
>>> #define XELPG_0_COH_NON REG_FIELD_PREP(XELPG_INDEX_COH_MODE_MASK, 0)
>>>+#define XELPG_INDEX_COH_MODE_STR_MAP { "NO", "(N/A)", "1-WAY", "2-WAY" }
>>>
>>> #define XEHPC_CLOS_LEVEL_MASK REG_GENMASK(3, 2)
>>> #define XEHPC_PAT_CLOS(x) REG_FIELD_PREP(XEHPC_CLOS_LEVEL_MASK, x)
>>>@@ -39,10 +44,12 @@
>>> #define XELP_PAT_WT REG_FIELD_PREP(XELP_MEM_TYPE_MASK, 2)
>>> #define XELP_PAT_WC REG_FIELD_PREP(XELP_MEM_TYPE_MASK, 1)
>>> #define XELP_PAT_UC REG_FIELD_PREP(XELP_MEM_TYPE_MASK, 0)
>>>+#define XELP_MEM_TYPE_STR_MAP { "UC", "WC", "WT", "WB" }
>>
>>Is there a benefit to making a #define that builds three separate tables
>>inside different functions rather than just adding one file-scope table
>>that can be referenced from each of those functions?
>
>As long as the array is const, it should be equivalent, but I didn't
>really check the alternative to see what the compiler is doing.
>I thought about this so the define is next to the entries and it's easy
>to check it's correct. I will take a look on the alternative.
there was a ~100B decrease in size when using a static const array
shared with all function. It's very small, but pairing with the other
changes below about not trying to print the human-friendly string, made
it better IMO.
>>I guess using human-readable strings works, although I was originally
>>expecting these to dump in the same format as the Xe2 table below to
>>make it really quick/easy to compare against the way these are written
>>in the bspec. E.g.,
>>
>> PAT[0] = [ 0, 0 ] (xxxx)
>> PAT[1] = [ 0, 1 ] (xxxx)
>> ...
>> PAT[7] = [ 2, 3 ] (xxxx)
>>
>>I don't have strong feelings about it though since the pre-Xe2 tables
>>are really short and simple. This approach looks fine too.
>>
>>Reviewed-by: Matt Roper <matthew.d.roper at intel.com>
new output for MTL:
/sys/kernel/debug/dri/0# cat gt1/pat
PAT table:
PAT[ 0] = [ 0, 0 ] ( 0x0)
PAT[ 1] = [ 1, 0 ] ( 0x4)
PAT[ 2] = [ 3, 0 ] ( 0xc)
PAT[ 3] = [ 0, 2 ] ( 0x2)
PAT[ 4] = [ 0, 3 ] ( 0x3)
Lucas De Marchi
More information about the Intel-xe
mailing list