[PATCH v6 0/8] drm/xe/guc: Add GuC based register capture for error capture

Souza, Jose jose.souza at intel.com
Fri Mar 22 16:57:41 UTC 2024


On Tue, 2024-03-19 at 07:36 -0700, Zhanjun Dong wrote:
> Port GuC based register capture for error capture from i915 to Xe.
> 
> There are 3 parts inside:
> . Prepare for capture registers
>     There is a bo create at guc ads init time, that is very early
>     and engi  ne map is not ready, make it hard to calculate the
>     capture buffer size, new function created for worst case size
>     caluation. Other than that, this part basically follows the i915
>     design.
> . Process capture notification message
>     Basically follows i915 design
> . Sysfs command process.
>     Xe switched to devcoredump, adopted command line process with
>     captured node list.

Just some notes after trying this to debug a issue with this series:

- CONFIG_DRM_XE_CAPTURE_ERROR is not a good name for a config to enable GuC based capture.
	- in my opinion this should not even exist, always capture with GuC if xe_device_uc_enabled() is true. 

- coredump don't have some registers like SAMPLER_INSTDONE, ROW_INSTDONE, XEHPG_INSTDONE_GEOM_SVG...
Here the output in LNL:

**** HW Engines ****
rcs0 (physical), logical instance=0
	Forcewake: domain 0x2, ref 1
	HWSTAM: 0xffffffff
	RING_HWS_PGA: 0x01693000
	RING_START: 0x029d5000
	RING_HEAD: 0x00001914
	RING_TAIL: 0x00001968
	RING_CTL: 0x00003001
	RING_MI_MODE: 0x00001000
	RING_MODE: 0x00000008
	RING_IMR: 0x00000000
	RING_ESR: 0x00000000
	RING_EMR: 0xffffffff
	RING_EIR: 0x00000000
	IPEHR: 0x7a000004
	ACTHD: 0x0000effeffff9208
	RING_BBADDR: 0x0000effeffff9209
	RING_DMA_FADD: 0x00000000029d6968
	RING_EXECLIST_STATUS: 0x000000004000279c
	RING_EXECLIST_SQ_CONTENTS: 0x00000000029d9719

- a lot of style issues, see 'CI.checkpatch'

> 
> Signed-off-by: Zhanjun Dong <zhanjun.dong at intel.com>
> Cc: Alan Previn <alan.previn.teres.alexis at intel.com>
> 
> Changes from prior revs:
>   v6:-  Change hardcoded register snapshot fill to follow mapping tables
>         When capture is empty, take snapshot from engine
>   v5:-  Split dss helper code out as an standalone patch
>         Remove old platform registers definition.
>         Split register map table to 32 and 64bit each
>   v4:-  Move register map table to xe_hw_engine.c
>   v3:-  Remove condition compilation in code
>   v2:-  Split into multiple chunks
> 
> Zhanjun Dong (8):
>   drm/xe/guc: Add kconfig for GuC based register capture
>   drm/xe/guc: Update GuC ADS size for error capture
>   drm/xe/guc: Add XE_LP steered register lists
>   drm/xe/guc: Add capture size check in GuC log buffer
>   drm/xe/guc: Check sizing of guc_capture output
>   drm/xe/guc: Extract GuC error capture lists on G2H notification
>   drm/xe/guc: Pre-allocate output nodes for extraction
>   drm/xe/guc: Plumb GuC-capture into dev coredump
> 
>  drivers/gpu/drm/xe/Kconfig               |   11 +
>  drivers/gpu/drm/xe/Makefile              |    1 +
>  drivers/gpu/drm/xe/abi/guc_actions_abi.h |    7 +
>  drivers/gpu/drm/xe/regs/xe_gt_regs.h     |    5 +
>  drivers/gpu/drm/xe/xe_gt_printk.h        |    3 +
>  drivers/gpu/drm/xe/xe_guc.c              |    6 +
>  drivers/gpu/drm/xe/xe_guc_ads.c          |  229 +++-
>  drivers/gpu/drm/xe/xe_guc_ads_types.h    |    2 +
>  drivers/gpu/drm/xe/xe_guc_capture.c      | 1332 ++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_guc_capture.h      |   21 +
>  drivers/gpu/drm/xe/xe_guc_capture_fwif.h |  221 ++++
>  drivers/gpu/drm/xe/xe_guc_ct.c           |    2 +
>  drivers/gpu/drm/xe/xe_guc_fwif.h         |   68 ++
>  drivers/gpu/drm/xe/xe_guc_log.c          |  179 +++
>  drivers/gpu/drm/xe/xe_guc_log.h          |   15 +
>  drivers/gpu/drm/xe/xe_guc_log_types.h    |   26 +
>  drivers/gpu/drm/xe/xe_guc_submit.c       |   21 +-
>  drivers/gpu/drm/xe/xe_guc_submit.h       |    3 +
>  drivers/gpu/drm/xe/xe_guc_types.h        |    2 +
>  drivers/gpu/drm/xe/xe_hw_engine.c        |  261 ++++-
>  drivers/gpu/drm/xe/xe_hw_engine.h        |    4 +
>  drivers/gpu/drm/xe/xe_hw_engine_types.h  |  117 +-
>  22 files changed, 2421 insertions(+), 115 deletions(-)
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_capture.c
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_capture.h
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_capture_fwif.h
> 



More information about the Intel-xe mailing list