[CI 12/12] [For CI] Debug prints/logs to debug G2H timeout issue

John Harrison john.c.harrison at intel.com
Mon Sep 23 17:09:49 UTC 2024


On 9/20/2024 07:29, Badal Nilawar wrote:
> From: bnilawar <badal.nilawar at intel.com>
>
> Debug print and GuC log in dmesg to debug possible G2H irq delay / miss
>
> Signed-off-by: Badal Nilawar <badal.nilawar at intel.com>
> ---
>   drivers/gpu/drm/xe/xe_guc.c    | 4 +++-
>   drivers/gpu/drm/xe/xe_guc_ct.c | 2 ++
>   2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index 41ff4fe65f8b..00c0550fb80f 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -1094,8 +1094,10 @@ int xe_guc_self_cfg64(struct xe_guc *guc, u16 key, u64 val)
>   
>   void xe_guc_irq_handler(struct xe_guc *guc, const u16 iir)
>   {
> -	if (iir & GUC_INTR_GUC2HOST)
> +	if (iir & GUC_INTR_GUC2HOST) {
> +		xe_gt_dbg(guc_to_gt(guc), "G2H IRQ GT[%d]\n", guc_to_gt(guc)->info.id);
>   		xe_guc_ct_irq_handler(&guc->ct);
> +	}
>   }
>   
>   void xe_guc_sanitize(struct xe_guc *guc)
> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> index 5546d4f87ebb..8f2d046bf1b5 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> @@ -1026,6 +1026,7 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
>   		xe_gt_err(gt, "Timed out wait for G2H, fence %u, action %04x",
>   			  g2h_fence.seqno, action[0]);
>   		xa_erase_irq(&ct->fence_lookup, g2h_fence.seqno);
> +		xe_guc_log_print_dmesg(&ct_to_guc(ct)->log);
>   		return -ETIME;
>   	}
>   
> @@ -1141,6 +1142,7 @@ static int parse_g2h_response(struct xe_guc_ct *ct, u32 *msg, u32 len)
>   		xe_gt_warn(gt, "G2H fence (%u) not found!\n", fence);
>   		CT_DEAD(ct, NULL, PARSE_G2H_UNKNOWN);
>   		g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);
> +		xe_guc_log_print_dmesg(&ct_to_guc(ct)->log);
The whole point of the CT_DEAD call two lines above is to dump the GuC 
log, plus a whole bunch of other useful state, to dmesg. That includes 
things like the CT contents, which could be helpful given the 
description of the issue you are investigating. Adding an extra GuC log 
print on top just duplicates the spew.

Likewise, in guc_ct_send_recv() above, it might be better to add a new 
CT_DEAD line rather than calling xe_guc_log_print_dmesg() directly.
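
To illustrate the suggestion, here is a rough, standalone sketch of the 
pattern rather than actual driver code. Everything in it is a 
hypothetical stand-in: struct mock_ct, mock_ct_dead_capture(), the 
two-argument CT_DEAD() macro, and in particular the G2H_TIMEOUT reason 
code, which is an assumed new name and not something the driver is known 
to define. The idea is only that the timeout path reports through the 
same capture helper as the other failure paths, so the dump is emitted 
in one place.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical reason codes; the real driver keeps its own list. */
enum ct_dead_reason {
	CT_DEAD_PARSE_G2H_UNKNOWN = 1 << 0,
	CT_DEAD_G2H_TIMEOUT       = 1 << 1, /* assumed new code for the timeout path */
};

/* Minimal mock of the CT state relevant to this sketch. */
struct mock_ct {
	unsigned int dead_reasons; /* bitmask of reasons recorded so far */
	int dumps;                 /* number of state dumps emitted */
};

/* Single place that records the reason and emits the (mock) dump. */
static void mock_ct_dead_capture(struct mock_ct *ct, enum ct_dead_reason reason)
{
	assert(ct);
	ct->dead_reasons |= reason;
	ct->dumps++;
	printf("CT dead, reason mask 0x%x: dump GuC log and CT state here\n",
	       ct->dead_reasons);
}

/* Simplified two-argument form; the macro quoted in the patch takes three. */
#define CT_DEAD(ct, reason_code) \
	mock_ct_dead_capture((ct), CT_DEAD_##reason_code)

/* Mock of the timeout branch: one CT_DEAD call, no separate log print. */
static int mock_send_recv_timeout(struct mock_ct *ct)
{
	CT_DEAD(ct, G2H_TIMEOUT);
	return -ETIME;
}
```

Calling mock_send_recv_timeout() on a zero-initialised struct mock_ct 
records the timeout reason and produces exactly one dump, which is the 
property being argued for: one reporting path, no duplicated log output.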

John.

>   		return -EPROTO;
>   	}
>   


