[PATCH v2 2/2] drm/xe/guc: Only add GuC crash dump if available
Dong, Zhanjun
zhanjun.dong at intel.com
Tue Apr 8 14:51:26 UTC 2025
Thanks for review, please see my comments inline below.
Regards,
Zhanjun Dong
On 2025-04-03 5:46 p.m., John Harrison wrote:
> On 3/27/2025 4:40 PM, Zhanjun Dong wrote:
>> Add flag of GuC crash dump received. LFD only include crash dump
>> section when crash dump is available.
>>
>> Signed-off-by: Zhanjun Dong <zhanjun.dong at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_guc_ct.c | 13 +++++++-----
>> drivers/gpu/drm/xe/xe_guc_log.c | 30 +++++++++++++++++++++++++++
>> drivers/gpu/drm/xe/xe_guc_log_types.h | 2 ++
>> 3 files changed, 40 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/
>> xe_guc_ct.c
>> index 72ad576fc18e..44c11ec662e5 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
>> @@ -1127,12 +1127,15 @@ static int guc_crash_process_msg(struct
>> xe_guc_ct *ct, u32 action)
>> {
>> struct xe_gt *gt = ct_to_gt(ct);
>> - if (action == XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED)
>> + if (action == XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED) {
>> xe_gt_err(gt, "GuC Crash dump notification\n");
>> - else if (action == XE_GUC_ACTION_NOTIFY_EXCEPTION)
>> - xe_gt_err(gt, "GuC Exception notification\n");
>> - else
>> - xe_gt_err(gt, "Unknown GuC crash notification: 0x%04X\n",
>> action);
>> + ct_to_guc(ct)->log.crash_dumped = true;
> This will also need to be cleared in the GuC reset path. There is no
> guarantee that the log will be saved via the LFD system before a reset
> wipes it out. And then a subsequent save will see a stale crash dump.
Good point, meanwhile, I wonder if this "crash_dumped" could be removed.
If there is no crash, then crash dump is always all zero, as I already
has code to check this all zero, then this "crash_dumped" could be
removed.>
>> + } else {
>> + if (action == XE_GUC_ACTION_NOTIFY_EXCEPTION)
> You can use "} else if( ..." to avoid the unnecessary extra level of
> indentation.
Sure, will do
>
>> + xe_gt_err(gt, "GuC Exception notification\n");
>> + else
>> + xe_gt_err(gt, "Unknown GuC crash notification: 0x%04X\n",
>> action);
>> + }
>> CT_DEAD(ct, NULL, CRASH);
>> diff --git a/drivers/gpu/drm/xe/xe_guc_log.c b/drivers/gpu/drm/xe/
>> xe_guc_log.c
>> index 5659d60e41ab..29684393a62d 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_log.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_log.c
>> @@ -536,6 +536,36 @@ static uint xe_guc_log_save_to_lfd_buf(char *buf,
>> int size, u32 *guc_log_bin,
>> return len;
>> index += len;
>> + /* For Crash dump, rd/wr ptr has no effect, only add if
>> crash_dumped is true */
>> + if (log->crash_dumped) {
>> + struct guc_log_buffer_entry_list *entry;
>> +
>> + entry = &entry_list[GUC_LOG_BUFFER_STATE_HEADER_ENTRY_CRASH];
>> + if (entry->buf_size) {
>> + int i;
>> + u32 *buf32 = (u32 *)&bin[entry->offset];
>> +
>> + /* Check if crash dump section are all zero */
>> + for (i = 0; i < entry->buf_size / 4; i++)
>> + if (buf32[i])
>> + break;
>> +
>> + /* Buffer has non-zero data */
>> + if (i < entry->buf_size / 4) {
>> + len = xe_guc_log_add_typed_payload(&buf[index], size
>> - index,
>> + GUC_LFD_TYPE_FW_CRASH_DUMP,
>> + entry->buf_size,
>> + &bin[entry->offset]);
>> + if (len < 0)
>> + return len;
>> + index += len;
>> +
>> + /* Clear flag */
>> + log->crash_dumped = false;
>> + }
>> + }
>> + }
>> +
>> return index;
>> }
>> diff --git a/drivers/gpu/drm/xe/xe_guc_log_types.h b/drivers/gpu/drm/
>> xe/xe_guc_log_types.h
>> index b3d5c72ac752..d351f639727b 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_log_types.h
>> +++ b/drivers/gpu/drm/xe/xe_guc_log_types.h
>> @@ -46,6 +46,8 @@ struct xe_guc_log {
>> u32 level;
>> /** @bo: XE BO for GuC log */
>> struct xe_bo *bo;
>> + /** @crash_dumped: Indicate if crash dumped */
>> + bool crash_dumped;
>> /** @stats: logging related stats */
>> struct {
>> u32 sampled_overflow;
>
More information about the Intel-xe
mailing list