[PATCH v2 1/2] drm/xe: Fix and re-enable xe_print_blob_ascii85()

John Harrison john.c.harrison at intel.com
Thu Jan 23 21:16:51 UTC 2025


On 1/23/2025 13:14, John Harrison wrote:
> On 1/23/2025 12:22, José Roberto de Souza wrote:
>> From: Lucas De Marchi <lucas.demarchi at intel.com>
>>
>> Commit 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa
>> debug tool") partially reverted some changes to workaround breakage
>> caused to mesa tools. However, in doing so it also broke fetching the
>> GuC log via debugfs since xe_print_blob_ascii85() simply bails out.
>>
>> The fix is to avoid the extra newlines: the devcoredump interface is
>> line-oriented and adding random newlines in the middle breaks it. If a
>> tool is able to parse it by looking at the data and checking for chars
>> that are out of the ascii85 space, it can still do so. A format change
>> that breaks the line-oriented output on devcoredump however needs better
>> coordination with existing tools.
>>
>> v2:
>> - added suffix description comment
>>
>> Reviewed-by: José Roberto de Souza <jose.souza at intel.com>
>> Cc: John Harrison <John.C.Harrison at Intel.com>
>> Cc: Julia Filipchuk <julia.filipchuk at intel.com>
>> Cc: José Roberto de Souza <jose.souza at intel.com>
>> Cc: stable at vger.kernel.org
>> Fixes: 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa 
>> debug tool")
>> Fixes: ec1455ce7e35 ("drm/xe/devcoredump: Add ASCII85 dump helper 
>> function")
>> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_devcoredump.c | 33 +++++++++++------------------
>>   drivers/gpu/drm/xe/xe_devcoredump.h |  2 +-
>>   drivers/gpu/drm/xe/xe_guc_ct.c      |  3 ++-
>>   drivers/gpu/drm/xe/xe_guc_log.c     |  4 +++-
>>   4 files changed, 18 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c 
>> b/drivers/gpu/drm/xe/xe_devcoredump.c
>> index 81dc7795c0651..6f73b1ba0f2aa 100644
>> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
>> @@ -395,42 +395,33 @@ int xe_devcoredump_init(struct xe_device *xe)
>>   /**
>>    * xe_print_blob_ascii85 - print a BLOB to some useful location in 
>> ASCII85
>>    *
>> - * The output is split to multiple lines because some print targets, 
>> e.g. dmesg
>> - * cannot handle arbitrarily long lines. Note also that printing to 
>> dmesg in
>> - * piece-meal fashion is not possible, each separate call to 
>> drm_puts() has a
>> - * line-feed automatically added! Therefore, the entire output line 
>> must be
>> - * constructed in a local buffer first, then printed in one atomic 
>> output call.
>> + * The output is split to multiple print calls because some print 
>> targets, e.g.
>> + * dmesg cannot handle arbitrarily long lines. These targets may add 
>> newline
>> + * between calls.
> As per earlier comments, this change implies that dmesg output is now 
> supported as long as a newline is added between calls. That is very 
> definitely not the case.
>
>>    *
>>    * There is also a scheduler yield call to prevent the 'task has 
>> been stuck for
>>    * 120s' kernel hang check feature from firing when printing to a 
>> slow target
>>    * such as dmesg over a serial port.
>>    *
>> - * TODO: Add compression prior to the ASCII85 encoding to shrink 
>> huge buffers down.
>> - *
>>    * @p: the printer object to output to
>>    * @prefix: optional prefix to add to output string
>> + * @suffix: optional suffix to add at the end. 0 disables it and is
>> + *          not added to the output, which is useful when using 
>> multiple calls
>> + *          to dump data to @p
>>    * @blob: the Binary Large OBject to dump out
>>    * @offset: offset in bytes to skip from the front of the BLOB, 
>> must be a multiple of sizeof(u32)
>>    * @size: the size in bytes of the BLOB, must be a multiple of 
>> sizeof(u32)
>>    */
>> -void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix,
>> +void xe_print_blob_ascii85(struct drm_printer *p, const char 
>> *prefix, char suffix,
>>                  const void *blob, size_t offset, size_t size)
>>   {
>>       const u32 *blob32 = (const u32 *)blob;
>>       char buff[ASCII85_BUFSZ], *line_buff;
>>       size_t line_pos = 0;
>>   -    /*
>> -     * Splitting blobs across multiple lines is not compatible with 
>> the mesa
>> -     * debug decoder tool. Note that even dropping the explicit '\n' 
>> below
>> -     * doesn't help because the GuC log is so big some underlying 
>> implementation
>> -     * still splits the lines at 512K characters. So just bail 
>> completely for
>> -     * the moment.
>> -     */
>> -    return;
>> -
>>   #define DMESG_MAX_LINE_LEN    800
>> -#define MIN_SPACE        (ASCII85_BUFSZ + 2)        /* 85 + "\n\0" */
>> +    /* Always leave space for the suffix char and the \0 */
>> +#define MIN_SPACE        (ASCII85_BUFSZ + 2)    /* 85 + "<suffix>\0" */
>>         if (size & 3)
>>           drm_printf(p, "Size not word aligned: %zu", size);
>> @@ -462,7 +453,6 @@ void xe_print_blob_ascii85(struct drm_printer *p, 
>> const char *prefix,
>>           line_pos += strlen(line_buff + line_pos);
>>             if ((line_pos + MIN_SPACE) >= DMESG_MAX_LINE_LEN) {
>> -            line_buff[line_pos++] = '\n';
> Again, as already commented, do not completely remove this line. It is 
> an absolute requirement for dmesg output. And dmesg output is an 
> important debug facility.
>
> It should be temporarily commented out with a comment saying "this is 
> required for dumping to dmesg but currently breaks a mesa debug tool 
> so is disabled by default". That way it is clear what a developer 
> needs to do to re-enable dmesg output locally.
>
>>               line_buff[line_pos++] = 0;
>>                 drm_puts(p, line_buff);
>> @@ -474,10 +464,11 @@ void xe_print_blob_ascii85(struct drm_printer 
>> *p, const char *prefix,
>>           }
>>       }
>>   +    if (suffix)
>> +        line_buff[line_pos++] = suffix;
>> +
>>       if (line_pos) {
>> -        line_buff[line_pos++] = '\n';
>>           line_buff[line_pos++] = 0;
>> -
>>           drm_puts(p, line_buff);
>>       }
>>   diff --git a/drivers/gpu/drm/xe/xe_devcoredump.h 
>> b/drivers/gpu/drm/xe/xe_devcoredump.h
>> index 6a17e6d601022..5391a80a4d1ba 100644
>> --- a/drivers/gpu/drm/xe/xe_devcoredump.h
>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.h
>> @@ -29,7 +29,7 @@ static inline int xe_devcoredump_init(struct 
>> xe_device *xe)
>>   }
>>   #endif
>>   -void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix,
>> +void xe_print_blob_ascii85(struct drm_printer *p, const char 
>> *prefix, char suffix,
>>                  const void *blob, size_t offset, size_t size);
>>     #endif
>> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c 
>> b/drivers/gpu/drm/xe/xe_guc_ct.c
>> index 8b65c5e959cc2..50c8076b51585 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
>> @@ -1724,7 +1724,8 @@ void xe_guc_ct_snapshot_print(struct 
>> xe_guc_ct_snapshot *snapshot,
>>                  snapshot->g2h_outstanding);
>>             if (snapshot->ctb)
>> -            xe_print_blob_ascii85(p, "CTB data", snapshot->ctb, 0, 
>> snapshot->ctb_size);
>> +            xe_print_blob_ascii85(p, "CTB data", '\n',
>> +                          snapshot->ctb, 0, snapshot->ctb_size);
>>       } else {
>>           drm_puts(p, "CT disabled\n");
>>       }
>> diff --git a/drivers/gpu/drm/xe/xe_guc_log.c 
>> b/drivers/gpu/drm/xe/xe_guc_log.c
>> index 80151ff6a71f8..44482ea919924 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_log.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_log.c
>> @@ -207,8 +207,10 @@ void xe_guc_log_snapshot_print(struct 
>> xe_guc_log_snapshot *snapshot, struct drm_
>>       remain = snapshot->size;
>>       for (i = 0; i < snapshot->num_chunks; i++) {
>>           size_t size = min(GUC_LOG_CHUNK_SIZE, remain);
>> +        const char *prefix = i ? NULL : "Log data";
>> +        char suffix = i == snapshot->num_chunks - 1 ? '\n' : 0;
>>   -        xe_print_blob_ascii85(p, i ? NULL : "Log data", 
>> snapshot->copy[i], 0, size);
>> +        xe_print_blob_ascii85(p, prefix, suffix, snapshot->copy[i], 
>> 0, size);
> I thought you were saying that these need to follow the mesa 
> requirement of "[name].length" + "[name].data"?
>
> John.
Sorry, patch 2 wasn't showing up at first. Just saw it now!

John.

>
>
>>           remain -= size;
>>       }
>>   }
>



More information about the Intel-xe mailing list