[PATCH v9 04/11] drm/xe/devcoredump: Add ASCII85 dump helper function

Thu Dec 12 18:45:30 UTC 2024

On Thu, Dec 12, 2024 at 05:41:41PM +0000, Jose Souza wrote:
>On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison at Intel.com wrote:
>> From: John Harrison <John.C.Harrison at Intel.com>
>>
>> There is a need to include the GuC log and other large binary objects
>> in core dumps and via dmesg. So add a helper for dumping to a printer
>> function via conversion to ASCII85 encoding.
>>
>> Another issue with dumping such a large buffer is that it can be slow,
>> especially if dumping to dmesg over a serial port. So add a yield to
>> prevent the 'task has been stuck for 120s' kernel hang check feature
>> from firing.
>>
>> v2: Add a prefix to the output string. Fix memory allocation bug.
>> v3: Correct a string size calculation and clean up a define (review
>> feedback from Julia F).
>>
>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>> Reviewed-by: Julia Filipchuk <julia.filipchuk at intel.com>
>> ---
>>  drivers/gpu/drm/xe/xe_devcoredump.c | 87 +++++++++++++++++++++++++++++
>>  drivers/gpu/drm/xe/xe_devcoredump.h |  6 ++
>>  2 files changed, 93 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
>> index 2690f1d1cde4..0884c49942fe 100644
>> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
>> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
>> @@ -6,6 +6,7 @@
>>  #include "xe_devcoredump.h"
>>  #include "xe_devcoredump_types.h"
>>
>> +#include <linux/ascii85.h>
>>  #include <linux/devcoredump.h>
>>  #include <generated/utsrelease.h>
>>
>> @@ -315,3 +316,89 @@ int xe_devcoredump_init(struct xe_device *xe)
>>  }
>>
>>  #endif
>> +
>> +/**
>> + * xe_print_blob_ascii85 - print a BLOB to some useful location in ASCII85
>> + *
>> + * The output is split to multiple lines because some print targets, e.g. dmesg
>> + * cannot handle arbitrarily long lines. Note also that printing to dmesg in
>> + * piece-meal fashion is not possible, each separate call to drm_puts() has a
>> + * line-feed automatically added! Therefore, the entire output line must be
>> + * constructed in a local buffer first, then printed in one atomic output call.
>> + *
>> + * There is also a scheduler yield call to prevent the 'task has been stuck for
>> + * 120s' kernel hang check feature from firing when printing to a slow target
>> + * such as dmesg over a serial port.
>> + *
>> + * TODO: Add compression prior to the ASCII85 encoding to shrink huge buffers down.
>> + *
>> + * @p: the printer object to output to
>> + * @prefix: optional prefix to add to output string
>> + * @blob: the Binary Large OBject to dump out
>> + * @offset: offset in bytes to skip from the front of the BLOB, must be a multiple of sizeof(u32)
>> + * @size: the size in bytes of the BLOB, must be a multiple of sizeof(u32)
>> + */
>> +void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix,
>> +			   const void *blob, size_t offset, size_t size)
>> +{
>> +	const u32 *blob32 = (const u32 *)blob;
>> +	char buff[ASCII85_BUFSZ], *line_buff;
>> +	size_t line_pos = 0;
>> +
>> +#define DMESG_MAX_LINE_LEN	800
>> +#define MIN_SPACE		(ASCII85_BUFSZ + 2)		/* 85 + "\n\0" */
>> +
>> +	if (size & 3)
>> +		drm_printf(p, "Size not word aligned: %zu", size);
>> +	if (offset & 3)
>> +		drm_printf(p, "Offset not word aligned: %zu", size);
>> +
>> +	line_buff = kzalloc(DMESG_MAX_LINE_LEN, GFP_KERNEL);
>> +	if (IS_ERR_OR_NULL(line_buff)) {
>> +		drm_printf(p, "Failed to allocate line buffer: %pe", line_buff);
>> +		return;
>> +	}
>> +
>> +	blob32 += offset / sizeof(*blob32);
>> +	size /= sizeof(*blob32);
>> +
>> +	if (prefix) {
>> +		strscpy(line_buff, prefix, DMESG_MAX_LINE_LEN - MIN_SPACE - 2);
>> +		line_pos = strlen(line_buff);
>> +
>> +		line_buff[line_pos++] = ':';
>> +		line_buff[line_pos++] = ' ';
>> +	}
>> +
>> +	while (size--) {
>> +		u32 val = *(blob32++);
>> +
>> +		strscpy(line_buff + line_pos, ascii85_encode(val, buff),
>> +			DMESG_MAX_LINE_LEN - line_pos);
>> +		line_pos += strlen(line_buff + line_pos);
>> +
>> +		if ((line_pos + MIN_SPACE) >= DMESG_MAX_LINE_LEN) {
>> +			line_buff[line_pos++] = '\n';
>> +			line_buff[line_pos++] = 0;
>
>This breaks ascii85 parser that we had up to now.
>And I think there is not safe way to parse it now, how would the parser know that the blob reach to end?

Just looked at this code to check if we also had a "Size" as I remember
seeing it in previous related patches, but we don't. If
we did then you could use that to calculate how much you still had to
read, ignoring the \n. With the current scheme I think one
way would be to continue the previous key when the line doesn't start
with a new key. Awful, yes, and not future proof.

John, I explicitly said *we can't break the existent users*,
pointed to the one in the mesa repo and confirmed with José it would be
indeed a breakage. Please check again the past email thread at
https://lore.kernel.org/all/3jexgpnh3br3gqi4ol4c3hx3tyhwevq5nqo5xssyie3xglqohz@e7mnj4dewupu/

Did you at least prepare the mesa parser for that additional \n?

We shouldn't have merged the kernel patch as is with the excuse it reads
better in dmesg, when we also said we should not print that garbage to
dmesg :(

Lucas De Marchi