[PATCH 4/4] drm/xe/guc: Expose raw access to GuC log over debugfs

John Harrison john.c.harrison at intel.com
Mon May 13 16:53:19 UTC 2024


On 5/12/2024 08:36, Michal Wajdeczko wrote:
> We already provide the content of the GuC log in debugfs, but it
> is in a text format where each log dword is printed as a hexadecimal
> number, which does not scale well with large GuC log buffers.
>
> To allow more efficient access to the GuC log, which could benefit
> our CI systems, expose raw binary log data.  Besides avoiding the
> overhead of preparing a text-based GuC log file, the new GuC log
> file in binary format is also almost 3x smaller.
>
> Any existing script that expects the GuC log buffer in text format
> can use a command like the one below to convert from the new binary
> format:
>
> 	hexdump -e '4/4 "0x%08x " "\n"'
>
> but this shouldn't be the case as most decoders expect GuC log data
> in binary format.
I strongly disagree with this.

Efficiency and file size are not an issue when accessing the GuC log via
debugfs on actual hardware. They are an issue when dumping via dmesg, but
you definitely should not be dumping binary data to dmesg. Dumping binary
data is also much more dangerous and liable to corruption, because some
tool along the way tries to convert it to ASCII, or truncates it at the
first zero, etc. We ask for GuC logs from end users, in customer bug
reports, etc., all of which involve steps that we have no control over.

Converting the hexdump back to binary is trivial for those tools which 
require it. If you follow the acquisition and decoding instructions on 
the wiki page then it is all done for you automatically.
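
To illustrate how little is involved, here is a minimal userspace sketch 
of that conversion. It assumes the input contains only whitespace-separated 
32-bit hex words, as produced by the hexdump command quoted above; the 
function name hex_text_to_dwords() is made up for this example and is not 
taken from any existing tool. Any header lines in a real dump would need 
to be stripped first.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse whitespace-separated hex dwords
 * (e.g. "0xdeadbeef 0x00000001") from text into out[].
 *
 * Returns the number of dwords stored (at most max), or (size_t)-1
 * if a token is not a valid 32-bit hex value.
 */
static size_t hex_text_to_dwords(const char *text, uint32_t *out, size_t max)
{
	const char *p = text;
	size_t n = 0;

	while (n < max) {
		unsigned long v;
		char *end;

		/* Skip separators between words. */
		while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
			p++;
		if (*p == '\0')
			break;

		/* strtoul() would silently accept a sign; reject it here. */
		if (*p == '-' || *p == '+')
			return (size_t)-1;

		errno = 0;
		v = strtoul(p, &end, 16);	/* base 16 accepts a "0x" prefix */
		if (end == p || errno || v > UINT32_MAX)
			return (size_t)-1;

		out[n++] = (uint32_t)v;
		p = end;
	}

	return n;
}
```

Writing out[] to a file with fwrite() then gives the binary form in host 
byte order, which matches what hexdump read as long as both steps run on 
machines with the same endianness.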

These patches are trying to solve a problem which does not exist and are 
going to make working with GuC logs harder and more error prone.

On the other hand, there are many other issues with GuC logs that it
would be useful to solve, including extra metadata, reliable output
via dmesg, continuous streaming, pre-sizing the debugfs file to not have
to generate it ~12 times for a single read, etc.

Hmm. Actually, is this interface allowing the filesystem layers to issue 
multiple read calls to read the buffer out in small chunks? That is also 
going to break things. If the GuC is still writing to the log as the 
user is reading from it, there is the opportunity for each chunk to not 
follow on from the previous chunk because the data has just been 
overwritten. This is already a problem at the moment that causes issues 
when decoding the logs, even with an almost atomic copy of the log into 
a temporary buffer before reading it out. Doing the read in separate 
chunks is only going to make that problem even worse.
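
The effect can be shown with a small single-threaded userspace simulation. 
Everything in it is hypothetical (the names, the 8-dword "log", the writer 
that stamps the whole buffer with a generation number); it is not the xe 
driver code and it does not model the real GuC log layout. It only shows 
that data copied out in chunks can mix generations when the writer runs 
between chunks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_WORDS 8

/* Hypothetical stand-in for the GuC: stamp the whole log with 'gen'. */
static void fake_guc_write(uint32_t *log, uint32_t gen)
{
	for (size_t i = 0; i < LOG_WORDS; i++)
		log[i] = gen;
}

/*
 * Chunked read: copy 'chunk' dwords at a time and let the writer run
 * after each chunk, as it could between separate read() calls.
 */
static void read_chunked(uint32_t *log, uint32_t *dst, size_t chunk)
{
	uint32_t gen = 1;

	for (size_t off = 0; off < LOG_WORDS; off += chunk) {
		size_t n = LOG_WORDS - off < chunk ? LOG_WORDS - off : chunk;

		memcpy(dst + off, log + off, n * sizeof(*dst));
		fake_guc_write(log, ++gen);
	}
}

/* Snapshot read: one copy up front; later writes cannot affect dst. */
static void read_snapshot(const uint32_t *log, uint32_t *dst)
{
	memcpy(dst, log, LOG_WORDS * sizeof(*dst));
}

/* Returns 1 if every dword in buf comes from the same generation. */
static int is_consistent(const uint32_t *buf)
{
	for (size_t i = 1; i < LOG_WORDS; i++)
		if (buf[i] != buf[0])
			return 0;
	return 1;
}
```

With a chunk size of 4, read_chunked() returns the first half from one 
generation and the second half from the next, while read_snapshot() 
returns a single generation. Note that the simulation makes the snapshot 
look perfectly atomic only because nothing runs concurrently with the 
memcpy(); on real hardware the copy is only "almost atomic", as said above.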

John.

> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> Cc: John Harrison <John.C.Harrison at Intel.com>
> ---
> Cc: linux-fsdevel at vger.kernel.org
> Cc: dri-devel at lists.freedesktop.org
> ---
>   drivers/gpu/drm/xe/xe_guc_debugfs.c | 26 ++++++++++++++++++++++++++
>   1 file changed, 26 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_debugfs.c b/drivers/gpu/drm/xe/xe_guc_debugfs.c
> index d3822cbea273..53fea952344d 100644
> --- a/drivers/gpu/drm/xe/xe_guc_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_guc_debugfs.c
> @@ -8,6 +8,7 @@
>   #include <drm/drm_debugfs.h>
>   #include <drm/drm_managed.h>
>   
> +#include "xe_bo.h"
>   #include "xe_device.h"
>   #include "xe_gt.h"
>   #include "xe_guc.h"
> @@ -52,6 +53,29 @@ static const struct drm_info_list debugfs_list[] = {
>   	{"guc_log", guc_log, 0},
>   };
>   
> +static ssize_t guc_log_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
> +{
> +	struct dentry *dent = file_dentry(file);
> +	struct dentry *uc_dent = dent->d_parent;
> +	struct dentry *gt_dent = uc_dent->d_parent;
> +	struct xe_gt *gt = gt_dent->d_inode->i_private;
> +	struct xe_guc_log *log = &gt->uc.guc.log;
> +	struct xe_device *xe = gt_to_xe(gt);
> +	ssize_t ret;
> +
> +	xe_pm_runtime_get(xe);
> +	ret = xe_map_read_from(xe, buf, count, pos, &log->bo->vmap, log->bo->size);
> +	xe_pm_runtime_put(xe);
> +
> +	return ret;
> +}
> +
> +static const struct file_operations guc_log_ops = {
> +	.owner		= THIS_MODULE,
> +	.read		= guc_log_read,
> +	.llseek		= default_llseek,
> +};
> +
>   void xe_guc_debugfs_register(struct xe_guc *guc, struct dentry *parent)
>   {
>   	struct drm_minor *minor = guc_to_xe(guc)->drm.primary;
> @@ -72,4 +96,6 @@ void xe_guc_debugfs_register(struct xe_guc *guc, struct dentry *parent)
>   	drm_debugfs_create_files(local,
>   				 ARRAY_SIZE(debugfs_list),
>   				 parent, minor);
> +
> +	debugfs_create_file("guc_log_raw", 0600, parent, NULL, &guc_log_ops);
>   }


