[PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target
John Harrison
john.c.harrison at intel.com
Thu Jun 12 17:49:38 UTC 2025
On 6/12/2025 7:32 AM, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Harrison, John C <john.c.harrison at intel.com>
> Sent: Wednesday, June 11, 2025 4:57 PM
> To: Cavitt, Jonathan <jonathan.cavitt at intel.com>; Intel-Xe at Lists.FreeDesktop.Org
> Subject: Re: [PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target
>> On 6/11/2025 3:04 PM, Cavitt, Jonathan wrote:
>>> -----Original Message-----
>>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of John.C.Harrison at Intel.com
>>> Sent: Wednesday, June 11, 2025 2:06 PM
>>> To: Intel-Xe at Lists.FreeDesktop.Org
>>> Cc: Harrison, John C <john.c.harrison at intel.com>
>>> Subject: [PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target
>>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>>
>>>> The GuC has an option to write log data via NPK. This is basically a
>>>> magic IO address that GuC writes arbitrary data to and which can be
>>>> logged by a suitable hardware logger. This can allow retrieval of the
>>>> GuC log in hardware debug environments even when the system as a whole
>>>> dies horribly.
>>>>
>>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>> So, this is basically a new modparam value that redirects GuC logs to
>>> a specific IO address? I take it guc_log_target = 2 is the default value, and
>>> guc_log_target = 1 would print the logs to stdout, then? I'd ask why we
>>> use 0 as a default value and not just default to 2 all the time, but I think I
>>> already know why (we need to guard against guc_log_target = 0 anyways
>>> to prevent printing to stdin).
>> Um, read the patch - "(0=memory [default], 1 = NPK, 2 = memory + NPK)".
>> The default is zero. And no, nothing prints to stdout. This is about
>> hardware level debugging. It has nothing to do with stdin/stdout/stderr.
>> Those concepts do not exist in hardware nor in the KMD. If you send the
>> GuC log to the NPK target then you need a hardware debugger (JTAG, etc.)
>> to read it, as described in the commit message.
> Ah, okay. When I read "This is basically a magic IO address that GuC writes
> arbitrary data to", I thought that was indicating that we're writing to the
> in/out/err IO addresses, and that the data being logged "by a suitable
You say "the in/out/err IO addresses" like there is such a thing.
In/out/err generally refers to stdin/stdout/stderr, being the default
first three file handles of a unix process. File handles are not IO
addresses. Such file handles also do not exist in the kernel. And they
absolutely do not exist inside the GT hardware. My comment was referring
to memory mapped IO addresses, i.e. hardware registers. On a normal
system, there is nothing connected to said hardware so the logged output
goes nowhere. However, if you have a hardware debugger attached to the
system then it can trap those accesses and record the log.
> hardware logger" was occurring separately. I guess I also thought NPK
> was a writing protocol and not a hardware address.
NPK is neither a protocol nor an address. It is a block of silicon
called North Peak, also known as the Intel Trace Hub.
>
> It didn't occur to me that we'd need to directly write the logs to the
> hardware logger in cases of catastrophic failure because we already have
> methods of streaming the logs directly via the serial ports. Though I
> suppose that we're talking about different "logs" at this point?
I don't know what logs you are talking about. This patch is quite
clearly only talking about the GuC log. Which is generally accessible
via debugfs snapshots or as part of a devcoredump capture. It is not
ever 'streamed directly via the serial ports'.
John.
>
> The reviewed-by still stands.
> -Jonathan Cavitt
>
>>> I also take it this is modified on boot by, for example, writing
>>> "xe.guc_log_target=1" to CMDLINE_LINUX_DEFAULT as a part of the grub file.
>> That is generally how module parameters work. You can also set via
>> modprobe.conf files as long as the Xe driver is a module and not
>> compiled in.
>>
>> John.
>>
>>> Yeah, seems good.
>>> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>>> -Jonathan Cavitt
>>>
>>>> ---
>>>> drivers/gpu/drm/xe/xe_guc.c | 4 ++++
>>>> drivers/gpu/drm/xe/xe_module.c | 4 ++++
>>>> drivers/gpu/drm/xe/xe_module.h | 1 +
>>>> 3 files changed, 9 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
>>>> index e16d19b44bcc..9c0e3113f7d5 100644
>>>> --- a/drivers/gpu/drm/xe/xe_guc.c
>>>> +++ b/drivers/gpu/drm/xe/xe_guc.c
>>>> @@ -35,6 +35,7 @@
>>>> #include "xe_guc_submit.h"
>>>> #include "xe_memirq.h"
>>>> #include "xe_mmio.h"
>>>> +#include "xe_module.h"
>>>> #include "xe_platform_types.h"
>>>> #include "xe_sriov.h"
>>>> #include "xe_uc.h"
>>>> @@ -74,6 +75,9 @@ static u32 guc_ctl_debug_flags(struct xe_guc *guc)
>>>> else
>>>> flags |= FIELD_PREP(GUC_LOG_VERBOSITY, GUC_LOG_LEVEL_TO_VERBOSITY(level));
>>>>
>>>> + if (xe_modparam.guc_log_target)
>>>> + flags |= FIELD_PREP(GUC_LOG_DESTINATION, xe_modparam.guc_log_target);
>>>> +
>>>> return flags;
>>>> }
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
>>>> index 1c4dfafbcd0b..fc8c681819b9 100644
>>>> --- a/drivers/gpu/drm/xe/xe_module.c
>>>> +++ b/drivers/gpu/drm/xe/xe_module.c
>>>> @@ -21,6 +21,7 @@
>>>> struct xe_modparam xe_modparam = {
>>>> .probe_display = true,
>>>> .guc_log_level = 3,
>>>> + .guc_log_target = 0,
>>>> .force_probe = CONFIG_DRM_XE_FORCE_PROBE,
>>>> #ifdef CONFIG_PCI_IOV
>>>> .max_vfs = IS_ENABLED(CONFIG_DRM_XE_DEBUG) ? ~0 : 0,
>>>> @@ -45,6 +46,9 @@ MODULE_PARM_DESC(vram_bar_size, "Set the vram bar size (in MiB) - <0=disable-res
>>>> module_param_named(guc_log_level, xe_modparam.guc_log_level, int, 0600);
>>>> MODULE_PARM_DESC(guc_log_level, "GuC firmware logging level (0=disable, 1..5=enable with verbosity min..max)");
>>>>
>>>> +module_param_named(guc_log_target, xe_modparam.guc_log_target, int, 0600);
>>>> +MODULE_PARM_DESC(guc_log_target, "GuC firmware logging target (0=memory [default], 1 = NPK, 2 = memory + NPK)");
>>>> +
>>>> module_param_named_unsafe(guc_firmware_path, xe_modparam.guc_firmware_path, charp, 0400);
>>>> MODULE_PARM_DESC(guc_firmware_path,
>>>> "GuC firmware path to use instead of the default one");
>>>> diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
>>>> index 5a3bfea8b7b4..4d978f6f26b6 100644
>>>> --- a/drivers/gpu/drm/xe/xe_module.h
>>>> +++ b/drivers/gpu/drm/xe/xe_module.h
>>>> @@ -14,6 +14,7 @@ struct xe_modparam {
>>>> bool probe_display;
>>>> u32 force_vram_bar_size;
>>>> int guc_log_level;
>>>> + int guc_log_target;
>>>> char *guc_firmware_path;
>>>> char *huc_firmware_path;
>>>> char *gsc_firmware_path;
>>>> --
>>>> 2.49.0
>>>>
>>>>
>>
More information about the Intel-xe
mailing list