[PATCH 1/2] drm/xe/guc: Default log level to non-verbose

John Harrison john.c.harrison at intel.com
Thu Jun 12 18:21:38 UTC 2025


On 6/12/2025 11:05 AM, Lucas De Marchi wrote:
> Currently xe sets the guc log level to a verbose level since it's useful
> to debug hangs and general development. However the verbose level may
> already be too much and affect performance.
>
> Michal Mrozek did some tests with the L0 compute stack for submission
> latency with ULLS disabled. Below are the normalized numbers with log
> level 3 (the current default) as baseline for each test:
>
>                            Test \ Log Level                        3      0      1      2
>   ----------------------------------------------------------- ------ ------ ------ ------
>    BestWalkerNthCommandListSubmission(CmdListCount=2)           1.00   0.63   0.63   0.96
>    BestWalkerNthSubmission(KernelCount=2)                       1.00   0.62   0.63   0.96
>    BestWalkerNthSubmissionImmediate(KernelCount=2)              1.00   0.58   0.58   0.85
>    BestWalkerSubmission                                         1.00   0.62   0.62   0.96
>    BestWalkerSubmissionImmediate                                1.00   0.63   0.62   0.96
>    BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=2)   1.00   0.58   0.58   0.86
>    BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=4)   1.00   0.70   0.70   0.83
>    BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=8)   1.00   0.53   0.52   0.78
>
> Log level 2 is the first "verbose level" for GuC, where the biggest
> difference happens. Keep log level 3 for CONFIG_DRM_XE_DEBUG, but switch
> to 1, i.e.  GUC_LOG_LEVEL_NON_VERBOSE, for "normal" builds.
Note that this performance impact is understood, although it was not
realised quite how much of a hit it was on this benchmark. The impact
comes from logging around context switches, which adds a few
microseconds to each context switch. In general, this is not noticeable
because the context switch time is negligible compared to the runtime of
the workload itself. However, I'm guessing from the name that this
benchmark specifically measures context switch performance with empty
workloads, which makes it the pathological worst case for the impact of
the logging.
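
To put rough numbers on that reasoning, here is a minimal sketch in C. The function name and every figure used with it are made up for illustration; none of them are measurements from GuC or from the benchmark above. It models the normalized latency as the time without the extra logging divided by the time with it.

```c
#include <assert.h>

/*
 * Hypothetical helper, not driver code.
 *
 * base_us:         assumed per-submission time without verbose logging
 *                  (context switch plus workload runtime), in microseconds
 * log_overhead_us: assumed extra time added by verbose logging, in
 *                  microseconds
 *
 * Returns the latency without logging normalized to the latency with
 * logging, i.e. the same kind of ratio as in the table (1.00 = no gain).
 */
double normalized_latency(double base_us, double log_overhead_us)
{
	assert(base_us > 0.0);
	assert(log_overhead_us >= 0.0);

	return base_us / (base_us + log_overhead_us);
}
```

With an assumed 3 us of logging overhead, normalized_latency(5.0, 3.0) gives 0.625 for a near-empty submission, while normalized_latency(1005.0, 3.0) stays above 0.99 once a 1 ms workload is included. That is the sense in which an empty-workload benchmark is the worst case.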

Anyway, not logging in release builds is generally a good idea and 
better benchmark scores are always good :).

Reviewed-by: John Harrison <John.C.Harrison at Intel.com>

>
> Cc: Michal Mrozek <michal.mrozek at intel.com>
> Cc: John Harrison <John.C.Harrison at Intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
> ---
>   drivers/gpu/drm/xe/xe_module.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> index 1c4dfafbcd0bc..4809afa7ce3f9 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -20,7 +20,7 @@
>   
>   struct xe_modparam xe_modparam = {
>   	.probe_display = true,
> -	.guc_log_level = 3,
> +	.guc_log_level = IS_ENABLED(CONFIG_DRM_XE_DEBUG) ? 3 : 1,
>   	.force_probe = CONFIG_DRM_XE_FORCE_PROBE,
>   #ifdef CONFIG_PCI_IOV
>   	.max_vfs = IS_ENABLED(CONFIG_DRM_XE_DEBUG) ? ~0 : 0,
>
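
For reference, the effect of the one-line change can be sketched outside the kernel roughly as below. This is only an approximation under stated assumptions: XE_DEBUG_BUILD is a hypothetical stand-in for IS_ENABLED(CONFIG_DRM_XE_DEBUG) (the real macro comes from <linux/kconfig.h>), and the struct and function names are invented for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's IS_ENABLED(CONFIG_DRM_XE_DEBUG). */
#ifdef CONFIG_DRM_XE_DEBUG
#define XE_DEBUG_BUILD 1
#else
#define XE_DEBUG_BUILD 0
#endif

/* Value taken from the commit message: level 1 is the non-verbose level. */
enum { GUC_LOG_LEVEL_NON_VERBOSE = 1 };

/* Invented struct; the real one is struct xe_modparam in the driver. */
struct modparam_sketch {
	int guc_log_level;
};

/* Debug builds keep level 3, all other builds default to non-verbose. */
static const struct modparam_sketch modparam_sketch = {
	.guc_log_level = XE_DEBUG_BUILD ? 3 : GUC_LOG_LEVEL_NON_VERBOSE,
};

/* Returns the compile-time default chosen above. */
int default_guc_log_level(void)
{
	assert(modparam_sketch.guc_log_level >= 0);
	return modparam_sketch.guc_log_level;
}
```

Built without -DCONFIG_DRM_XE_DEBUG, default_guc_log_level() returns 1; built with it, the same function returns 3.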


