[PATCH 1/2] drm/xe/guc: Fix out-of-bound while enabling engine activity stats
Riana Tauro
riana.tauro at intel.com
Tue Apr 15 07:29:23 UTC 2025
Hi Michal
On 4/15/2025 1:53 AM, Michal Wajdeczko wrote:
> In the PF mode we allocate array of struct engine_activity_group
> that holds activity data split for the PF and all potential VFs.
> But while preparing data for use by VFs we ended with bad index.
>
> [ ] BUG: KASAN: slab-out-of-bounds in xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe]
> [ ] Call Trace:
> [ ] <TASK>
> [ ] dump_stack_lvl+0x91/0xf0
> [ ] print_report+0xd1/0x680
> [ ] ? __virt_addr_valid+0x23a/0x440
> [ ] ? kasan_addr_to_slab+0xd/0xb0
> [ ] kasan_report+0xe7/0x130
> [ ] ? xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe]
> [ ] ? xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe]
> [ ] __asan_report_store8_noabort+0x17/0x30
> [ ] xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe]
> [ ] pf_engine_activity_stats+0x1b6/0x7f0 [xe]
> [ ] ? kobject_put+0x5f/0x470
> [ ] xe_pci_sriov_configure+0x28c9/0x3270 [xe]
> [ ] ? __pfx_dev_attr_store+0x10/0x10
> [ ] ? kstrtoull+0x3b/0x70
> [ ] ? __pfx___lock_acquire+0x10/0x10
> [ ] ? kstrtou16+0x65/0xf0
> [ ] sriov_numvfs_store+0x20c/0x400
> [ ] ? __pfx_sriov_numvfs_store+0x10/0x10
> [ ] ? __pfx__copy_from_iter+0x10/0x10
> [ ] ? __pfx_dev_attr_store+0x10/0x10
> [ ] dev_attr_store+0x3b/0x80
> [ ] ? sysfs_file_ops+0x135/0x190
>
> Fixes: 2de3f38fbf89 ("drm/xe: Add support for per-function engine activity")
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
> Cc: Riana Tauro <riana.tauro at intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_engine_activity.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_engine_activity.c b/drivers/gpu/drm/xe/xe_guc_engine_activity.c
> index b96fea78df8b..0fb48f8f05d8 100644
> --- a/drivers/gpu/drm/xe/xe_guc_engine_activity.c
> +++ b/drivers/gpu/drm/xe/xe_guc_engine_activity.c
> @@ -304,6 +304,8 @@ static void engine_activity_set_cpu_ts(struct xe_guc *guc, unsigned int index)
> struct engine_activity_group *eag = &engine_activity->eag[index];
> int i, j;
>
> + xe_gt_assert(guc_to_gt(guc), index < engine_activity->num_activity_group);
This would happen only when num_vfs is greater than total_vfs since
num_activity_group is total_vfs+1. enable_vfs has that check> +
> for (i = 0; i < GUC_MAX_ENGINE_CLASSES; i++)
> for (j = 0; j < GUC_MAX_INSTANCES_PER_CLASS; j++)
> eag->engine[i][j].last_cpu_ts = ktime_get();
> @@ -374,8 +376,9 @@ static int engine_activity_enable_function_stats(struct xe_guc *guc, int num_vfs
> return ret;
> }
>
> - for (i = 0; i < engine_activity->num_functions; i++)
> - engine_activity_set_cpu_ts(guc, i + 1);
> + /* skip PF as it was already setup */
> + for (i = 1; i < engine_activity->num_functions; i++)
> + engine_activity_set_cpu_ts(guc, i);
sorry, missed this when removing global.
Thanks for fixing this. Assert seems unnecessary
Reviewed-by: Riana Tauro <riana.tauro at intel.com>>
> return 0;
> }
More information about the Intel-xe
mailing list