[PATCH] drm/xe/guc: Check response data of CLIENT_SOFT_RESET action
Michal Wajdeczko
michal.wajdeczko at intel.com
Thu Feb 20 18:42:01 UTC 2025
On 19.02.2025 23:10, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of Michal Wajdeczko
> Sent: Wednesday, February 19, 2025 11:27 AM
> To: intel-xe at lists.freedesktop.org
> Cc: Wajdeczko, Michal <Michal.Wajdeczko at intel.com>; Dan Carpenter <dan.carpenter at linaro.org>; Brost, Matthew <matthew.brost at intel.com>
> Subject: [PATCH] drm/xe/guc: Check response data of CLIENT_SOFT_RESET action
>>
>> Treat unexpected non-zero response data from GuC as -EPROTO error.
>> This will also prevent passing positive value to ERR_PTR().
>>
>> Reported-by: Dan Carpenter <dan.carpenter at linaro.org>
>> Closes: https://lore.kernel.org/intel-xe/5fd5a4b9-88ee-47f1-ac3e-32a20960c5b0@stanley.mountain/
>> Fixes: 12f95f9900c0 ("drm/xe/guc: Prefer GT oriented logs for GuC messages")
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>> Cc: Dan Carpenter <dan.carpenter at linaro.org>
>> Cc: Matthew Brost <matthew.brost at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_guc.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
>> index 1619c0a52db9..bb03db361512 100644
>> --- a/drivers/gpu/drm/xe/xe_guc.c
>> +++ b/drivers/gpu/drm/xe/xe_guc.c
>> @@ -1238,6 +1238,8 @@ int xe_guc_suspend(struct xe_guc *guc)
>>
>> ret = xe_guc_mmio_send(guc, action, ARRAY_SIZE(action));
>> if (ret) {
>> + if (ret > 0)
>> + ret = -EPROTO;
>
> I don't see anything wrong with this current implementation, though it's perhaps
> a bit unnecessary to nest this if statement in the "if (ret)" check.
the idea for nested 'if' was to avoid extra check on normal flow, since
it will be very unlikely that return value will non-zero.
with your approach below we will be always doing comparison twice
but yeah, this is not on critical path, so who cares...
> Maybe:
> """
> ret = xe_guc_mmio_send(guc, action, ARRAY_SIZE(action));
> if (ret > 0)
> ret = -EPROTO;
> if (ret) {
> ...
> """
> Or, if you don't mind ternary operations:
> """
> ret = xe_guc_mmio_send(guc, action, ARRAY_SIZE(action));
> ret = ret > 0 ? -EPROTO : ret;
> if (ret) {
> ...
> """
actually above pattern would be useful if move out the sending of
SOFT_RESET to separate helper that handles unexpected replies in place
and always returns only 0 or -ve error code:
int guc_send_soft_reset(guc)
{
u32 action[] = {..}
int ret;
ret = xe_guc_mmio_send(guc, action, ARRAY_SIZE(action));
return = ret > 0 ? -EPROTO : ret;
}
and this is what I'm usually doing, see guc_action_vf_control_cmd()
>
> Alternatively, given we do something similar everywhere we call
> xe_guc_mmio_send (except for in xe_guc_hwconfig.c), we might
> want to consider modifying the function to perform this -EPROTO
> step automatically:
> """
> int xe_guc_mmio_send(struct xe_guc *guc, const u32 *request, u32 len)
> {
> int ret = xe_guc_mmio_send_recv(guc, request, len, NULL);
> return ret > 0 ? -EPROTO : ret;
> }
no, as xe_guc_mmio_send() shall remain as generic purpose function since
GuC ABI does not prevent actions to return data just in DATA0 field.
and IIRC the XE_GUC_ACTION_ALLOCATE_DOORBELL, which is unused in xe yet,
is one example of such action.
> ...
> +++ b/drivers/gpu/drm/xe/xe_guc_hwconfig.c
> static int send_get_hwconfig(struct xe_guc *guc, u64 ggtt_addr, u32 size)
> {
> u32 action[] = {
> XE_GUC_ACTION_GET_HWCONFIG,
> lower_32_bits(ggtt_addr),
> upper_32_bits(ggtt_addr),
> size,
> };
>
> return xe_guc_mmio_send_recv(guc, action, ARRAY_SIZE(action), NULL);
> }
> """
> These aren't necessary changes, by the way. They're perhaps nice-to-haves,
> but I won't block on it:
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> -Jonathan Cavitt
>
>> xe_gt_err(gt, "GuC suspend failed: %pe\n", ERR_PTR(ret));
>> return ret;
>> }
>> --
>> 2.47.1
>>
>>
More information about the Intel-xe
mailing list