[Intel-xe] [PATCH v2 10/31] drm/xe/guc: Return the lower part of blocking H2G message

Michal Wajdeczko michal.wajdeczko at intel.com
Mon May 8 09:20:48 UTC 2023



On 08.05.2023 03:10, Matthew Brost wrote:
> On Fri, May 05, 2023 at 02:52:45PM -0400, Rodrigo Vivi wrote:
>> On Mon, May 01, 2023 at 05:17:06PM -0700, Matthew Brost wrote:
>>> The upper layers may need this data, an example of this is allocating
>>> DIST doorbell.
>>>
>>> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>>> ---
>>>  drivers/gpu/drm/xe/xe_guc_ct.c | 6 +++++-
>>>  drivers/gpu/drm/xe/xe_guc_pc.c | 6 ++++--
>>>  drivers/gpu/drm/xe/xe_huc.c    | 2 +-
>>>  3 files changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
>>> index 6abf1dee95af..60b69fcfac9f 100644
>>> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
>>> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
>>> @@ -25,6 +25,7 @@
>>>  struct g2h_fence {
>>>  	u32 *response_buffer;
>>>  	u32 seqno;
>>> +	u32 status;

if purpose of this field is to hold data from the success reply, then
why it is called as misleading 'status' ?

in our spec we call it 'data0' see [1]

[1] https://www.kernel.org/doc/html/latest/gpu/i915.html?#hxg-response

>>>  	u16 response_len;
>>>  	u16 error;
>>>  	u16 hint;

btw, is there any true benefit to decompose HXG replies into these
fields in g2h_fence ? done/retry/fail modes are mutually exclusive,
while separate flags here can't guarantee that


>>> @@ -727,7 +728,7 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
>>>  		ret = -EIO;
>>>  	}
>>>  
>>> -	return ret > 0 ? 0 : ret;
>>> +	return ret > 0 ? g2h_fence.status : ret;
>>
>> The problem I see here is how the upper level could differentiate
>> between and error and a status.
>>
> 
> g2h_fence.status is 16 (can't be negative), so 0 or greater is a good
> return.

to be precise, data0 from the success reply is 28-bit (see [1] above)

you probably mixed that with 16-bit error from failure reply (see [2])

[2] https://www.kernel.org/doc/html/latest/gpu/i915.html?#hxg-failure

> 
>> should we convert the functions to have an &status argument passed in?

it is not 'status'

>>
> 
> I like it the way it is but don't really care either way. I'll change
> this if you like.

HXG spec allows reply messages longer than 1dw (1dw = HXG header only)
thus if we claim support for such use case, as we do accept
response_buffer, then we should return actual length of the received
reply, rather than just data0, to allow caller parse flex size replies:

 * Return: Non-negative response length (in dwords) or
 *         a negative error code on failure.

but since majority, if not all, of our currently defined H2G defines
just data0, without extra data1, data2, ... then for such cases we
should rather provide separate wrapper that will make sure reply size is
exactly 1 (header only) and return extracted data0 from it.

and since data0 is just 28-bit we can still use single return value
documented as:

 * Return: Non-negative data0 from the success response or
 *         a negative error code on failure.

(as I still hope proper documentation will be prepared for core
functions near term, when it's most desired)

> 
> Matt 
> 
>>>  }
>>>  
>>>  int xe_guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
>>> @@ -793,6 +794,9 @@ static int parse_g2h_response(struct xe_guc_ct *ct, u32 *msg, u32 len)
>>>  		g2h_fence->response_len = response_len;
>>>  		memcpy(g2h_fence->response_buffer, msg + GUC_CTB_MSG_MIN_LEN,
>>>  		       response_len * sizeof(u32));
>>> +	} else {
>>> +		g2h_fence->status =
>>> +			FIELD_GET(GUC_HXG_RESPONSE_MSG_0_DATA0, msg[1]);
>>>  	}
>>>  
>>>  	g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);
>>> diff --git a/drivers/gpu/drm/xe/xe_guc_pc.c b/drivers/gpu/drm/xe/xe_guc_pc.c
>>> index 72d460d5323b..3d2ea723a4a7 100644
>>> --- a/drivers/gpu/drm/xe/xe_guc_pc.c
>>> +++ b/drivers/gpu/drm/xe/xe_guc_pc.c
>>> @@ -204,11 +204,13 @@ static int pc_action_query_task_state(struct xe_guc_pc *pc)
>>>  
>>>  	/* Blocking here to ensure the results are ready before reading them */
>>>  	ret = xe_guc_ct_send_block(ct, action, ARRAY_SIZE(action));
>>> -	if (ret)
>>> +	if (ret < 0) {

since almost all H2G expects data0 == 0 on success, this fix shouldn't
be necessary, unless you are hiding other issue ... which in this case
is lack of proper initialization of the 'status/data0' field in
g2h_fence_init()

and btw, long term we should likely be better prepared for more
unexpected replies from the GuC, including non-zero data0 replies ...
but that should be another patch anyway

>>>  		drm_err(&pc_to_xe(pc)->drm,
>>>  			"GuC PC query task state failed: %pe", ERR_PTR(ret));
>>> +		return ret;
>>> +	}
>>>  
>>> -	return ret;
>>> +	return 0;
>>>  }
>>>  
>>>  static int pc_action_set_param(struct xe_guc_pc *pc, u8 id, u32 value)
>>> diff --git a/drivers/gpu/drm/xe/xe_huc.c b/drivers/gpu/drm/xe/xe_huc.c
>>> index 55dcaab34ea4..9c48c3075410 100644
>>> --- a/drivers/gpu/drm/xe/xe_huc.c
>>> +++ b/drivers/gpu/drm/xe/xe_huc.c
>>> @@ -39,7 +39,7 @@ int xe_huc_init(struct xe_huc *huc)
>>>  
>>>  	huc->fw.type = XE_UC_FW_TYPE_HUC;
>>>  	ret = xe_uc_fw_init(&huc->fw);
>>> -	if (ret)
>>> +	if (ret < 0)

ditto

>>>  		goto out;
>>>  
>>>  	xe_uc_fw_change_status(&huc->fw, XE_UC_FIRMWARE_LOADABLE);
>>> -- 
>>> 2.34.1
>>>



More information about the Intel-xe mailing list