[Intel-gfx] [Patch v2] Add uAPI to query microcontroller fw version

Andi Shyti andi.shyti at linux.intel.com
Wed Oct 4 07:37:06 UTC 2023


Hi Vivaik,

On Tue, Oct 03, 2023 at 08:40:12PM -0700, Vivaik Balasubrawmanian wrote:
> Due to a bug in GuC firmware, Mesa can't enable by default the usage of 
> async compute engines feature in DG2 and newer. A new GuC firmware fixed the issue but 
> until now there was no way for Mesa to know if KMD was running with the fixed GuC version or not,
> so this uAPI is required.
> 
> More context on the issue:
> Vulkan allows applications to create types of queues: graphics, compute and copy.
> Today Intel Vulkan driver uses Render engine to implement all those 3 queues types.
> 
> There is a set of operations that a queue type is required to implement, 
> DG2 compute engine have almost all the operations required by compute queue but still lacks some.

/have/has/

> So the solution is to send those operations not supported by compute engine to render engine 
> and do some synchronization around it. But doing so causes the GuC scheduler to get stuck 

/to get/gets/

> around the synchronization, until KMD resets the engine and ban the application context.

/ban/bans/

> This issue was root caused to a GuC firmware issue and was fixed in newer version.
> 
> So Mesa can't enable the "async compute" without knowing for sure that KMD is running 
> with a GuC version that has the scheduler fix. Same will happen when Mesa start to use 
> copy engine.
> 
> This uAPI  may be expanded in future to query other firmware versions too.

Thanks for the explanation, it's clear and comprehensive.

Can you please wrap it to 75 characters (as per the kernel
doc[1]) or 65 characters (as per the e-mail netiquette[2])?

[1] https://docs.kernel.org/process/submitting-patches.html#the-canonical-patch-format
[2] https://www.ietf.org/rfc/rfc1855.txt

> More information:
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661
> Mesa usage: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25233
> 
> v2:
> - incorporated feedback from Tvrtko Ursulin:
>   - updated patch description to clarify the use case that identified
>     this issue.
>   - updated query_uc_fw_version() to use copy_query_item() helper.
>   - updated the implemented GuC version query to return Submission
>     version.
> 
> Cc: John Harrison <John.C.Harrison at Intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> Cc: José Roberto de Souza <jose.souza at intel.com>
> 
> Signed-off-by: Vivaik Balasubrawmanian <vivaik.balasubrawmanian at intel.com>

Please don't leave blank lines in the tag section.

> ---
>  drivers/gpu/drm/i915/i915_query.c | 42 +++++++++++++++++++++++++++++++
>  include/uapi/drm/i915_drm.h       | 32 +++++++++++++++++++++++
>  2 files changed, 74 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> index 00871ef99792..3e3563ab62b7 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -551,6 +551,47 @@ static int query_hwconfig_blob(struct drm_i915_private *i915,
>  	return hwconfig->size;
>  }
>  
> +static int
> +query_uc_fw_version(struct drm_i915_private *i915, struct drm_i915_query_item *query)
> +{
> +	struct drm_i915_query_uc_fw_version __user *query_ptr = u64_to_user_ptr(query->data_ptr);
> +	size_t size = sizeof(struct drm_i915_query_uc_fw_version);
> +	struct drm_i915_query_uc_fw_version resp;
> +	int ret;
> +
> +	ret = copy_query_item(&resp, size, size, query);
> +	if (ret == size) {
> +		query->length = size;
> +		return 0;
> +	} else if (ret != 0)
> +		return ret;

Please use braces around the "else if" branch as well, since the
"if" branch already has them.
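
Something along these lines, as a standalone sketch rather than the
actual patch. handle_copy_result() and struct fake_query_item are
made-up names, and returning 1 for "keep going" is only there so the
snippet compiles on its own:

```c
#include <errno.h>

/* Hypothetical stand-in for struct drm_i915_query_item. */
struct fake_query_item {
	int length;
};

/* Hypothetical helper mirroring the result handling quoted above. */
static int handle_copy_result(int ret, int size, struct fake_query_item *query)
{
	if (ret == size) {
		/* Size probe: report the required length and stop. */
		query->length = size;
		return 0;
	} else if (ret != 0) {
		/* Pass any other non-zero value through unchanged. */
		return ret;
	}

	/* ret == 0: the caller carries on with validation. */
	return 1;
}
```

Once one branch of the conditional needs braces, all of them get
braces, even the single-statement ones.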

> +
> +	if (resp.pad || resp.pad2 || resp.reserved) {

Why do you care to check the padding?

> +		drm_dbg(&i915->drm,
> +			"Invalid input fw version query structure parameters received");

"Invalid firmware query" is maybe a bit more understandable.

Andi
> +		return -EINVAL;
> +	}

