[PATCH v11 4/5] drm/xe/guc: Extract GuC error capture lists

Teres Alexis, Alan Previn alan.previn.teres.alexis at intel.com
Fri Jun 28 00:01:31 UTC 2024


So i have 2 comments. The first is trivial and consider a nit. But the second one is not trivial and
is more of an optimization. However but this second comment is on a part of the code that is purely
FW-ABI functionality that has hardenned over years and never changed / working for years on i915.
Error-capture is, by definition, not required to be a performant operation. That hunk is a low level
part of extraction that might create more problems if we modify it. That said i am labelling that a
nit too. Thus:

Reviewed-by: Alan Previn <alan.previn.teres.alexis at intel.com>


On Mon, 2024-06-24 at 14:54 -0700, Zhanjun Dong wrote:
> Upon the G2H Notify-Err-Capture event, parse through the
> GuC Log Buffer (error-capture-subregion) and generate one or
> more capture-nodes. A single node represents a single "engine-
> instance-capture-dump" and contains at least 3 register lists:
alan:snip

> +
> +/*

alan: nit: "DOC: Init, G2H-event and reporting flows for GuC-error-capture" would be nice to add
here.
> + * KMD Init time flows:
> + * --------------------
> + *     --> alloc A: GuC input capture regs lists (registered to GuC via ADS).
> + *                  xe_guc_ads acquires the register lists by calling
> + *                  xe_guc_capture_list_size and xe_guc_capture_list_get 'n' times,
> + *                  where n = 1 for global-reg-list +
> + *                            num_engine_classes for class-reg-list

alan:snip
> +static int
> +guc_capture_log_remove_dw(struct xe_guc *guc, struct __guc_capture_bufstate *buf,
> +                         u32 *dw)
> +{
> +       int tries = 2;
> +       int avail = 0;
> +
> +       if (!guc_capture_buf_cnt(buf))
> +               return 0;
> +
> +       while (tries--) {
> +               avail = guc_capture_buf_cnt_to_end(buf);
> +               if (avail >= sizeof(u32)) {
> +                       *dw = xe_map_rd(guc_to_xe(guc), &guc->log.bo->vmap,
> +                                       buf->data_offset + buf->rd, u32);
> +                       buf->rd += 4;
> +                       return 4;
> +               }
> +               if (avail)
> +                       xe_gt_dbg(guc_to_gt(guc), "Register capture log not dword aligned,
> skipping.\n");
> +               buf->rd = 0;
> +       }
> +
> +       return 0;
> +}
alan: I'm labelling this a nit because its old legacy code on FW-ABI
code that has been working on i915 multiple years and I believe we
rather not change. Anyway, after relooking at this blast from my past,
i realize that the function guc_capture_log_remove_dw could be
modified to do multiple bytes once once so the callers dont have to call it repeatedly for each
dword-member of srtuct being extracted.

static int
guc_capture_log_remove_bytes(struct xe_guc *guc, struct __guc_capture_bufstate *buf,
			  void *out, int bytes_needed)
{
	int full_size = 0, copy_size, avail;

	if (!guc_capture_buf_cnt(buf))
		return 0;

	while (bytes_needed) {
		avail = guc_capture_buf_cnt_to_end(buf);
		/* wrap if at end */
		if (!avail) {
			/* output stream clipped */
			if (!buf->rd)
				return full_size;
			buf->rd = 0;
			continue;
		}
		copy_size = avail < bytes_needed ? avail : bytes_needed;
		if (copy_size%4) {
			xe_gt_dbg(guc_to_gt(guc), "Bytes extraction not dword aligned,
skipping.\n");
		} else {
			xe_map_memcpy_from(guc_to_xe(guc), out, &guc->log.bo->vmap,
						b->data_offset + b->rd, copy_size);
		}
		buf->rd += copy_size;
		full_size += copy_size;
		bytes_needed =- copy_size;
	}

	return full_size;
}

with something like above, caller would only need to call once when
dealing with data straddling the wraparound:

old caller:
	read += guc_capture_log_remove_dw(guc, buf, &ghdr->owner);
	read += guc_capture_log_remove_dw(guc, buf, &ghdr->info);

new caller
	read += guc_capture_log_remove_bytes(guc, buf, &ghdr, sizeof(ghdr);

alan:snip



More information about the Intel-xe mailing list