[PATCH 07/16] drm/xe/oa: OA stream initialization (OAG)

Dixit, Ashutosh ashutosh.dixit at intel.com
Tue Feb 13 17:04:43 UTC 2024


On Fri, 09 Feb 2024 00:14:13 -0800, Lionel Landwerlin wrote:
>

Hi Lionel,

> On 09/02/2024 09:08, Dixit, Ashutosh wrote:
> > On Thu, 08 Feb 2024 22:23:30 -0800, Lionel Landwerlin wrote:
> > Hi Lionel,
> >
> >>> +static int xe_oa_emit_oa_config(struct xe_oa_stream *stream)
> >>> +{
> >>> +#define NOA_PROGRAM_ADDITIONAL_DELAY_US 500
> >>> +	struct xe_oa_config_bo *oa_bo;
> >>> +	int err, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;
> >>> +
> >>> +	oa_bo = xe_oa_alloc_config_buffer(stream);
> >>> +	if (IS_ERR(oa_bo)) {
> >>> +		err = PTR_ERR(oa_bo);
> >>> +		goto exit;
> >>> +	}
> >>> +
> >>> +	err = xe_oa_submit_bb(stream, oa_bo->bb);
> >>> +
> >>> +	/* Additional empirical delay needed for NOA programming after registers are written */
> >>> +	usleep_range(us, 2 * us);
> >> Looks like the entire oa_config emission is synchronous.
> > Yes that is indeed the case in this patchset.
> >
> >> That's a difference from i915 where we could just pipeline all the config
> >> changes with perf queries in between.
> >>
> >> If there was a mechanism to return a syncobj in this ioctl, we could do the
> >> wait from userspace and/or pipeline more submissions.
> > That is the plan. To expose syncobj's in OA properties and make also make
> > the oa_config emission asynchronous. But have not been able to get to it
> > yet (IGT's are mostly getting ready, but now we may also need to add
> > support for GPUVis before we can merge these patches, if we can't get a
> > temporary waiver).
> >
> > So the direction right now is to get the current patchset merged before
> > adding more features (like the syncobj).
>
> Any idea when that will happen?

I am assuming you are asking about the merge so: adding GPUVis support
seems to be a bit of a chore, so it will take a few (like 4+) weeks I am
thinking.

However I am hearing someone might add support in Mesa for Xe OA. If we are
able to use Mesa as an upstream consumer of the OA uapi, it might go a bit
faster. So discussions are on about this.

> I suppose you'll have to define a new ioctl for this?

Let me explain the overall idea first. Similar to 'xe-exec' we need these
two fields:

        /** @num_syncs: Amount of struct drm_xe_sync in array. */
        __u32 num_syncs;

        /** @syncs: Pointer to struct drm_xe_sync array. */
        __u64 syncs;

With these we can implement "fence wait" and "fence signal", similar to
what happens with xe_exec (and which allows xe_exec to be pipelined). That
is stream (re)configuration can wait for a fence, and will signal a fence
after stream configuration is complete (including any additional delays
after writing the registers, if an additional delay is needed).

Now we need to do this in two places:

1. When the stream is opened. In this case 'num_syncs' and 'syncs' can be
   passed in via stream properties.

2. During stream reconfiguration, that is during CONFIG ioctl on the stream
   fd (called in Xe as DRM_XE_PERF_IOCTL_CONFIG). So this ioctl will need to
   accept a struct which has { 'id', 'num_syncs' and 'syncs' } as its members.

   That reminds me, I need to make this change for the CONFIG stream fd
   ioctl now, or at least introduce reserved fields to make this happen in
   the future without breaking the uapi.

Does all this make sense?

Thanks.
--
Ashutosh

> > (Also, separately I'm trying to figure out if a delay similar to the NOA
> > programming delay is really needed when we have PES registers, the case for
> > Xe2+. Looks like it might not be, but still needs to be confirmed).
> >
> > Thanks.
> > --
> > Ashutosh
>
>


More information about the Intel-xe mailing list