[PATCH 07/17] drm/xe/oa: OA stream initialization (OAG)

Tue May 28 06:17:47 UTC 2024

On Mon, 27 May 2024 22:47:13 -0700, Lionel Landwerlin wrote:
>

Hi Lionel,

> On 28/05/2024 08:27, Dixit, Ashutosh wrote:
> > On Mon, 27 May 2024 00:04:21 -0700, Lionel Landwerlin wrote:
> >
> >>> +static int xe_oa_stream_init(struct xe_oa_stream *stream,
> >>> +			     struct xe_oa_open_param *param)
> >>> +{
> > /snip/
> >
> >>> +	stream->k_exec_q = xe_exec_queue_create(stream->oa->xe, NULL,
> >>> +						BIT(stream->hwe->logical_instance), 1,
> >>> +						stream->hwe, EXEC_QUEUE_FLAG_KERNEL, 0);
> >>
> >> Hi Ashutosh,
> >>
> >> On i915 the changes of configuration were pipelined in the application's
> >> execution just like any other submission.
> >>
> >> Creating another queue completely unsynchronized from the application's
> >> submissions makes this non usable in my opinion.
> > As we discussed previously, the plan here is to provide a drm_xe_sync array,
> > through stream properties, which can be used to synchronize OA programming
> > with workload submission.
> >
> > Would that not work? If not, we can do what was done in i915. But note that
> > i915 still has unresolved hangs, which I believe are due to the spinner
> > running on the application engine (iirc repeatedly opening/closing an OA
> > stream will hang in i915, though it could be due to other i915
> > complexity). That is why I thought using a drm_xe_sync array is both a safer
> > and more standard way of doing what we want to achieve.
> >
> > Basically the output sync object will be signalled after registers are
> > programmed and also any additional OA programming delay (which is
> > implemented in i915 using the spinner).
> >
> > This would be done both for OA stream open and changing OA stream
> > configuration.

That is true. But now that I have other stuff like gpuvis wrapped up, I
plan to start looking at these couple of missing uapi pieces (hold preemption
and synchronization, likely in that order).

Because synchronization is not implemented yet, I added the delay below:

        static int xe_oa_emit_oa_config(struct xe_oa_stream *stream)
        {
        #define NOA_PROGRAM_ADDITIONAL_DELAY_US 500
                struct xe_oa_config_bo *oa_bo;
                int err, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;

                oa_bo = xe_oa_alloc_config_buffer(stream);
                if (IS_ERR(oa_bo)) {
                        err = PTR_ERR(oa_bo);
                        goto exit;
                }

                err = xe_oa_submit_bb(stream, oa_bo->bb);

                /* Additional empirical delay needed for NOA programming after registers are written */
                usleep_range(us, 2 * us);
        exit:
                return err;
        }

I understand this is a temporary band-aid, since it stalls the submission
pipeline, and it needs to be replaced by proper synchronization.
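
To make the difference concrete, here is a toy userspace model of the two
approaches. It is only a sketch on a virtual clock: it is not kernel code and
not the eventual uapi, and every name in it (model_fence, model_emit_blocking,
model_emit_with_fence, model_workload_start, MODEL_NOA_DELAY_US) is made up
for illustration. The only thing taken from the patch is the 500 us value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value, mirroring NOA_PROGRAM_ADDITIONAL_DELAY_US above. */
#define MODEL_NOA_DELAY_US 500

/* Hypothetical stand-in for an output sync object; not struct drm_xe_sync. */
struct model_fence {
	bool armed;              /* true once OA programming was queued */
	uint64_t signal_time_us; /* virtual time at which the fence signals */
};

/*
 * Band-aid model: the submitter itself sleeps for the NOA delay.
 * Returns the virtual time at which the caller gets control back.
 */
static uint64_t model_emit_blocking(uint64_t now_us)
{
	return now_us + MODEL_NOA_DELAY_US;
}

/*
 * Sync-object model: the caller returns immediately and the delay is
 * folded into the time at which the output fence signals.
 */
static uint64_t model_emit_with_fence(uint64_t now_us, struct model_fence *out)
{
	assert(out != NULL);
	out->armed = true;
	out->signal_time_us = now_us + MODEL_NOA_DELAY_US;
	return now_us;
}

/*
 * Earliest virtual time a workload submitted at submit_us may start,
 * given an optional input fence it depends on.
 */
static uint64_t model_workload_start(uint64_t submit_us,
				     const struct model_fence *in)
{
	if (in && in->armed && in->signal_time_us > submit_us)
		return in->signal_time_us;
	return submit_us;
}
```

In this model the OA programming still takes effect at the same time in both
cases, but with the fence only the workload that depends on the new OA
configuration waits, not the thread that called into the driver.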

> Just letting you know, because we cannot use the current ioctl because it
> doesn't behave as we expect

Wouldn't you be able to merge the Mesa PR against the current uapi now, and
then add additional Mesa patches once we implement these couple of missing
uapi features in KMD?

Thanks.
--
Ashutosh