[Intel-gfx] [RFC v1] drm/i915: Add Exec param to control data port coherency.
Joonas Lahtinen
joonas.lahtinen at linux.intel.com
Fri May 4 09:24:51 UTC 2018
Quoting Lis, Tomasz (2018-03-20 19:23:03)
>
>
> On 2018-03-19 15:26, Chris Wilson wrote:
>
> Quoting Lis, Tomasz (2018-03-19 14:14:19)
>
>
> On 2018-03-19 13:43, Chris Wilson wrote:
>
> Quoting Tomasz Lis (2018-03-19 12:37:35)
>
> The patch adds a parameter to control the data port coherency functionality
> on a per-exec call basis. When data port coherency flag value is different
> than what it was in previous call for the context, a command to switch data
> port coherency state is added before the buffer to be executed.
>
> So this is part of the context? Why do it at exec level?
>
> It is part of the context, stored within HDC chicken bit register.
> The exec level was requested by the OCL team, due to concerns about
> performance cost of context setparam calls.
>
> What? Oh dear, oh dear, thrice oh dear. The context setparam would look
> like:
>
> if (arg != context->value) {
> rq = request_alloc(context, RCS);
> cs = ring_begin(rq, 4);
> cs++ = MI_LRI;
> cs++ = reg;
> cs++ = magic;
> cs++ = MI_NOOP;
> request_add(rq);
> context->value = arg
> }
>
> The argument is whether stuffing it into a crowded, v.frequently
> executed execbuf is better than an irregular setparam. If they want to
> flip it on every batch, use execbuf. If it's going to be very
> infrequent, setparam.
>
> Implementing the data port coherency switch as context setparam would not be a
> problem, I agree.
> But this is not a solution OCL is willing to accept. Any additional IOCTL call
> is a concern for the OCL developers.
Being part of hardware context is a good indication that GEM context is
the right place for the bit. Stuffing more into execbuf for
one-usecase-one-platform scenario doesn't sound very future looking in
terms of overall driver development.
I would truly imagine that the IOCTL execution time should not be
meaningful compared to compute kernel execution times. If they are
already having a large amount of IOCTL calls between each patch, I guess
that is something to be discussed separately.
Regards, Joonas
>
> For more explanation on switch frequency - please look at the cover letter I
> provided; here's the related part of it:
> (note: the data port coherency is called fine grain coherency within UMD)
>
> 3. Will coherency switch be used frequently?
>
> There are scenarios that will require frequent toggling of the coherency
> switch.
> E.g. an application has two OCL compute kernels: kern_master and kern_worker.
> kern_master uses, concurrently with CPU, some fine grain SVM resources
> (CL_MEM_SVM_FINE_GRAIN_BUFFER). These resources contain descriptors of
> computational work that needs to be executed. kern_master analyzes incoming
> work descriptors and populates a plain OCL buffer (non-fine-grain) with payload
> for kern_worker. Once kern_master is done, kern_worker kicks-in and processes
> the payload that kern_master produced. These two kernels work in a loop, one
> after another. Since only kern_master requires coherency, kern_worker should
> not be forced to pay for it. This means that we need to have the ability to
> toggle coherency switch on or off per each GPU submission:
> (ENABLE COHERENCY) kern_master -> (DISABLE COHERENCY)kern_worker -> (ENABLE
> COHERENCY) kern_master -> (DISABLE COHERENCY)kern_worker -> ...
>
> That discussion must be part of the rationale in the commitlog.
>
> Will add.
> Should I place the whole text from cover letter within the commit comment?
>
> Otoh, execbuf3 would accept it as a command packet. Hmm.
>
> I know we have execbuf2, but execbuf3? Are you proposing to add something like
> that?
>
> If exec level
> is desired, why not whitelist it?
>
> If we have no issue in whitelisting the register, I'm sure OCL will
> agree to that.
> I assumed the whitelisting will be unacceptable because of security
> concerns with some options.
> The register also changes its position and content between gens, which
> makes whitelisting hard to manage.
>
> Main purpose of chicken bit registers, in general, is to allow work
> around for hardware features which could be buggy or could have
> unintended influence on the platform.
> The data port coherency functionality landed there for the same reasons;
> then it twisted itself in a way that we now need user space to switch it.
> Is it really ok to whitelist chicken bit registers?
>
> It all depends on whether it breaks segregation. If the only users
> affected are themselves, fine. Otherwise, no.
> -Chris
>
> Chicken Bit registers are definitely not planned as safe for use. While meaning
> of bits within HDC_CHICKEN0 change between gens, I doubt any of the registers
> *can't* be used to cause GPU hung.
> -Tomasz
>
More information about the Intel-gfx
mailing list