[Intel-gfx] [RFC v1] drm/i915: Add Exec param to control data port coherency.

Joonas Lahtinen joonas.lahtinen at linux.intel.com
Fri May 4 09:24:51 UTC 2018


Quoting Lis, Tomasz (2018-03-20 19:23:03)
> 
> 
> On 2018-03-19 15:26, Chris Wilson wrote:
> 
>     Quoting Lis, Tomasz (2018-03-19 14:14:19)
> 
> 
>         On 2018-03-19 13:43, Chris Wilson wrote:
> 
>             Quoting Tomasz Lis (2018-03-19 12:37:35)
> 
>                 The patch adds a parameter to control the data port coherency functionality
>                 on a per-exec call basis. When data port coherency flag value is different
>                 than what it was in previous call for the context, a command to switch data
>                 port coherency state is added before the buffer to be executed.
> 
>             So this is part of the context? Why do it at exec level?
> 
>         It is part of the context, stored within HDC chicken bit register.
>         The exec level was requested by the OCL team, due to concerns about
>         performance cost of context setparam calls.
> 
>     What? Oh dear, oh dear, thrice oh dear. The context setparam would look
>     like:
> 
>             if (arg != context->value) {
>                     rq = request_alloc(context, RCS);
>                     cs = ring_begin(rq, 4);
>                     cs++ = MI_LRI;
>                     cs++ = reg;
>                     cs++ = magic;
>                     cs++ = MI_NOOP;
>                     request_add(rq);
>                     context->value = arg
>             }
> 
>     The argument is whether stuffing it into a crowded, v.frequently
>     executed execbuf is better than an irregular setparam. If they want to
>     flip it on every batch, use execbuf. If it's going to be very
>     infrequent, setparam.
> 
> Implementing the data port coherency switch as context setparam would not be a
> problem, I agree.
> But this is not a solution OCL is willing to accept. Any additional IOCTL call
> is a concern for the OCL developers.

Being part of hardware context is a good indication that GEM context is
the right place for the bit. Stuffing more into execbuf for
one-usecase-one-platform scenario doesn't sound very future looking in
terms of overall driver development.

I would truly imagine that the IOCTL execution time should not be
meaningful compared to compute kernel execution times. If they are
already having a large amount of IOCTL calls between each patch, I guess
that is something to be discussed separately.

Regards, Joonas

> 
> For more explanation on switch frequency - please look at the cover letter I
> provided; here's the related part of it:
> (note: the data port coherency is called fine grain coherency within UMD)
> 
>     3. Will coherency switch be used frequently?
> 
>     There are scenarios that will require frequent toggling of the coherency
>     switch.
>     E.g. an application has two OCL compute kernels: kern_master and kern_worker.
>     kern_master uses, concurrently with CPU, some fine grain SVM resources
>     (CL_MEM_SVM_FINE_GRAIN_BUFFER). These resources contain descriptors of
>     computational work that needs to be executed. kern_master analyzes incoming
>     work descriptors and populates a plain OCL buffer (non-fine-grain) with payload
>     for kern_worker. Once kern_master is done, kern_worker kicks-in and processes
>     the payload that kern_master produced. These two kernels work in a loop, one
>     after another. Since only kern_master requires coherency, kern_worker should
>     not be forced to pay for it. This means that we need to have the ability to
>     toggle coherency switch on or off per each GPU submission:
>     (ENABLE COHERENCY) kern_master -> (DISABLE COHERENCY)kern_worker -> (ENABLE
>     COHERENCY) kern_master -> (DISABLE COHERENCY)kern_worker -> ...
> 
>     That discussion must be part of the rationale in the commitlog.
> 
> Will add.
> Should I place the whole text from cover letter within the commit comment?
> 
>     Otoh, execbuf3 would accept it as a command packet. Hmm.
> 
> I know we have execbuf2, but execbuf3? Are you proposing to add something like
> that?
> 
>               If exec level
>             is desired, why not whitelist it?
> 
>         If we have no issue in whitelisting the register, I'm sure OCL will
>         agree to that.
>         I assumed the whitelisting will be unacceptable because of security
>         concerns with some options.
>         The register also changes its position and content between gens, which
>         makes whitelisting hard to manage.
> 
>         Main purpose of chicken bit registers, in general, is to allow work
>         around for hardware features which could  be buggy or could have
>         unintended influence on the platform.
>         The data port coherency functionality landed there for the same reasons;
>         then it twisted itself in a way that we now need user space to switch it.
>         Is it really ok to whitelist chicken bit registers?
> 
>     It all depends on whether it breaks segregation. If the only users
>     affected are themselves, fine. Otherwise, no.
>     -Chris
> 
> Chicken Bit registers are definitely not planned as safe for use. While meaning
> of bits within HDC_CHICKEN0 change between gens, I doubt any of the registers
> *can't* be used to cause GPU hung.
> -Tomasz
> 


More information about the Intel-gfx mailing list