[Intel-gfx] [RFC v1] drm/i915: Add Exec param to control data port coherency.

Lis, Tomasz tomasz.lis at intel.com
Tue Mar 20 17:23:03 UTC 2018



On 2018-03-19 15:26, Chris Wilson wrote:
> Quoting Lis, Tomasz (2018-03-19 14:14:19)
>>
>> On 2018-03-19 13:43, Chris Wilson wrote:
>>> Quoting Tomasz Lis (2018-03-19 12:37:35)
>>>> The patch adds a parameter to control the data port coherency functionality
>>>> on a per-exec call basis. When data port coherency flag value is different
>>>> than what it was in previous call for the context, a command to switch data
>>>> port coherency state is added before the buffer to be executed.
>>> So this is part of the context? Why do it at exec level?
>> It is part of the context, stored within HDC chicken bit register.
>> The exec level was requested by the OCL team, due to concerns about
>> performance cost of context setparam calls.
> What? Oh dear, oh dear, thrice oh dear. The context setparam would look
> like:
>
> 	if (arg != context->value) {
> 		rq = request_alloc(context, RCS);
> 		cs = ring_begin(rq, 4);
> 		cs++ = MI_LRI;
> 		cs++ = reg;
> 		cs++ = magic;
> 		cs++ = MI_NOOP;
> 		request_add(rq);
> 		context->value = arg
> 	}
>
> The argument is whether stuffing it into a crowded, v.frequently
> executed execbuf is better than an irregular setparam. If they want to
> flip it on every batch, use execbuf. If it's going to be very
> infrequent, setparam.
Implementing the data port coherency switch as context setparam would 
not be a problem, I agree.
But this is not a solution OCL is willing to accept. Any additional 
IOCTL call is a concern for the OCL developers.

For more explanation on switch frequency - please look at the cover 
letter I provided; here's the related part of it:
(note: the data port coherency is called fine grain coherency within UMD)

    3. Will coherency switch be used frequently?

    There are scenarios that will require frequent toggling of the coherency
    switch.
    E.g. an application has two OCL compute kernels: kern_master and kern_worker.
    kern_master uses, concurrently with CPU, some fine grain SVM resources
    (CL_MEM_SVM_FINE_GRAIN_BUFFER). These resources contain descriptors of
    computational work that needs to be executed. kern_master analyzes incoming
    work descriptors and populates a plain OCL buffer (non-fine-grain) with payload
    for kern_worker. Once kern_master is done, kern_worker kicks-in and processes
    the payload that kern_master produced. These two kernels work in a loop, one
    after another. Since only kern_master requires coherency, kern_worker should
    not be forced to pay for it. This means that we need to have the ability to
    toggle coherency switch on or off per each GPU submission:
    (ENABLE COHERENCY) kern_master -> (DISABLE COHERENCY)kern_worker -> (ENABLE
    COHERENCY) kern_master -> (DISABLE COHERENCY)kern_worker -> ...

> That discussion must be part of the rationale in the commitlog.
Will add.
Should I place the whole text from cover letter within the commit comment?
> Otoh, execbuf3 would accept it as a command packet. Hmm.
I know we have execbuf2, but execbuf3? Are you proposing to add 
something like that?
>>>    If exec level
>>> is desired, why not whitelist it?
>> If we have no issue in whitelisting the register, I'm sure OCL will
>> agree to that.
>> I assumed the whitelisting will be unacceptable because of security
>> concerns with some options.
>> The register also changes its position and content between gens, which
>> makes whitelisting hard to manage.
>>
>> Main purpose of chicken bit registers, in general, is to allow work
>> around for hardware features which couldĀ  be buggy or could have
>> unintended influence on the platform.
>> The data port coherency functionality landed there for the same reasons;
>> then it twisted itself in a way that we now need user space to switch it.
>> Is it really ok to whitelist chicken bit registers?
> It all depends on whether it breaks segregation. If the only users
> affected are themselves, fine. Otherwise, no.
> -Chris
Chicken Bit registers are definitely not planned as safe for use. While 
meaning of bits within HDC_CHICKEN0 change between gens, I doubt any of 
the registers *can't* be used to cause GPU hung.
-Tomasz

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20180320/3462bb31/attachment-0001.html>


More information about the Intel-gfx mailing list