[Intel-gfx] [PATCH v4] drm/i915: Add IOCTL Param to control data port coherency.

Wed Jul 11 09:28:56 UTC 2018

On 10/07/2018 18:32, Lis, Tomasz wrote:
> On 2018-07-09 18:37, Chris Wilson wrote:
>> Quoting Tvrtko Ursulin (2018-07-09 17:28:02)
>>> On 09/07/2018 14:20, Tomasz Lis wrote:
>>>> +static int i915_gem_context_clear_data_port_coherent(struct i915_gem_context *ctx)
>>>> +{
>>>> +     int ret;
>>>> +
>>>> +     ret = intel_lr_context_modify_data_port_coherency(ctx, false);
>>>> +     if (!ret)
>>>> +             __clear_bit(CONTEXT_DATA_PORT_COHERENT, &ctx->flags);
>>>> +     return ret;
>>> Is there a good reason you allow userspace to keep emitting unlimited
>>> number of commands which actually do not change the status? If not
>>> please consider gating the command emission with
>>> test_and_set_bit/test_and_clear_bit. Hm.. apart even with that they
>>> could keep toggling ad infinitum with no real work in between. Has it
>>> been considered to only save the desired state in set param and then
>>> emit it, if needed, before next execbuf? Minor thing in any case, just
>>> curious since I wasn't following the threads.
>> The first patch tried to add a bit to execbuf, and having been
>> mistakenly down that road before, we asked if there was any alternative.
>> (Now if you've also been following execbuf3 conversations, having a
>> packet for privileged LRI is definitely something we want.)
>>
>> Setting the value in the context register is precisely what we want to
>> do, and trivially serialised with execbuf since we have to serialise
>> reservation of ring space, i.e. the normal rules of request generation.
>> (execbuf is just a client and nothing special). From that point of view,
>> we only care about frequency, if it is very frequent it should be
>> controlled by userspace inside the batch (but it can't due to there
>> being dangerous bits inside the reg aiui). At the other end of the
>> scale, is context_setparam for set-once. And there should be no
>> inbetween as that requires costly batch flushes.
>> -Chris
> Joonas did brought that concern in his review; here it is, with my response:
> 
> On 2018-06-21 15:47, Lis, Tomasz wrote:
>> On 2018-06-21 08:39, Joonas Lahtinen wrote:
>>> I'm thinking we should set this value when it has changed, when we 
>>> insert the
>>> requests into the command stream. So if you change back and forth, while
>>> not emitting any requests, nothing really happens. If you change the 
>>> value and
>>> emit a request, we should emit a LRI before the jump to the commands.
>>> Similary if you keep setting the value to the value it already was in,
>>> nothing will happen, again.
>> When I considered that, my way of reasoning was:
>> If we execute the flag changing buffer right away, it may be sent to 
>> hardware faster if there is no job in progress.
>> If we use the lazy way, and trigger the change just before submission 
>> -  there will be additional conditions in submission code, plus the 
>> change will be made when there is another job pending (though it's not 
>> a considerable payload to just switch a flag).
>> If user space switches the flag back and forth without much sense, 
>> then there is something wrong with the user space driver, and it 
>> shouldn't be up to kernel to fix that.
>>
>> This is why I chosen the current approach. But I can change it if you 
>> wish.
> 
> So while I think the current solution is better from performance 
> standpoint, but I will change it if you request that.

Sounds like an interesting dilemma and I can see both arguments.

But for me I still prefer the option where coherency programming is 
emitted lazily on state change only. We do emit a bunch of pipe controls 
to invalidate caches and such as preamble to every request so that fits 
nicely. Advantage I see is that the set param ioctl remains very light 
and doesn't do any command submission, keeping in spirit and expectation 
with all current parameters. It makes the ioctl much quicker and as a 
secondary benefit it protects userspace form their own sillyness.

Regards,

Tvrtko