[Intel-gfx] [PATCH] drm/i915/gen9: Increase PCODE request timeout to 100ms

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Tue Feb 21 14:16:52 UTC 2017


On 21/02/2017 14:13, Imre Deak wrote:
> On Tue, Feb 21, 2017 at 01:11:27PM +0000, Tvrtko Ursulin wrote:
>>
>> On 21/02/2017 12:43, Imre Deak wrote:
>>> On Tue, Feb 21, 2017 at 10:06:45AM +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 21/02/2017 09:37, Chris Wilson wrote:
>>>>> On Tue, Feb 21, 2017 at 11:22:12AM +0200, Imre Deak wrote:
>>>>>> On Mon, Feb 20, 2017 at 04:05:33PM +0000, Chris Wilson wrote:
>>>>>>> So that our preempt-off period doesn't grow completely unchecked, or do
>>>>>>> we need that 34ms loop?
>>>>>>
>>>>>> Yes, that's at least how I understand it. Scheduling away is what let's
>>>>>> PCODE start servicing some other request than ours or go idle. That's
>>>>>> in a way what we see when the preempt-enabled poll times out.
>>>>>
>>>>> I was thinking along the lines of if it was just busy/unavailable for the
>>>>> first 33ms that particular time, it just needed to sleep until ready.
>>>>> Once available, the next request ran in the expected 1ms.
>>>>
>>>>> Do you not see any value in trying a sleeping loop? Perhaps compromise
>>>>> and have the preempt-disable timeout increase each iteration.
>>>
>>> This fallback method would work too, but imo the worst case is what
>>> matters and that would be anyway the same in both cases. Because of this
>>> and since it's a WA I'd rather keep it simple.
>>>
>>>> Parachuting in so apologies if I misunderstood something.
>>>>
>>>> Is the issue here that we can get starved out of CPU time for more than 33ms
>>>> while waiting for an event?
>>>
>>> We need to actively resend the same request for this duration.
>>>
>>>> Could we play games with sched_setscheduler and maybe temporarily go
>>>> SCHED_DEADLINE or something? Would have to look into how to correctly
>>>> restore to the old state from that and from which contexts we can actually
>>>> end up in this wait.
>>>
>>> What would be the benefit wrt. disabling preemption? Note that since
>>> it's a workaround it would be good to keep it simple and close to how it
>>> worked on previous platforms (SKL/APL).
>>
>> It would be nicer not to relax that BUILD_BUG_ON in atomic wait for and, if
>> the main problem is the scheduler/CPU starvation, to see if it can be solved
>> differently. Even though the atomic wait here would trigger very rarely it
>> might be worth coming up with something nicer and generalized.
>>
>> If I understood it correctly, the difference between this wait_for call site
>> and the rest is that here it wants a certain number of COND checks to be
>> guaranteed?
>
> Yes.
>
>> The other call sites care more about checking on enter and exit.
>>
>> So in this case we want the period parameter to actually be guaranteed (or
>> close). This sounded like a good candidate for SCHED_DEADLINE to me. Like
>> wait_for_periodic(COND, TIMEOUT, INTERVAL).
>
> Could be. But this would give less guarantee than disabling preemption,
> as SCHED_DEADLINE still works on a best effort basis. How about
> increasing the timeout now (to 50ms) and trying what you suggest as a
> follow-up? That way we have also something for -stable.

If 50ms works it is fine by me.

Regards,

Tvrtko




More information about the Intel-gfx mailing list