[Intel-gfx] [RFC 0/8] Force preemption
chris at chris-wilson.co.uk
Thu Mar 22 15:35:19 UTC 2018
Quoting Jeff McGee (2018-03-22 14:34:58)
> On Thu, Mar 22, 2018 at 09:28:00AM +0000, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-03-22 09:22:55)
> > >
> > > On 21/03/2018 17:26, jeff.mcgee at intel.com wrote:
> > > > From: Jeff McGee <jeff.mcgee at intel.com>
> > > >
> > > > Force preemption uses engine reset to enforce a limit on the time
> > > > that a request targeted for preemption can block. This feature is
> > > > a requirement in automotive systems where the GPU may be shared by
> > > > clients of critically high priority and clients of low priority that
> > > > may not have been curated to be preemption friendly. There may be
> > > > more general applications of this feature. I'm sharing as an RFC to
> > > > stimulate that discussion and also to get any technical feedback
> > > > that I can before submitting to the product kernel that needs this.
> > > > I have developed the patches for ease of rebase, given that this is
> > > > for the moment considered a non-upstreamable feature. It would be
> > > > possible to refactor hangcheck to fully incorporate force preemption
> > > > as another tier of patience (or impatience) with the running request.
> > >
> > > Sorry if it was mentioned elsewhere and I missed it - but does this work
> > > only with stateless clients - or in other words, what would happen to
> > > stateful clients which would be force preempted? Or the answer is we
> > > don't care since they are misbehaving?
> > They get notified of being guilty for causing a gpu reset; three strikes
> > and they are out (banned from using the gpu) using the current rules.
> > This is a very blunt hammer that requires the rest of the system to be
> > robust; one might argue time spent making the system robust would be
> > better served making sure that the timer never expired in the first place
> > thereby eliminating the need for a forced gpu reset.
> > -Chris
> Yes, for simplification the policy applied to force preempted contexts
> is the same as for hanging contexts. It is known that this feature
> should not be required in a fully curated system. It's a requirement
> if end user will be allowed to install 3rd party apps to run in the
> non-critical domain.
Third party code is still mediated by our userspace drivers, or are you
contemplating scenarios where it talks directly to the ioctls? How hostile
a client do we have to contend with, e.g. must we survive a gpu fork bomb?
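
To make the policy under discussion concrete, here is a rough sketch of
"force preempt on timeout, then treat the context like a hanging one".
It is a standalone model, not i915 code: the struct, the function names
and the plain strike counter are hypothetical, and the threshold of 3 is
taken only from the "three strikes" wording above.

```c
#include <stdbool.h>

/* Hypothetical threshold; the thread only says "three strikes". */
#define BAN_THRESHOLD 3

/* Minimal stand-in for a GPU context; not the real driver structure. */
struct ctx_model {
	unsigned int guilty_count;	/* resets blamed on this context */
	bool banned;			/* true once the context is locked out */
};

/*
 * Has the request being preempted blocked for longer than the limit?
 * If so, the proposal is to force it off the engine with an engine reset.
 */
static bool preempt_timed_out(unsigned long elapsed_ms, unsigned long limit_ms)
{
	return elapsed_ms > limit_ms;
}

/*
 * Blame a context for a forced reset, using the same rule as for a hang.
 * Returns true if the context is banned after this strike.
 */
static bool mark_guilty(struct ctx_model *ctx)
{
	if (ctx->banned)
		return true;

	ctx->guilty_count++;
	if (ctx->guilty_count >= BAN_THRESHOLD)
		ctx->banned = true;

	return ctx->banned;
}
```

In this model a misbehaving client survives two forced resets and is
banned on the third, which is why the rest of the system has to be
robust against the resets themselves.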