[Intel-gfx] [RFC 0/8] Force preemption

Jeff McGee jeff.mcgee at intel.com
Thu Mar 22 14:34:58 UTC 2018

On Thu, Mar 22, 2018 at 09:28:00AM +0000, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-03-22 09:22:55)
> > 
> > On 21/03/2018 17:26, jeff.mcgee at intel.com wrote:
> > > From: Jeff McGee <jeff.mcgee at intel.com>
> > > 
> > > Force preemption uses engine reset to enforce a limit on the time
> > > that a request targeted for preemption can block. This feature is
> > > a requirement in automotive systems where the GPU may be shared by
> > > clients of critically high priority and clients of low priority that
> > > may not have been curated to be preemption friendly. There may be
> > > more general applications of this feature. I'm sharing as an RFC to
> > > stimulate that discussion and also to get any technical feedback
> > > that I can before submitting to the product kernel that needs this.
> > > I have developed the patches for ease of rebase, given that this is
> > > for the moment considered a non-upstreamable feature. It would be
> > > possible to refactor hangcheck to fully incorporate force preemption
> > > as another tier of patience (or impatience) with the running request.
> > 
> > Sorry if it was mentioned elsewhere and I missed it - but does this work 
> > only with stateless clients - or in other words, what would happen to 
> > stateful clients which would be force preempted? Or the answer is we 
> > don't care since they are misbehaving?
> They get notified of being guilty for causing a gpu reset; three strikes
> and they are out (banned from using the gpu) using the current rules.
> This is a very blunt hammer that requires the rest of the system to be
> robust; one might argue time spent making the system robust would be
> better served making sure that the timer never expired in the first place
> thereby eliminating the need for a forced gpu reset.
> -Chris

Yes, for simplicity the policy applied to force-preempted contexts
is the same as for hanging contexts. It is known that this feature
should not be required in a fully curated system. It's a requirement
if end users will be allowed to install 3rd party apps to run in the
non-critical domain.
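
To make the policy discussed above concrete, here is a rough model of it
in Python. It is only a sketch of the behaviour described in this thread,
not the i915 code: the names Context, on_preempt_deadline and
BAN_THRESHOLD, and the millisecond values, are hypothetical. The
threshold of 3 is taken from the "three strikes" remark quoted above.

```python
from dataclasses import dataclass

# Hypothetical threshold, taken from the "three strikes" wording in the
# thread; the real driver's accounting may differ.
BAN_THRESHOLD = 3


@dataclass
class Context:
    """Toy stand-in for a GPU client context (not a driver structure)."""
    name: str
    guilty_count: int = 0
    banned: bool = False


def on_preempt_deadline(ctx: Context, blocked_ms: int, limit_ms: int) -> bool:
    """Apply the hang policy to a request that is blocking preemption.

    Returns True if the model would force an engine reset, i.e. the
    request blocked for longer than the allowed limit.
    """
    if blocked_ms <= limit_ms:
        # The request yielded in time, so no reset and no blame.
        return False

    # Same treatment as a hanging context: mark it guilty of the reset.
    ctx.guilty_count += 1
    if ctx.guilty_count >= BAN_THRESHOLD:
        ctx.banned = True
    return True


if __name__ == "__main__":
    app = Context("third_party_app")
    for _ in range(3):
        on_preempt_deadline(app, blocked_ms=50, limit_ms=10)
    print(app)
```

A context that yields within the limit is never penalised in this model;
only a request that outlasts the limit counts as a strike.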
