[Intel-gfx] [PATCH 01/11] drm/i915/gem: Make context persistence optional

Jason Ekstrand jason at jlekstrand.net
Tue Oct 29 16:19:09 UTC 2019


On Fri, Oct 25, 2019 at 4:29 PM Chris Wilson <chris at chris-wilson.co.uk>
wrote:

> Quoting Jason Ekstrand (2019-10-25 19:22:04)
> > On Thu, Oct 24, 2019 at 6:40 AM Chris Wilson <chris at chris-wilson.co.uk>
> > wrote:
> >
> >     Our existing behaviour is to allow contexts and their GPU requests to
> >     persist past the point of closure until the requests are complete. This
> >     allows clients to operate in a 'fire-and-forget' manner where they can
> >     set up a rendering pipeline and hand it over to the display server and
> >     immediately exit. As the rendering pipeline is kept alive until
> >     completion, the display server (or other consumer) can use the results
> >     in the future and present them to the user.
> >
> >     However, not all clients want this persistent behaviour and would prefer
> >     that the contexts are cleaned up immediately upon closure. This ensures
> >     that when clients are run without hangchecking, any GPU hang is
> >     terminated with the process and does not continue to hog resources.
> >
> >     By defining a context property to allow clients to control persistence
> >     explicitly, we can remove the blanket advice to disable hangchecking
> >     that seems to be far too prevalent.
> >
> >
> > Just to be clear, when you say "disable hangchecking" do you mean disabling
> > it for all processes via a kernel parameter at boot time or a sysfs entry
> > or similar?  Or is there some mechanism whereby a context can request no
> > hang checking?
>
> They are being told to use the module parameter i915.enable_hangcheck=0
> to globally disable hangchecking. This is what we are trying to wean
> them off of, while still allowing indefinitely long kernels. The softer
> hangcheck means that if you block scheduling or preemption of
> higher-priority work, you are forcibly removed from the GPU. However,
> even that is too much for some workloads, where they really do expect
> to permanently hog the GPU. (All I can say is that they had better be
> dedicated systems, because if you demand interactivity on top of
> disabling preemption...)
>

Ok, thinking out loud here (no need for you to respond):  Why should we
take this approach?  It seems like there are several other ways we could
solve this:

 1. Have a per-context flag (that's what we did here)
 2. Have a per-execbuf flag for "don't allow this execbuf to outlive the
process".
 3. Have a DRM_IOCTL_I915_KILL_CONTEXT which lets the client manually kill
the context

Option 2 seems like a lot more work in i915 and it doesn't seem that
advantageous.  Most drivers are going to either want their batches to
outlive them or not; they aren't going to be making that decision on a
per-batch basis.

Option 3 would work for some cases, but it doesn't let the kernel terminate
work if the client is killed unexpectedly by, for instance, a segfault.  The
client could try to install a crash handler, but calling a DRM ioctl from a
crash handler sounds like a bad plan.  On the other hand, the client can just
as easily implement option 3 itself by setting the new context flag and then
calling GEM_CONTEXT_DESTROY, as in the sketch below.
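
To illustrate, here's a minimal userspace sketch of that equivalence.  It is
not tested against this series; I've assumed the property lands roughly as
proposed and called it I915_CONTEXT_PARAM_PERSISTENCE below (the final name
and value are whatever the patch ends up defining), and it uses the existing
SETPARAM/DESTROY ioctls from i915_drm.h:

  #include <errno.h>
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <i915_drm.h>   /* via libdrm's include path */

  /* "Kill" a context: mark it non-persistent so that closing it cancels
   * any in-flight requests, then destroy it.  This is option 3 built on
   * top of option 1. */
  static int kill_context(int drm_fd, uint32_t ctx_id)
  {
          struct drm_i915_gem_context_param p = {
                  .ctx_id = ctx_id,
                  .param  = I915_CONTEXT_PARAM_PERSISTENCE, /* assumed name */
                  .value  = 0,                              /* non-persistent */
          };
          struct drm_i915_gem_context_destroy d = {
                  .ctx_id = ctx_id,
          };

          if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p))
                  return -errno;
          if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_CONTEXT_DESTROY, &d))
                  return -errno;
          return 0;
  }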

With that, I think I'm convinced that a context param is the best way to do
this.  We may even consider using it in Vulkan when running headless to let
us kill stuff quicker.  We aren't seeing any long-running Vulkan compute
workloads yet but they may be coming.

Acked-by: Jason Ekstrand <jason at jlekstrand.net>


One more question: Does this bit fully support being turned on and off, or is
it set-once?  I ask because the way I'd likely go about this in Vulkan would
be to set it on context create and then unset it the moment we see a buffer
shared with the outside world.
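
Roughly like this, assuming the bit can indeed be flipped after creation
(same caveat as above: the param name is taken from this proposal):

  /* Toggle persistence on an existing context. */
  static int set_persistence(int drm_fd, uint32_t ctx_id, int persistent)
  {
          struct drm_i915_gem_context_param p = {
                  .ctx_id = ctx_id,
                  .param  = I915_CONTEXT_PARAM_PERSISTENCE,
                  .value  = persistent,
          };

          if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p))
                  return -errno;
          return 0;
  }

  /* At context creation (headless):       set_persistence(fd, ctx, 0);
   * On first export of a shared buffer:   set_persistence(fd, ctx, 1); */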

--Jason