[Intel-gfx] [PATCH] drm/i915: optionally ban context on first hang
Ville Syrjälä
ville.syrjala at linux.intel.com
Tue Sep 10 15:59:31 CEST 2013
On Tue, Sep 10, 2013 at 02:26:51PM +0100, Chris Wilson wrote:
> On Tue, Sep 10, 2013 at 04:16:50PM +0300, Mika Kuoppala wrote:
> > The current policy is to ban a context if it manages to hang
> > the GPU within a certain time window. Paul Berry asked whether a
> > stricter policy could be available for use cases where
> > the application doesn't know if the rendering command stream
> > sent to the GPU is valid or not.
> >
> > Provide an option, a flag at context creation time, to let
> > userspace set a stricter policy for handling GPU hangs for
> > this context. If a context with this flag set ever hangs the GPU,
> > it will be permanently banned from accessing the GPU.
> > All subsequent batch submissions will return -EIO.
> >
> > Requested-by: Paul Berry <stereotype441 at gmail.com>
> > Cc: Paul Berry <stereotype441 at gmail.com>
> > Cc: Ben Widawsky <ben at bwidawsk.net>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_dma.c | 3 +++
> > drivers/gpu/drm/i915/i915_drv.h | 3 +++
> > drivers/gpu/drm/i915/i915_gem.c | 9 ++++++++-
> > drivers/gpu/drm/i915/i915_gem_context.c | 12 +++++++++---
> > include/uapi/drm/i915_drm.h | 5 +++++
> > 5 files changed, 28 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 3de6050..4353458 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1003,6 +1003,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
> > case I915_PARAM_HAS_EXEC_HANDLE_LUT:
> > value = 1;
> > break;
> > + case I915_PARAM_HAS_CONTEXT_BAN:
> > + value = 1;
> > + break;
>
> As we add the flags, we have a better method for detecting whether the
> context accepts the flags (just request that a first-ban context be
> created and mark the failure as unsupported), and so the getparam is
> redundant.
>
> > struct drm_i915_gem_context_create {
> > /* output: id of new context*/
> > __u32 ctx_id;
> > __u32 pad;
> > + __u64 flags;
> > };
>
> I thought that the size of the ioctl was part of the ABI, but it does
> look like extending it as you have done here is valid. TIL.
Yeah, it does look like drm_ioctl() allows it, but only for driver
ioctls. For drm core ioctls the kernel still accepts the ioctl, but it
takes the size from the kernel's ioctl->cmd. So depending on the case the
kernel may read garbage from userspace, overwrite some other userspace
data, leave some of the data userspace was offering untouched, or just
return -EFAULT. I guess that's all fine since userspace that does stuff
like that is already buggy.
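
To make the size mismatch concrete, here is a small userspace model of the
two cases. It is only a sketch and not the kernel's drm_ioctl() code: the
struct names ctx_create_v1/ctx_create_v2 and both helper functions are
hypothetical, and the zero-filling of fields that userspace did not supply
in the driver ioctl case is an assumption of this model, not something
stated above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts: v1 is the struct before the patch, v2 adds flags. */
struct ctx_create_v1 {
	uint32_t ctx_id;
	uint32_t pad;
};

struct ctx_create_v2 {
	uint32_t ctx_id;
	uint32_t pad;
	uint64_t flags;
};

enum ioctl_kind { DRIVER_IOCTL, CORE_IOCTL };

/*
 * Model of how many bytes get copied in from the user pointer.
 * Driver ioctl: the size encoded in userspace's cmd is honoured.
 * Core ioctl: the size comes from the kernel's own ioctl->cmd.
 */
static size_t bytes_copied_in(enum ioctl_kind kind, size_t user_size,
			      size_t kernel_size)
{
	return kind == DRIVER_IOCTL ? user_size : kernel_size;
}

/*
 * Driver ioctl case (assumption: the tail is zero-filled): copy only
 * what userspace supplied, so an old binary sees flags == 0.
 */
static void driver_copy_in(struct ctx_create_v2 *kdata, const void *udata,
			   size_t user_size)
{
	size_t n = user_size < sizeof(*kdata) ? user_size : sizeof(*kdata);

	memset(kdata, 0, sizeof(*kdata));
	memcpy(kdata, udata, n);
}
```

In this model an old binary passing the 8-byte struct to a driver ioctl has
8 bytes copied, while a core ioctl with a 16-byte kernel struct would copy
16 bytes from the same pointer, which is where the garbage reads, clobbered
userspace data, or -EFAULT come from.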
--
Ville Syrjälä
Intel OTC