[Intel-gfx] [PATCH] drm/i915: optionally ban context on first hang

Ville Syrjälä ville.syrjala at linux.intel.com
Tue Sep 10 15:59:31 CEST 2013


On Tue, Sep 10, 2013 at 02:26:51PM +0100, Chris Wilson wrote:
> On Tue, Sep 10, 2013 at 04:16:50PM +0300, Mika Kuoppala wrote:
> > Current policy is to ban context if it manages to hang
> > gpu in a certain time windows. Paul Berry asked if more
> > strict policy could be available for use cases where
> > the application doesn't know if the rendering command stream
> > sent to gpu is valid or not.
> > 
> > Provide an option, flag on context creation time, to let
> > userspace to set more strict policy for handling gpu hangs for
> > this context. If context with this flag set ever hangs the gpu,
> > it will be permanently banned from accessing the GPU.
> > All subsequent batch submissions will return -EIO.
> > 
> > Requested-by: Paul Berry <stereotype441 at gmail.com>
> > Cc: Paul Berry <stereotype441 at gmail.com>
> > Cc: Ben Widawsky <ben at bwidawsk.net>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c         |    3 +++
> >  drivers/gpu/drm/i915/i915_drv.h         |    3 +++
> >  drivers/gpu/drm/i915/i915_gem.c         |    9 ++++++++-
> >  drivers/gpu/drm/i915/i915_gem_context.c |   12 +++++++++---
> >  include/uapi/drm/i915_drm.h             |    5 +++++
> >  5 files changed, 28 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 3de6050..4353458 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1003,6 +1003,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
> >  	case I915_PARAM_HAS_EXEC_HANDLE_LUT:
> >  		value = 1;
> >  		break;
> > +	case I915_PARAM_HAS_CONTEXT_BAN:
> > +		value = 1;
> > +		break;
> 
> As we add the flags, we have a better method for detecting whether the
> context accepts the flags (just request that a first-ban context be
> created and mark the failure as unsupported), and so the getparam is
> redundant.
> 
> >  struct drm_i915_gem_context_create {
> >  	/*  output: id of new context*/
> >  	__u32 ctx_id;
> >  	__u32 pad;
> > +	__u64 flags;
> >  };
> 
> I thought that the size of the ioctl was part of the ABI, but it does
> look like extending it as you have done here is valid. TIL.

Yeah, it does look like drm_ioctl() does allow it, but only for driver
ioctls. For drm core ioctls the kernel still accepts the ioctl, but it
gets the size from the kernel's ioctl->cmd. So depeding on the case the
kernel may read garbage from userspace, overwrite some other userspace
data, not touch some of the data userspace was offering, or just give
back -EFAULT. I guess that's all fine since userspace that does stuff
like that is already buggy.

-- 
Ville Syrjälä
Intel OTC



More information about the Intel-gfx mailing list