[Intel-gfx] [PATCH v3] drm/i915: Add fault injection support
Imre Deak
imre.deak at intel.com
Tue Mar 15 14:01:14 UTC 2016
On ti, 2016-03-15 at 09:56 +0100, Daniel Vetter wrote:
> On Mon, Mar 14, 2016 at 04:59:20PM +0200, Imre Deak wrote:
> > Add support for forcing an error at selected places in the driver.
> > As an
> > example add 4 options to fail during driver loading.
> >
> > Requested by Chris.
> >
> > v2:
> > - Add fault point for modeset initialization
> > - Print debug message when injecting an error
> > v3:
> > - Rename inject_fault to inject_load_failure, rename the related
> > macros
> > and helper accordingly (Chris)
> >
> > CC: Chris Wilson <chris at chris-wilson.co.uk>
> > Signed-off-by: Imre Deak <imre.deak at intel.com>
> > ---
> >
> > [This depends on
> > https://lists.freedesktop.org/archives/intel-gfx/2016-March/089597
> > .html]
> >
> > drivers/gpu/drm/i915/i915_dma.c | 12 ++++++++++++
> > drivers/gpu/drm/i915/i915_drv.h | 16 ++++++++++++++++
> > drivers/gpu/drm/i915/i915_params.c | 5 +++++
> > drivers/gpu/drm/i915/i915_params.h | 1 +
> > 4 files changed, 34 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > b/drivers/gpu/drm/i915/i915_dma.c
> > index a5121cd..7490307 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -370,6 +370,9 @@ static int i915_load_modeset_init(struct
> > drm_device *dev)
> > struct drm_i915_private *dev_priv = dev->dev_private;
> > int ret;
> >
> > + if (i915_inject_load_failure(I915_FAIL_INIT_MODESET))
> > + return -ENODEV;
> > +
> > ret = intel_bios_init(dev_priv);
> > if (ret)
> > DRM_INFO("failed to find VBIOS tables\n");
> > @@ -950,6 +953,9 @@ static int i915_driver_init_early(struct
> > drm_i915_private *dev_priv,
> > struct intel_device_info *device_info;
> > int ret = 0;
> >
> > + if (i915_inject_load_failure(I915_FAIL_INIT_EARLY))
> > + return -ENODEV;
> > +
> > dev_priv->dev = dev;
> >
> > /* Setup the write-once "constant" device info */
> > @@ -1064,6 +1070,9 @@ static int i915_driver_init_mmio(struct
> > drm_i915_private *dev_priv)
> > struct drm_device *dev = dev_priv->dev;
> > int ret;
> >
> > + if (i915_inject_load_failure(I915_FAIL_INIT_MMIO))
> > + return -ENODEV;
> > +
> > if (i915_get_bridge_dev(dev))
> > return -EIO;
>
> Ok, thought a bit more about this, and if we really want to check all the
> load failure paths then this will become extremely verbose. Since we'd
> need to have a bitflag for every return -EFAIL.
I'm not sure if you want to check all failure paths, I think for that
the existing failslab etc. mechanisms are better suited. This new
option would be used at relatively few well defined places. The option
is a mask since Chris wanted the possibility to mix failures (which
makes sense when injecting a non-fatal failure somewhere). If he
doesn't insist on that possibility I can convert the mask option to a
counter based one identifying a single failure spot instead as you
suggest. Chris?
> So maybe another look at
> lib/fault-inject.c is warranted, plus then some macro magic that we could
> to wrap _all_ the checks in our load code like this:
>
> if (i915_load_failure(i915_get_bridge_dev(dev)))
> return -EIO;
The above will not work when we want to propagate the error and I
haven't found a nice way to do that with such a helper taking the
function call as argument. I could move the check to the call site
instead of having it inside the called function, if that's what you'd
prefer:
if ((ret = i915_load_failure(...)) < 0 ||
(ret = i915_driver_init_mmio(...)) < 0)
return ret;
Although I'm not sure if this makes things clearer. Other suggestions are
welcome.
> i915_load_failure ofc needs to fail if it's argument is false, but it
> could also increment a counter every time it's called and then use that
> counter to inquire the fault injection framework with
> should_fail(i915_load_failures, counter). Abusing a counter for the size
> would allow us to easily restrict fault injection to a certain range of
> faults. We could also expose the final value of that counter in debugfs,
> so that the igt can tune itself.
Yes, I could do this provided Chris doesn't oppose the idea of the
counter based approach.
> Anyway, this is kinda a big plan, but the part I think we should ponder is
> how to make the fault injection less noise and intrusive. A macro to wrap
> existing checks seems like a good idea.
I can switch to using a counter instead of the mask and if there is a
good suggestion about the macro I can use that instead.
--Imre
More information about the Intel-gfx
mailing list