[Intel-gfx] [PATCH] drm/i915: HSW GT3 Slices: exec flag to warn kernel that userspace is using predication

Chris Wilson chris at chris-wilson.co.uk
Sat Nov 2 13:54:55 CET 2013


On Sat, Nov 02, 2013 at 10:49:32AM -0200, Rodrigo Vivi wrote:
> On Fri, Nov 1, 2013 at 8:39 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > On Thu, Oct 31, 2013 at 09:07:09PM -0200, Rodrigo Vivi wrote:
> >> If Userspace isn't using MI_PREDICATE all slices must be enabled for
> >> backward compatibility.
> >>
> >> If I915_EXEC_USE_PREDICATE isn't set and defaul is set to half, kernel will force
> >> all slices on.
> >>
> >> v2: fix the inverted logic for backwards compatibility
> >>     USE_PREDICATE unset force gt_full when defaul is half
> >>     instead of GT_FULL flag.
> >>
> >> v3: Accepting Chris's suggestions: better variable names;
> >>     better logic around state_default x legacy_userspace_busy;
> >>     remove unecessary mutex;
> >>
> >> v4: Accepting more suggestions from Chris:
> >>     * Send all LRIs in only one block and don't ignore if it fails.
> >>     * function name and cleaner code on forcing_full.
> >>
> >> v5: fix mutex_lock use by Chris.
> >>
> >> CC: Chris Wilson <chris at chris-wilson.co.uk>
> >> CC: Eric Anholt <eric at anholt.net>
> >> CC: Kenneth Graunke <kenneth at whitecape.org>
> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at gmail.com>
> >
> > Locking needs major work still.
> 
> What else is wrong besides what you pointed on sysfs?

It has a ripple effect.
 
> >
> >> @@ -935,6 +985,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >>       struct drm_clip_rect *cliprects = NULL;
> >>       struct intel_ring_buffer *ring;
> >>       struct i915_ctx_hang_stats *hs;
> >> +     struct i915_gt_slices *gt_slices = &dev_priv->gt_slices;
> >>       u32 ctx_id = i915_execbuffer2_get_context_id(*args);
> >>       u32 exec_start, exec_len;
> >>       u32 mask, flags;
> >> @@ -999,6 +1050,19 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >>               return -EINVAL;
> >>       }
> >>
> >> +     if (gt_legacy_userspace(ring, args)) {
> >> +             mutex_lock(&gt_slices->lock);
> >> +             if (gt_slices->state_default == 0 &&
> >> +                 !gt_slices->legacy_userspace_busy) {
> >
> > You need to set legacy_userspace_busy if gt_slices->state_default is
> > already 0. Why 0? Why not 1? 1 - for one slice, 2 - for two slices, etc.
> 
> I used 0 for half and 1 for full like I used on the boot parameter.
> I don't mind in changing that if you believe 1 and 2 is more clear...
> but I would have to change all other patches and maybe the sysfs
> interface?

half,full seems reasonable enough, and it will be easy to extend that
interface to include an integer slice count.  Having it as a slice count
it is a lot easier to understand the implications of different values. 
I would also argue that the module parameter should match the sysfs
interface.

> 
> >
> >> +                     ret = gt_legacy_userspace_busy(ring);
> >> +                     if (ret == 0)
> >> +                             gt_slices->legacy_userspace_busy = true;
> >> +             }
> >> +             mutex_unlock(&gt_slices->lock);
> >> +             if (ret)
> >> +                     return ret;
> >> +     }
> >> +
> >>       mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> >>       mask = I915_EXEC_CONSTANTS_MASK;
> >>       switch (mode) {
> >
> >> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> >> index 86ccd52..a821499 100644
> >> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> >> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> >> @@ -135,16 +135,23 @@ static ssize_t gt_slice_config_store(struct device *kdev,
> >>  {
> >>       struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
> >>       struct drm_device *dev = minor->dev;
> >> +     struct drm_i915_private *dev_priv = dev->dev_private;
> >>       int ret;
> >>
> >>       if (!strncmp(buf, "full", sizeof("full") - 1)) {
> >>               ret = intel_set_gt_full(dev);
> >>               if (ret)
> >>                       return ret;
> >> +             mutex_lock(&dev_priv->gt_slices.lock);
> >> +             dev_priv->gt_slices.state_default = 1;
> >> +             mutex_unlock(&dev_priv->gt_slices.lock);
> >>       } else if (!strncmp(buf, "half", sizeof("half") - 1)) {
> >>               ret = intel_set_gt_half(dev);
> >>               if (ret)
> >>                       return ret;
> >> +             mutex_lock(&dev_priv->gt_slices.lock);
> >> +             dev_priv->gt_slices.state_default = 0;
> >> +             mutex_unlock(&dev_priv->gt_slices.lock);
> >>       } else
> >>               return -EINVAL;
> >
> > This is the clearest example that the locking is fubar. Consider a
> > second process that simultaneously tries to change slice config. What
> > state is recorded? What state is the hardware actually in?
> 
> agree... I will just remove it, but what else you see wrong with locking?

You can't just remove the locking here either. The locking has to
protect all register access and bookkeeping so that they are always in
sync.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre



More information about the Intel-gfx mailing list