[Intel-gfx] [PATCH] drm/i915: tidy up request alloc

Chris Wilson chris at chris-wilson.co.uk
Fri Jul 1 18:34:51 UTC 2016


On Fri, Jul 01, 2016 at 05:58:18PM +0100, Dave Gordon wrote:
> On 30/06/16 13:49, Tvrtko Ursulin wrote:
> >
> >On 30/06/16 11:22, Chris Wilson wrote:
> >>On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
> >>>
> >>>On 30/06/16 02:35, Hong Liu wrote:
> >>>>Return the allocated request pointer directly to remove
> >>>>the double pointer parameter.
> >>>>
> >>>>Signed-off-by: Hong Liu <hong.liu at intel.com>
> >>>>---
> >>>>  drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
> >>>>  1 file changed, 7 insertions(+), 18 deletions(-)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/i915/i915_gem.c
> >>>>b/drivers/gpu/drm/i915/i915_gem.c
> >>>>index 1d98782..9881455 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>>>@@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref
> >>>>*req_ref)
> >>>>      kmem_cache_free(req->i915->requests, req);
> >>>>  }
> >>>>
> >>>>-static inline int
> >>>>+static inline struct drm_i915_gem_request *
> >>>>  __i915_gem_request_alloc(struct intel_engine_cs *engine,
> >>>>-             struct i915_gem_context *ctx,
> >>>>-             struct drm_i915_gem_request **req_out)
> >>>>+             struct i915_gem_context *ctx)
> >>>>  {
> >>>>      struct drm_i915_private *dev_priv = engine->i915;
> >>>>      unsigned reset_counter =
> >>>>i915_reset_counter(&dev_priv->gpu_error);
> >>>>      struct drm_i915_gem_request *req;
> >>>>      int ret;
> >>>>
> >>>>-    if (!req_out)
> >>>>-        return -EINVAL;
> >>>>-
> >>>>-    *req_out = NULL;
> >>>>-
> >>>>      /* ABI: Before userspace accesses the GPU (e.g. execbuffer),
> >>>>report
> >>>>       * EIO if the GPU is already wedged, or EAGAIN to drop the
> >>>>struct_mutex
> >>>>       * and restart.
> >>>>       */
> >>>>      ret = i915_gem_check_wedge(reset_counter,
> >>>>dev_priv->mm.interruptible);
> >>>>      if (ret)
> >>>>-        return ret;
> >>>>+        return ERR_PTR(ret);
> >>>>
> >>>>      req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
> >>>>      if (req == NULL)
> >>>>-        return -ENOMEM;
> >>>>+        return ERR_PTR(-ENOMEM);
> >>>>
> >>>>      ret = i915_gem_get_seqno(engine->i915, &req->seqno);
> >>>>      if (ret)
> >>>>@@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct
> >>>>intel_engine_cs *engine,
> >>>>      if (ret)
> >>>>          goto err_ctx;
> >>>>
> >>>>-    *req_out = req;
> >>>>-    return 0;
> >>>>+    return req;
> >>>>
> >>>>  err_ctx:
> >>>>      i915_gem_context_unreference(ctx);
> >>>>  err:
> >>>>      kmem_cache_free(dev_priv->requests, req);
> >>>>-    return ret;
> >>>>+    return ERR_PTR(ret);
> >>>>  }
> >>>>
> >>>>  /**
> >>>>@@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
> >>>>  i915_gem_request_alloc(struct intel_engine_cs *engine,
> >>>>                 struct i915_gem_context *ctx)
> >>>>  {
> >>>>-    struct drm_i915_gem_request *req;
> >>>>-    int err;
> >>>>-
> >>>>      if (ctx == NULL)
> >>>>          ctx = engine->i915->kernel_context;
> >>>>-    err = __i915_gem_request_alloc(engine, ctx, &req);
> >>>>-    return err ? ERR_PTR(err) : req;
> >>>>+    return __i915_gem_request_alloc(engine, ctx);
> >>>>  }
> >>>>
> >>>>  struct drm_i915_gem_request *
> >>>>
> >>>
> >>>Looks good to me. And I have this feeling I've seen this somewhere before.
> >>
> >>Several times. This is not the full tidy, nor does it realise the
> >>ramifications of request alloc through the stack.
> >
> >Hm I can't spot that it is doing anything wrong or making anything
> >worse. You don't want to let the small cleanup in?
> >
> >Regards,
> >Tvrtko
> 
> It ought to make almost no difference, because the *only* place the
> inner function is called is from the outer one, which passes a
> pointer to a local for the returned object; and the inner one is
> then inlined, so the compiler never actually puts the local on the
> stack or emits a call to the inner allocator anyway.
> 
> Strangely, however, with this change the code becomes ~400 bytes bigger!
> 
> Disassembly reveals that while the code for the externally-callable
> outer function is indeed almost identical, a second copy of it has
> also been inlined at the one callsite in this file:
> 
> __i915_gem_object_sync() ...
> 	req = i915_gem_request_alloc(to, NULL);
> 
> I don't think that's a critical path and would rather have 400 bytes
> smaller codespace. We can get that back by adding /noinline/ to the
> outer function i915_gem_request_alloc() (not, of course, to the
> inner one, that definitely *should* be inline).

__i915_gem_object_sync() should not be calling i915_gem_request_alloc().

That's the issue with this patch, your patch and John's patch.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
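
For readers unfamiliar with the idiom the quoted patch switches to, here is a minimal userspace sketch of the two calling conventions being compared: an int return plus a double-pointer out-parameter, versus returning the pointer directly with the error code encoded in it. ERR_PTR(), PTR_ERR() and IS_ERR() below are simplified stand-ins for the kernel helpers in <linux/err.h>. struct request, request_alloc_out() and request_alloc() are hypothetical names invented for this illustration and are not the i915 functions; the wedged flag only imitates the early-exit check in the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical request object, only for this illustration. */
struct request {
	unsigned int seqno;
};

/* Old convention: status in the return value, object via out-parameter. */
static int request_alloc_out(int wedged, struct request **req_out)
{
	struct request *req;

	if (!req_out)
		return -EINVAL;
	*req_out = NULL;

	if (wedged)
		return -EIO;

	req = calloc(1, sizeof(*req));
	if (!req)
		return -ENOMEM;

	req->seqno = 1;
	*req_out = req;
	return 0;
}

/* New convention: return the object, or an errno encoded as a pointer. */
static struct request *request_alloc(int wedged)
{
	struct request *req;

	if (wedged)
		return ERR_PTR(-EIO);

	req = calloc(1, sizeof(*req));
	if (!req)
		return ERR_PTR(-ENOMEM);

	req->seqno = 1;
	return req;
}
```

A caller of the second form checks IS_ERR(req) and, on failure, recovers the code with PTR_ERR(req). No NULL check of an out-parameter is needed, which is why the patch can drop the -EINVAL path.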

