[Intel-gfx] [PATCH] drm/i915: tidy up request alloc
Chris Wilson
chris at chris-wilson.co.uk
Fri Jul 1 18:34:51 UTC 2016
On Fri, Jul 01, 2016 at 05:58:18PM +0100, Dave Gordon wrote:
> On 30/06/16 13:49, Tvrtko Ursulin wrote:
> >
> >On 30/06/16 11:22, Chris Wilson wrote:
> >>On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
> >>>
> >>>On 30/06/16 02:35, Hong Liu wrote:
> >>>>Return the allocated request pointer directly to remove
> >>>>the double pointer parameter.
> >>>>
> >>>>Signed-off-by: Hong Liu <hong.liu at intel.com>
> >>>>---
> >>>> drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
> >>>> 1 file changed, 7 insertions(+), 18 deletions(-)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/i915/i915_gem.c
> >>>>b/drivers/gpu/drm/i915/i915_gem.c
> >>>>index 1d98782..9881455 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>>>@@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref
> >>>>*req_ref)
> >>>> kmem_cache_free(req->i915->requests, req);
> >>>> }
> >>>>
> >>>>-static inline int
> >>>>+static inline struct drm_i915_gem_request *
> >>>> __i915_gem_request_alloc(struct intel_engine_cs *engine,
> >>>>- struct i915_gem_context *ctx,
> >>>>- struct drm_i915_gem_request **req_out)
> >>>>+ struct i915_gem_context *ctx)
> >>>> {
> >>>> struct drm_i915_private *dev_priv = engine->i915;
> >>>> unsigned reset_counter =
> >>>>i915_reset_counter(&dev_priv->gpu_error);
> >>>> struct drm_i915_gem_request *req;
> >>>> int ret;
> >>>>
> >>>>- if (!req_out)
> >>>>- return -EINVAL;
> >>>>-
> >>>>- *req_out = NULL;
> >>>>-
> >>>> /* ABI: Before userspace accesses the GPU (e.g. execbuffer),
> >>>>report
> >>>> * EIO if the GPU is already wedged, or EAGAIN to drop the
> >>>>struct_mutex
> >>>> * and restart.
> >>>> */
> >>>> ret = i915_gem_check_wedge(reset_counter,
> >>>>dev_priv->mm.interruptible);
> >>>> if (ret)
> >>>>- return ret;
> >>>>+ return ERR_PTR(ret);
> >>>>
> >>>> req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
> >>>> if (req == NULL)
> >>>>- return -ENOMEM;
> >>>>+ return ERR_PTR(-ENOMEM);
> >>>>
> >>>> ret = i915_gem_get_seqno(engine->i915, &req->seqno);
> >>>> if (ret)
> >>>>@@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct
> >>>>intel_engine_cs *engine,
> >>>> if (ret)
> >>>> goto err_ctx;
> >>>>
> >>>>- *req_out = req;
> >>>>- return 0;
> >>>>+ return req;
> >>>>
> >>>> err_ctx:
> >>>> i915_gem_context_unreference(ctx);
> >>>> err:
> >>>> kmem_cache_free(dev_priv->requests, req);
> >>>>- return ret;
> >>>>+ return ERR_PTR(ret);
> >>>> }
> >>>>
> >>>> /**
> >>>>@@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
> >>>> i915_gem_request_alloc(struct intel_engine_cs *engine,
> >>>> struct i915_gem_context *ctx)
> >>>> {
> >>>>- struct drm_i915_gem_request *req;
> >>>>- int err;
> >>>>-
> >>>> if (ctx == NULL)
> >>>> ctx = engine->i915->kernel_context;
> >>>>- err = __i915_gem_request_alloc(engine, ctx, &req);
> >>>>- return err ? ERR_PTR(err) : req;
> >>>>+ return __i915_gem_request_alloc(engine, ctx);
> >>>> }
> >>>>
> >>>> struct drm_i915_gem_request *
> >>>>
> >>>
> >>>Looks good to me. And have this feeling I've seen this somewhere before.
> >>
> >>Several times. This is not the full tidy, nor does it realise the
> >>ramifactions of request alloc through the stack.
> >
> >Hm I can't spot that it is doing anything wrong or making anything
> >worse. You don't want to let the small cleanup in?
> >
> >Regards,
> >Tvrtko
>
> It ought to make almost no difference, because the *only* place the
> inner function is called is from the outer one, which passes a
> pointer to a local for the returned object; and the inner one is
> then inlined, so the compiler doesn't actually put it on the stack
> and call to the inner allocator anyway.
>
> Strangely, however, with this change the code becomes ~400 bytes bigger!
>
> Disassembly reveals that while the code for the externally-callable
> outer function is indeed almost identical, a second copy of it has
> also been inlined at the one callsite in this file:
>
> __i915_gem_object_sync() ...
> req = i915_gem_request_alloc(to, NULL);
>
> I don't think that's a critical path and would rather have 400 bytes
> smaller codespace. We can get that back by adding /noinline/ to the
> outer function i915_gem_request_alloc() (not, of course, to the
> inner one, that definitely *should* be inline).
__i915_gem_object_sync() should not be calling i915_gem_request_alloc().
That's the issue with this patch, your patch and John's patch.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list