[Intel-gfx] [PATCH 1/4] drm/i915: Clearing buffer objects via blitter engine
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Jul 2 02:30:43 PDT 2015
On 07/01/2015 05:30 PM, Chris Wilson wrote:
> On Wed, Jul 01, 2015 at 03:54:55PM +0100, Tvrtko Ursulin wrote:
>>> +static int i915_gem_exec_flush_object(struct drm_i915_gem_object *obj,
>>> + struct intel_engine_cs *ring,
>>> + struct intel_context *ctx,
>>> + struct drm_i915_gem_request **req)
>>> +{
>>> + int ret;
>>> +
>>> + ret = i915_gem_object_sync(obj, ring, req);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + if (obj->base.write_domain & I915_GEM_DOMAIN_CPU) {
>>> + if (i915_gem_clflush_object(obj, false))
>>> + i915_gem_chipset_flush(obj->base.dev);
>>> + obj->base.write_domain &= ~I915_GEM_DOMAIN_CPU;
>>> + }
>>> + if (obj->base.write_domain & I915_GEM_DOMAIN_GTT) {
>>> + wmb();
>>> + obj->base.write_domain &= ~I915_GEM_DOMAIN_GTT;
>>> + }
>>
>> All this could be replaced with i915_gem_object_set_to_gtt_domain, no?
>
> No. Technically this is i915_gem_execbuffer_move_to_gpu().
Aha, I see now what my confusion was. It doesn't help that
i915_gem_execbuffer_move_to_gpu and execlist_move_to_gpu are implemented
in logically different places.
It would be nice to extract the loop body and call it something like
i915_gem_execbuffer_move_vma_to_gpu; that would avoid at least three
instances of the same code.
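To make the suggestion a bit more concrete, here is a rough userspace mock-up of the shape I have in mind. It is not driver code: every mock_* name is made up for illustration, the counters stand in for the real clflush and wmb() side effects, and the i915_gem_object_sync step is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in domain bits; the real driver uses I915_GEM_DOMAIN_*. */
#define MOCK_DOMAIN_CPU 0x1u
#define MOCK_DOMAIN_GTT 0x2u

/* Stand-in object: only the field the flush logic touches. */
struct mock_obj {
	unsigned int write_domain;
};

/* Counters standing in for the real side effects (clflush, wmb). */
struct mock_stats {
	unsigned int clflushes;
	unsigned int barriers;
};

/*
 * Hypothetical shared loop body: flush pending CPU/GTT writes of a single
 * object before the GPU uses it. Returns 0 on success.
 */
static int mock_move_obj_to_gpu(struct mock_obj *obj, struct mock_stats *st)
{
	assert(obj && st);

	if (obj->write_domain & MOCK_DOMAIN_CPU) {
		st->clflushes++;
		obj->write_domain &= ~MOCK_DOMAIN_CPU;
	}
	if (obj->write_domain & MOCK_DOMAIN_GTT) {
		st->barriers++;
		obj->write_domain &= ~MOCK_DOMAIN_GTT;
	}
	return 0;
}

/* Execbuffer-like caller: many objects, one shared per-object helper. */
static int mock_move_all_to_gpu(struct mock_obj *objs, size_t count,
				struct mock_stats *st)
{
	size_t i;

	for (i = 0; i < count; i++) {
		int ret = mock_move_obj_to_gpu(&objs[i], st);

		if (ret)
			return ret;
	}
	return 0;
}
```

A single-object user such as the clear path would call mock_move_obj_to_gpu directly, while the list walkers keep only the loop.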
>>> +
>>> + return i915.enable_execlists ?
>>> + logical_ring_invalidate_all_caches(*req) :
>>> + intel_ring_invalidate_all_caches(*req);
>>
>> And this is done on actual submission for you by the lower levels so
>> you don't need to call it directly.
>
> What submission? We don't build a batch, we are building a raw request
> to do the operation from the ring.
I was confused about where execlist_move_to_gpu sits in the stack.
>>> + lockdep_assert_held(&dev->struct_mutex);
>>
>> It think there was some guidance that lockdep_assert_held is
>> compiled out when lockdep is not in the kernel and that WARN_ON is
>> preferred. In this case that would probably be WARN_ON_ONCE and
>> return error.
>
> Hah, this predates that and I still disagree.
Whether it predates it or not is not relevant. :) I agree it is not a
clear-cut situation. Maybe we need our own amalgamation of WARN_ON_ONCE
and lockdep_assert_held, but I think we should either check for these
things or not, or else have really good assurance of test coverage with
lockdep enabled during QA.
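As a strawman for what such an amalgamation could look like, here is a userspace mock-up. The lock type, the helper and the -EINVAL return are all made up for illustration. A real version would also have to decide what "held" means when lockdep is compiled out, since mutex_is_locked() only says that somebody holds the lock, not that the caller does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in lock: only tracks whether it is held. */
struct mock_mutex {
	bool locked;
};

/* Number of warnings printed so far, so the behaviour can be checked. */
static unsigned int mock_warn_count;

/*
 * Hypothetical helper: returns true when the lock is NOT held, and prints
 * a warning only the first time that happens for the given call site
 * (tracked through the caller's "warned" flag).
 */
static bool mock_warn_once_if_unlocked(const struct mock_mutex *m,
				       bool *warned, const char *where)
{
	assert(m && warned);

	if (m->locked)
		return false;

	if (!*warned) {
		*warned = true;
		mock_warn_count++;
		fprintf(stderr, "WARNING: lock not held in %s\n", where);
	}
	return true;
}

/* Example call site: warn once, then fail instead of carrying on unlocked. */
static int mock_clear_object(const struct mock_mutex *struct_mutex)
{
	static bool warned;

	if (mock_warn_once_if_unlocked(struct_mutex, &warned, __func__))
		return -EINVAL; /* error code picked arbitrarily for the sketch */

	/* ... the actual work would go here ... */
	return 0;
}
```

The point is only that the check is always compiled in, logs once, and turns the violation into an error return.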
>>> + ring = &dev_priv->ring[HAS_BLT(dev) ? BCS : RCS];
>>> + ctx = i915_gem_context_get(file_priv, DEFAULT_CONTEXT_HANDLE);
>>> + /* Allocate a request for this operation nice and early. */
>>> + ret = i915_gem_request_alloc(ring, ctx, &req);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + if (ctx->ppgtt)
>>> + vm = &ctx->ppgtt->base;
>>> + else
>>> + vm = &dev_priv->gtt.base;
>>> +
>>> + if (i915.enable_execlists && !ctx->engine[ring->id].state) {
>>> + ret = intel_lr_context_deferred_create(ctx, ring);
>>
>> i915_gem_context_get above and this call are very similar to what
>> i915_gem_validate_context does. It seems to me it would be better to
>> call the latter function here.
>
> No, the intel_lrc API is absolute garbage and needs to be taken out the
> back and shot. Until that is done, I wouldn't bother continuing to try
> and use the interface at all.
>
> All that needs to happen here is:
>
> req = i915_gem_request_alloc(ring, ring->default_context);
>
> and for the request/lrc to go off and dtrt.
Well, in the meantime, why duplicate it when i915_gem_validate_context
already does i915_gem_context_get and the deferred create if needed? I
don't see the downside. The snippet from above becomes:

	ring = &dev_priv->ring[HAS_BLT(dev) ? BCS : RCS];
	ctx = i915_gem_validate_context(dev, file, ring,
					DEFAULT_CONTEXT_HANDLE);
	... handle error...
	/* Allocate a request for this operation nice and early. */
	ret = i915_gem_request_alloc(ring, ctx, &req);

Why would this code have to know about deferred create?
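Put differently, the layering I am arguing for looks roughly like this userspace mock-up. The mock_* types and functions are invented for illustration and are not the real i915 structures or signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in context: tracks whether per-engine state has been set up. */
struct mock_ctx {
	bool has_engine_state;	/* stand-in for ctx->engine[ring->id].state */
	unsigned int creates;	/* how many times the deferred create ran */
};

/* Stand-in for the deferred (lazy) per-engine context setup. */
static int mock_deferred_create(struct mock_ctx *ctx)
{
	/* Must only be reached when the state is still missing. */
	assert(!ctx->has_engine_state);

	ctx->has_engine_state = true;
	ctx->creates++;
	return 0;
}

/*
 * Hypothetical validate helper: checks the context and, in execlists mode,
 * makes sure the per-engine state exists. Returns NULL on failure.
 */
static struct mock_ctx *mock_validate_context(struct mock_ctx *ctx,
					      bool execlists)
{
	if (!ctx)
		return NULL;

	if (execlists && !ctx->has_engine_state) {
		if (mock_deferred_create(ctx))
			return NULL;
	}
	return ctx;
}

/* The caller never mentions deferred create at all. */
static int mock_clear_prepare(struct mock_ctx *ctx, bool execlists)
{
	ctx = mock_validate_context(ctx, execlists);
	if (!ctx)
		return -1;

	/* ... allocate the request and emit the commands here ... */
	return 0;
}
```

All the execlists-specific knowledge stays inside the validate helper, so whatever eventually replaces the lrc interface only has to change one place.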
>>> + }
>>> +
>>> + ringbuf = ctx->engine[ring->id].ringbuf;
>>> +
>>> + ret = i915_gem_object_pin(obj, vm, PAGE_SIZE, 0);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + if (obj->tiling_mode && INTEL_INFO(dev)->gen <= 3) {
>>> + ret = i915_gem_object_put_fence(obj);
>>> + if (ret)
>>> + goto unpin;
>>> + }
>>
>> Why is this needed?
>
> Because it's a requirement of the op being used on those gen. Later gen
> can keep the fence.
>
>> Could it be called unconditionally and still work?
>>
>> At least I would recommend a comment explaining it.
It is ugly to sprinkle platform knowledge across the callers. I think I
saw call sites which call i915_gem_object_put_fence unconditionally, so
why would that not work here?
>>> + if (i915.enable_execlists) {
>>> + if (dev_priv->info.gen >= 8) {
>>> + ret = intel_logical_ring_begin(req, 8);
>>> + if (ret)
>>> + goto unpin;
>>> +
>>> + intel_logical_ring_emit(ringbuf, GEN8_COLOR_BLT_CMD |
>>> + BLT_WRITE_RGBA |
>>> + (7-2));
>>> + intel_logical_ring_emit(ringbuf, BPP_32 |
>>> + ROP_FILL_COPY |
>>> + PAGE_SIZE);
>>> + intel_logical_ring_emit(ringbuf, 0);
>>> + intel_logical_ring_emit(ringbuf,
>>> + obj->base.size >> PAGE_SHIFT
>>> + << 16 | PAGE_SIZE / 4);
>>> + intel_logical_ring_emit(ringbuf,
>>> + i915_gem_obj_offset(obj, vm));
>>> + intel_logical_ring_emit(ringbuf, 0);
>>> + intel_logical_ring_emit(ringbuf, 0);
>>> + intel_logical_ring_emit(ringbuf, MI_NOOP);
>>> +
>>> + intel_logical_ring_advance(ringbuf);
>>> + } else {
>>> + DRM_ERROR("Execlists not supported for gen %d\n",
>>> + dev_priv->info.gen);
>>> + ret = -EINVAL;
>>
>> I would put a WARN_ON_ONCE here, or even just return -EINVAL. If the
>> driver is so messed up in general that execlists are enabled < gen8
>> I think there is no point logging errors about it from here. Would
>> also save you one indentation level.
>
> I would just rewrite this to have a logical interface to the rings. Oh
> wait, I did.
That is out of my jurisdiction, but I don't think my comment above is
unreasonable: execlists being enabled below gen8 indicates total driver
confusion and could/should be handled somewhere else.
Regards,
Tvrtko