[Intel-gfx] [PATCH v5 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers
Chris Wilson
chris at chris-wilson.co.uk
Fri Jun 19 02:27:19 PDT 2015
On Thu, Jun 18, 2015 at 06:33:24PM +0100, Arun Siluvery wrote:
Totally minor worries now.
> +/**
> + * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA
> + *
> + * @ring: only applicable for RCS
> + * @wa_ctx_batch: page in which WA are loaded
> + * @offset: This is for future use in case if we would like to have multiple
> + * batches at different offsets and select them based on a criteria.
> + * @num_dwords: The number of WA applied are known at the beginning, it returns
> + * the no of DWORDS written. This batch does not contain MI_BATCH_BUFFER_END
> + * so it adds padding to make it cacheline aligned. MI_BATCH_BUFFER_END will be
> + * added to perctx batch and both of them together makes a complete batch buffer.
> + *
> + * Return: non-zero if we exceed the PAGE_SIZE limit.
> + */
> +
> +static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring,
> + uint32_t **wa_ctx_batch,
> + uint32_t offset,
> + uint32_t *num_dwords)
> +{
> + uint32_t index;
> + uint32_t *batch = *wa_ctx_batch;
> +
> + index = offset;
I worry that offset need not be cacheline aligned on entry (for example
if indirectctx and perctx were switched, or someone else stuffed more
controls into the per-ring object). Like perctx, there is no mention of
any alignment requirement for the starting location, but here you tell us
that the INDIRECT_CTX length is specified in cachelines, so I presume
the start needs to be cacheline aligned as well.
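As a sketch of what I mean: the batch emitter could pad up to the next
cacheline boundary with MI_NOOPs before returning, so both the start
offset handed to the next batch and the reported length stay aligned
(helper name is hypothetical, constants as on gen8):

```c
#include <stdint.h>

#define CACHELINE_BYTES  64
#define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(uint32_t)) /* 16 dwords */
#define MI_NOOP          0

/* Pad the dword index up to the next cacheline boundary with MI_NOOPs,
 * so the following batch starts cacheline aligned (hypothetical helper,
 * not part of the patch under review). */
static uint32_t wa_ctx_pad_to_cacheline(uint32_t *batch, uint32_t index)
{
	while (index % CACHELINE_DWORDS)
		batch[index++] = MI_NOOP;

	return index;
}
```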
> +static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *ring, u32 size)
> +{
> + int ret;
> +
> + WARN_ON(ring->id != RCS);
> +
> + ring->wa_ctx.obj = i915_gem_alloc_object(ring->dev, PAGE_ALIGN(size));
> + if (!ring->wa_ctx.obj) {
> + DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n");
> + return -ENOMEM;
> + }
> +
> + ret = i915_gem_obj_ggtt_pin(ring->wa_ctx.obj, PAGE_SIZE, 0);
One day _pin() will return the vma being pinned and I will rejoice as it
makes reviewing pinning much easier! Not a problem for you right now.
> +static int intel_init_workaround_bb(struct intel_engine_cs *ring)
> +{
> + int ret = 0;
> + uint32_t *batch;
> + uint32_t num_dwords;
> + struct page *page;
> + struct i915_ctx_workarounds *wa_ctx = &ring->wa_ctx;
> +
> + WARN_ON(ring->id != RCS);
> +
> + if (ring->scratch.obj == NULL) {
> + DRM_ERROR("scratch page not allocated for %s\n", ring->name);
> + return -EINVAL;
> + }
I haven't found the dependence upon scratch.obj, could you explain it?
Does it appear later?
> @@ -1754,15 +1934,26 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
> reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
> reg_state[CTX_SECOND_BB_STATE+1] = 0;
> if (ring->id == RCS) {
> - /* TODO: according to BSpec, the register state context
> - * for CHV does not have these. OTOH, these registers do
> - * exist in CHV. I'm waiting for a clarification */
> reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
> reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
> reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
> reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
> reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
> reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
> + if (ring->wa_ctx.obj) {
> + reg_state[CTX_RCS_INDIRECT_CTX+1] =
> + (i915_gem_obj_ggtt_offset(ring->wa_ctx.obj) +
> + ring->wa_ctx.indctx_batch_offset * sizeof(uint32_t)) |
> + (ring->wa_ctx.indctx_batch_size / CACHELINE_DWORDS);
Ok, this really does impose alignment conditions on indctx_batch_offset.
Oh, and if I do get a chance to complain: spell out indirect_ctx, make it
a struct, or even just precalculate the reg value. indctx's only merit is
that it is the same length as perctx, but it is otherwise quite obtuse.
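By precalculating I mean something like this: fold the GGTT offset, the
byte offset of the batch, and the size in cachelines into one value once,
instead of open-coding the expression in populate_lr_context (function
name here is hypothetical, the bit layout is as in the hunk above):

```c
#include <stdint.h>

#define CACHELINE_DWORDS 16 /* 64-byte cacheline / 4-byte dwords */

/* Precompute the CTX_RCS_INDIRECT_CTX register value: GGTT address of
 * the WA batch in the upper bits, batch size in cachelines in the low
 * bits (hypothetical helper, sketching the suggestion above). */
static uint32_t indirect_ctx_reg_value(uint32_t ggtt_offset,
				       uint32_t batch_offset_dwords,
				       uint32_t batch_size_dwords)
{
	return (ggtt_offset + batch_offset_dwords * (uint32_t)sizeof(uint32_t)) |
	       (batch_size_dwords / CACHELINE_DWORDS);
}
```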
Other than that, I couldn't punch any holes in its robustness, and the
series is starting to look quite good and very neat.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre