[Intel-gfx] [PATCH v5 1/6] drm/i915/gen8: Add infrastructure to initialize WA batch buffers

Siluvery, Arun arun.siluvery at linux.intel.com
Fri Jun 19 02:48:24 PDT 2015


On 19/06/2015 10:27, Chris Wilson wrote:
> On Thu, Jun 18, 2015 at 06:33:24PM +0100, Arun Siluvery wrote:
> Totally minor worries now.
>
>> +/**
>> + * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA
>> + *
>> + * @ring: only applicable for RCS
>> + * @wa_ctx_batch: page in which WA are loaded
>> + * @offset: This is for future use in case if we would like to have multiple
>> + *          batches at different offsets and select them based on a criteria.
>> + * @num_dwords: The number of WA applied are known at the beginning, it returns
>> + * the no of DWORDS written. This batch does not contain MI_BATCH_BUFFER_END
>> + * so it adds padding to make it cacheline aligned. MI_BATCH_BUFFER_END will be
>> + * added to perctx batch and both of them together makes a complete batch buffer.
>> + *
>> + * Return: non-zero if we exceed the PAGE_SIZE limit.
>> + */
>> +
>> +static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring,
>> +				    uint32_t **wa_ctx_batch,
>> +				    uint32_t offset,
>> +				    uint32_t *num_dwords)
>> +{
>> +	uint32_t index;
>> +	uint32_t *batch = *wa_ctx_batch;
>> +
>> +	index = offset;
>
> I worry that offset need not be cacheline aligned on entry (for example
> if indirectctx and perctx were switched, or someone else stuffed more
> controls into the per-ring object). Like perctx, there is no mention of
> any alignment worries for the starting location, but here you tell use
> that the INDIRECT_CTX length is specified in cacheline, so I also presume
> the start needs to be aligned.
>
offset need to be cachealigned, I will update the comments.

>> +static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *ring, u32 size)
>> +{
>> +	int ret;
>> +
>> +	WARN_ON(ring->id != RCS);
>> +
>> +	ring->wa_ctx.obj = i915_gem_alloc_object(ring->dev, PAGE_ALIGN(size));
>> +	if (!ring->wa_ctx.obj) {
>> +		DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	ret = i915_gem_obj_ggtt_pin(ring->wa_ctx.obj, PAGE_SIZE, 0);
>
> One day _pin() will return the vma being pinned and I will rejoice as it
> makes reviewing pinning much easier! Not a problem for you right now.
>
>> +static int intel_init_workaround_bb(struct intel_engine_cs *ring)
>> +{
>> +	int ret = 0;
>> +	uint32_t *batch;
>> +	uint32_t num_dwords;
>> +	struct page *page;
>> +	struct i915_ctx_workarounds *wa_ctx = &ring->wa_ctx;
>> +
>> +	WARN_ON(ring->id != RCS);
>> +
>> +	if (ring->scratch.obj == NULL) {
>> +		DRM_ERROR("scratch page not allocated for %s\n", ring->name);
>> +		return -EINVAL;
>> +	}
>
> I haven't found the dependence upon scratch.obj, could you explain it?
> Does it appear later?
>
yes it does in patch 2 which rearranges init_pipe_control().
I will move this check to that patch as per your comment.

>> @@ -1754,15 +1934,26 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
>>   	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
>>   	reg_state[CTX_SECOND_BB_STATE+1] = 0;
>>   	if (ring->id == RCS) {
>> -		/* TODO: according to BSpec, the register state context
>> -		 * for CHV does not have these. OTOH, these registers do
>> -		 * exist in CHV. I'm waiting for a clarification */
>>   		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
>>   		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
>>   		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
>>   		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
>>   		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
>>   		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
>> +		if (ring->wa_ctx.obj) {
>> +			reg_state[CTX_RCS_INDIRECT_CTX+1] =
>> +				(i915_gem_obj_ggtt_offset(ring->wa_ctx.obj) +
>> +				 ring->wa_ctx.indctx_batch_offset * sizeof(uint32_t)) |
>> +				(ring->wa_ctx.indctx_batch_size / CACHELINE_DWORDS);
>
> Ok, this really does impose alignment conditions on indctx_batch_offset
>
> Oh, if I do get a chance to complain, spell out indirect_ctx, make it a
> struct or even just precalculate the reg value, just indctx's only value
> is that is the same length as perctx, but otherwise quite obtuse.
>
variable names were getting too long and caused difficulties in 
indentation so tried to shorten them, will change this part.

regards
Arun

> Other than that, I couldn't punch any holes in its robustness, and the
> series is starting to look quite good and very neat.
> -Chris
>



More information about the Intel-gfx mailing list