[Intel-gfx] [PATCH 10/18] drm/i915: Unify legacy/execlists emission of MI_BATCHBUFFER_START

Dave Gordon david.s.gordon at intel.com
Wed Jul 27 15:04:44 UTC 2016


On 21/07/16 15:14, Chris Wilson wrote:
> On Thu, Jul 21, 2016 at 04:39:58PM +0300, Joonas Lahtinen wrote:
>> On ke, 2016-07-20 at 14:12 +0100, Chris Wilson wrote:
>>>  	if (so.aux_batch_size > 8) {
>>> -		ret = req->engine->dispatch_execbuffer(req,
>>> -						     (so.ggtt_offset +
>>> -						      so.aux_batch_offset),
>>> -						     so.aux_batch_size,
>>> -						     I915_DISPATCH_SECURE);
>>> +		ret = req->engine->emit_bb_start(req,
>>> +						 (so.ggtt_offset +
>>> +						  so.aux_batch_offset),
>>> +						 so.aux_batch_size,
>>> +						 I915_DISPATCH_SECURE);
>>>  		if (ret)
>>>  			goto out;
>>>  	}
>>
>> The code above this line is exact reason why I don't like the a->b->c
>> (especially when there is repetition). But it's not new to this patch
>> so guess it'll do. Some future work to shorten down a little bit might
>> not hurt.
>
> I presume you mean req->engine->x here, not so.y. Is it just the depth
> and saving 5 columns? Or something else?
> -Chris

It can also hurt code efficiency. For example in

int i915_gem_render_state_init(struct drm_i915_gem_request *req)
{
     ret = i915_gem_render_state_prepare(req->engine, &so);
     ...
     ret = req->engine->dispatch_execbuffer(req, so.ggtt_offset,
                 so.rodata->batch_items * 4, I915_DISPATCH_SECURE);
     ...
     if (so.aux_batch_size > 8) {
         ret = req->engine->dispatch_execbuffer(req,
                 (so.ggtt_offset + so.aux_batch_offset),
                 so.aux_batch_size, I915_DISPATCH_SECURE);
                 if (ret)
                         goto out;
         }
     ...
}

The compiler may not -- and in this case, *is not allowed to* -- assume 
that req->engine is the same at each callsite. We have passed the 
non-const pointer "req" to callees, so it must assume that req->engine 
may have been changed and re-fetch it from memory.

By inspection of the generated code, we find that a local for a value 
that is used only once is a net loss, twice is breakeven, and three or 
more times is a definite win.

And it generally makes the code prettier too :)

.Dave.


More information about the Intel-gfx mailing list