[Intel-gfx] [PATCH 1/2] drm/i915: Cache elsp submit register

Dave Gordon david.s.gordon at intel.com
Wed Mar 30 15:05:50 UTC 2016

On 22/03/16 17:39, Tvrtko Ursulin wrote:
> On 22/03/16 17:29, Ville Syrjälä wrote:
>> On Tue, Mar 22, 2016 at 05:16:52PM +0000, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Since we write four times to the same register, caching
>>> the mmio register saves a tiny amount of generated code.
>> The compiler can't figure this out on its own?
> Nope, at least gcc 4.8.4 I am running here can't. :(
> And this only solves one part of the things it can't figure out in that
> code. It still recalculates one part, can't remember which one is which
> now without revisiting the generated assembly. It used to be four times
> in a row: load register, add 0x230, displace 0x78, store[0-4]. This only
> solves the add 0x230 redundancy. But working around that would possibly
> be a bit too low level.
> Regards,
> Tvrtko

Compiler's probably assuming aliasing.

RING_ELSP(engine) is actually (engine->mmio_base+0x230).

I915_WRITE_FW(reg__, val__) is actually __raw_i915_write32(dev_priv,
(reg__), (val__)), which ultimately translates to a store to some address.

The compiler can't be sure that this store isn't actually to 
(engine->mmio_base), so it refetches it and adds the 0x230 again. Saving 
the (struct-valued) result of the RING_ELSP() macro means the compiler 
knows it isn't aliased, so can reuse it four times.
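
To make that concrete, here is a minimal sketch of the two forms. It is
not the driver code: struct fake_engine, fake_write32() and the submit_*()
helpers are hypothetical stand-ins, the register window is a plain array
rather than real MMIO, and the register is a bare uint32_t offset rather
than the struct-valued type the real macro yields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the engine struct; not the real i915 type. */
struct fake_engine {
	uint32_t mmio_base;
};

#define FAKE_ELSP_OFFSET 0x230u	/* offset quoted above for RING_ELSP */

/* Lets a caller check that all four stores actually happened. */
static unsigned int fake_write_count;

/* Hypothetical MMIO write: a 32-bit store at a byte offset into regs. */
static void fake_write32(volatile uint32_t *regs, uint32_t byte_offset,
			 uint32_t val)
{
	assert(byte_offset % 4 == 0);
	regs[byte_offset / 4] = val;
	fake_write_count++;
}

/*
 * Uncached form. The store in fake_write32() has the same type as
 * engine->mmio_base, so unless the compiler can prove the two never
 * overlap it has to reload the field and redo the add before each write.
 */
static void submit_uncached(volatile uint32_t *regs,
			    const struct fake_engine *engine,
			    const uint32_t desc[4])
{
	fake_write32(regs, engine->mmio_base + FAKE_ELSP_OFFSET, desc[0]);
	fake_write32(regs, engine->mmio_base + FAKE_ELSP_OFFSET, desc[1]);
	fake_write32(regs, engine->mmio_base + FAKE_ELSP_OFFSET, desc[2]);
	fake_write32(regs, engine->mmio_base + FAKE_ELSP_OFFSET, desc[3]);
}

/*
 * Cached form. elsp is a local whose address is never taken, so no store
 * through regs can modify it and the sum only needs computing once.
 */
static void submit_cached(volatile uint32_t *regs,
			  const struct fake_engine *engine,
			  const uint32_t desc[4])
{
	const uint32_t elsp = engine->mmio_base + FAKE_ELSP_OFFSET;

	fake_write32(regs, elsp, desc[0]);
	fake_write32(regs, elsp, desc[1]);
	fake_write32(regs, elsp, desc[2]);
	fake_write32(regs, elsp, desc[3]);
}
```

Both functions do the same thing when nothing overlaps; the difference
only shows up in the generated code.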

We could try adding __restrict to various key pointers, starting with 
dev_priv and all pointers-to-engines?
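
As a rough sketch of what that might look like, assuming GCC or Clang
(which accept __restrict as their spelling of C99 restrict): struct
sketch_engine and submit_restrict() are made-up names, not driver code,
and whether a given compiler version actually drops the reloads with
volatile stores in the picture would need checking in the generated
assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the engine struct; not the real i915 type. */
struct sketch_engine {
	uint32_t mmio_base;
};

#define SKETCH_ELSP_OFFSET 0x230u	/* offset quoted above for RING_ELSP */

/*
 * __restrict on regs is a promise from the caller that anything stored
 * through regs is not also reached through engine or desc. That permits
 * (but does not force) the compiler to load engine->mmio_base once and
 * reuse the sum for all four stores.
 */
static void submit_restrict(volatile uint32_t *__restrict regs,
			    const struct sketch_engine *__restrict engine,
			    const uint32_t desc[4])
{
	assert(engine->mmio_base % 4 == 0);

	regs[(engine->mmio_base + SKETCH_ELSP_OFFSET) / 4] = desc[0];
	regs[(engine->mmio_base + SKETCH_ELSP_OFFSET) / 4] = desc[1];
	regs[(engine->mmio_base + SKETCH_ELSP_OFFSET) / 4] = desc[2];
	regs[(engine->mmio_base + SKETCH_ELSP_OFFSET) / 4] = desc[3];
}
```

The catch is that the promise is unchecked: a caller that passes
overlapping pointers gets undefined behaviour rather than a diagnostic.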

