[Intel-gfx] [PATCH 1/2] drm/i915: Cache elsp submit register
david.s.gordon at intel.com
Wed Mar 30 15:05:50 UTC 2016
On 22/03/16 17:39, Tvrtko Ursulin wrote:
> On 22/03/16 17:29, Ville Syrjälä wrote:
>> On Tue, Mar 22, 2016 at 05:16:52PM +0000, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Since we write four times to the same register, caching
>>> the mmio register saves a tiny amount of generated code.
>> The compiler can't figure this out on its own?
> Nope, at least gcc 4.8.4 I am running here can't. :(
> And this only solves one part of the things it can't figure out in that
> code. It still recalculates one part, can't remember which one is which
> now without revisiting the generated assembly. It used to be four times
> in a row: load register, add 0x230, displace 0x78, store[0-4]. This only
> solves the add 0x230 redundancy. But working around that would possibly
> be a bit too low level.
Compiler's probably assuming aliasing.
RING_ELSP(engine) is actually (engine->mmio_base+0x230).
I915_WRITE_FW(reg, val) is actually __raw_i915_write32(dev_priv,
(reg__), (val__)) which ultimately translates to a store to some address.
The compiler can't be sure that this store isn't actually to
(engine->mmio_base), so it refetches it and adds the 0x230 again. Saving
the (struct-valued) result of the RING_ELSP() macro means the compiler
knows it isn't aliased, so can reuse it four times.
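
To make the aliasing argument concrete, here is a minimal sketch. All names in it (fake_dev, fake_engine, FAKE_RING_ELSP, fake_write32, submit_uncached, submit_cached) are hypothetical stand-ins, not the real i915 definitions, and a plain uint32_t stands in for the struct-valued register type. Whether a particular compiler really reloads mmio_base in the first variant depends on the compiler and flags, so treat it as an illustration of the reasoning rather than a statement about the generated code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real i915 definitions. */
struct fake_dev {
	uint8_t *regs;		/* base of a pretend MMIO window */
	unsigned int writes;	/* number of stores issued so far */
};

struct fake_engine {
	uint32_t mmio_base;
};

#define FAKE_RING_ELSP(engine)	((engine)->mmio_base + 0x230)

/* A store to "some address"; the compiler cannot prove it misses *engine. */
static inline void fake_write32(struct fake_dev *dev, uint32_t offset,
				uint32_t val)
{
	assert(offset % 4 == 0);
	*(volatile uint32_t *)(dev->regs + offset) = val;
	dev->writes++;
}

/* engine->mmio_base may be refetched, and 0x230 re-added, before each store. */
static void submit_uncached(struct fake_dev *dev, struct fake_engine *engine,
			    const uint32_t desc[4])
{
	fake_write32(dev, FAKE_RING_ELSP(engine), desc[0]);
	fake_write32(dev, FAKE_RING_ELSP(engine), desc[1]);
	fake_write32(dev, FAKE_RING_ELSP(engine), desc[2]);
	fake_write32(dev, FAKE_RING_ELSP(engine), desc[3]);
}

/* The local copy cannot be changed by the stores, so it is computed once. */
static void submit_cached(struct fake_dev *dev, struct fake_engine *engine,
			  const uint32_t desc[4])
{
	const uint32_t elsp = FAKE_RING_ELSP(engine);

	fake_write32(dev, elsp, desc[0]);
	fake_write32(dev, elsp, desc[1]);
	fake_write32(dev, elsp, desc[2]);
	fake_write32(dev, elsp, desc[3]);
}
```

Both variants issue the same four stores to the same offset; the only difference is where the address computation is allowed to happen.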
We could try adding __restrict to various key pointers, starting with
dev_priv and all pointers-to-engines?
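
A rough sketch of what the __restrict idea could look like, again using hypothetical stand-in names (fake_dev, fake_engine, fake_write32, submit_restrict) rather than the real driver code. __restrict is the GCC/Clang spelling; C99 spells it restrict. The qualifier is only a promise from the programmer, so the compiler is allowed, but not obliged, to keep mmio_base + 0x230 in a register across the stores.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real i915 definitions. */
struct fake_dev {
	uint8_t *regs;		/* base of a pretend MMIO window */
};

struct fake_engine {
	uint32_t mmio_base;
};

static inline void fake_write32(struct fake_dev *dev, uint32_t offset,
				uint32_t val)
{
	assert(offset % 4 == 0);
	*(volatile uint32_t *)(dev->regs + offset) = val;
}

/*
 * __restrict promises that, during this call, the objects reached through
 * dev and engine are not modified through any other pointer. If a caller
 * breaks that promise the behaviour is undefined.
 */
static void submit_restrict(struct fake_dev *__restrict dev,
			    struct fake_engine *__restrict engine,
			    const uint32_t desc[4])
{
	fake_write32(dev, engine->mmio_base + 0x230, desc[0]);
	fake_write32(dev, engine->mmio_base + 0x230, desc[1]);
	fake_write32(dev, engine->mmio_base + 0x230, desc[2]);
	fake_write32(dev, engine->mmio_base + 0x230, desc[3]);
}
```

Compared with caching the register in a local, this keeps the call sites unchanged but relies on every caller honouring the no-alias promise.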