[PATCH v3] drm/i915: Allocate intel_engine_cs structure only for the enabled engines

Chris Wilson chris at chris-wilson.co.uk
Wed Jul 27 15:34:58 UTC 2016


On Wed, Jul 27, 2016 at 05:05:27PM +0530, Goel, Akash wrote:
> 
> 
> On 7/27/2016 3:23 PM, Chris Wilson wrote:
> >On Wed, Jul 27, 2016 at 03:27:35PM +0530, akash.goel at intel.com wrote:
> >>From: Akash Goel <akash.goel at intel.com>
> >>
> >>With the possibility of addition of many more number of rings in future,
> >>the drm_i915_private structure could bloat as an array, of type
> >>intel_engine_cs, is embedded inside it.
> >>	struct intel_engine_cs engine[I915_NUM_ENGINES];
> >>Though this is still fine as generally there is only a single instance of
> >>drm_i915_private structure used, but not all of the possible rings would be
> >>enabled or active on most of the platforms. Some memory can be saved by
> >>allocating intel_engine_cs structure only for the enabled/active engines.
> >>Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
> >>indexed using the enums defined in intel_engine_id.
> >>To save memory and continue using the static engine/ring IDs, 'engine' is
> >>defined as an array of pointers.
> >>	struct intel_engine_cs *engine[I915_NUM_ENGINES];
> >>dev_priv->engine[engine_ID] will be NULL for disabled engine instances.
> >>
> >>v2:
> >>- Remove the engine iterator field added in drm_i915_private structure,
> >>  instead pass a local iterator variable to the for_each_engine**
> >>  macros. (Chris)
> >>- Do away with intel_engine_initialized() and instead directly use the
> >>  NULL pointer check on engine pointer. (Chris)
> >>
> >>v3:
> >>- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
> >>  can be used in place of it. (Chris)
> >>- Protect the access to Render engine Fault register with a NULL check, as
> >>  engine specific init is done later in Driver load sequence.
> >
> >What's the impact upon code size? (Always nice to know as the oppposite
> >change was intended to reduce code size.)
> >
> Sorry do you want to check the impact of these modifications on the
> overall code size (i915.o)
> 
> Would there be additional load instructions now ?
> 
> Earlier from the base address of dev_priv structure, the address of
> intel_engine_cs fields can derived as they were at a fixed offset.
> But now an additional load will be required.
> 
> Before,
> 1109804   24212     416 1134432  114f60 ./drivers/gpu/drm/i915/i915.o
> 
> After,
> 1108796   24212     416 1133424  114b70 ./drivers/gpu/drm/i915/i915.o

Wow. That merits futher investigation, that's a nice improvement. I was
expecting an impact from the extra layer of indirection, but I guess
that is more than made up in the loop logic. It certainly gives an extra
justification for the patch: reduced memory usage, reduced code size.
 
> >> 	if (invalidate_domains & I915_GEM_GPU_DOMAINS) {
> >> 		cmd |= MI_INVALIDATE_TLB;
> >>-		if (engine == &dev_priv->engine[VCS])
> >>+		if (engine == dev_priv->engine[VCS])
> >> 			cmd |= MI_INVALIDATE_BSD;
> >> 	}
> >
> >This style just needs to die. if (engine->id == VCS)
> 
> But this cleanup is to be done in future & not in this patch ?

They are in patches elsewhere.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx-trybot mailing list