[Intel-gfx] [PATCH 03/23] drm/i915: Remove lrc default desc from GEM context
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Aug 1 15:29:53 UTC 2019
On 01/08/2019 12:13, Chris Wilson wrote:
> Quoting Chris Wilson (2019-08-01 11:57:06)
>> Quoting Tvrtko Ursulin (2019-08-01 09:53:15)
>>> We could store it in ce then. We already have well defined control
>>> points for when vm changes when all are updated.
>>
>> We are storing it in ce; it's not like we recompute it all that often,
>> and when we do it's because we have rebound the vma.
>>
>>> If done like this then it looks like assigning ctx->hw_id could also do
>>> the default_desc update, so that we can avoid even more work done at pin
>>> time.
>>
>> What ctx->hw_id? You are imagining things again :-p
>>
>> Remember that we only do this on first pin from idle, not every pin.
>
> Fwiw, I quickly looked at only doing it if the vma is rebound, but
> that's move branches just to save a couple. The low frequency at which
> we have to actually compute this (walk a few more branches inside an
> already branchy contxt_pin) doesn't seem to justify the extra storage for
> me. It's not like we are recomputing lrc_desc on every submit as it once
> was.
On every submit if last request got retired in the meantime, no, for
instance bursty loads? Yeah it is very inconsequential but at some point
we made an effort to cache as much as possible what is invariant so it
saddens me a bit to remove that.
For instance Icelake engine dependent stuff sneaked into
intel_lrc.c/lrc_desriptors at some point, which is also against the
spirit of caching. If we were to move the cached value in ce then we
would be able to remove that and have it once again minimal in there.
Not only just minimal, but not separated in two separate places. I guess
this patch improves things in that respect - consolidates the lrc_desc
computation once again.
I did not get the part about VMA re-binding. I did not suggest to move
the lrca offset into cache as well. I was just thinking about the gen,
engine and vm dependent bits could naturally go into
i915_gem_context.c/default_desc_template. Just need to take (engine,
hw_id, vm).
And virtual engine would have to re-compute it when moving engines. Hm..
we don't seem to do that? Only when pinning we set it up based on
sibling[0] so how it all works? We don't re-pin when moving engine I
thought.
Aside that, if you are still not convinced my argument makes sense, you
can have my ack.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list