[Intel-gfx] [PATCH 2/2] drm/i915: Keep the per-object list of VMAs under control

Mon Feb 1 13:29:16 UTC 2016

On 01/02/16 11:12, Chris Wilson wrote:
> On Mon, Feb 01, 2016 at 11:00:08AM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Where objects are shared across contexts and heavy rendering
>> is in progress, the execlist retired request queue will grow
>> unbounded until the GPU is idle enough for the retire worker
>> to run and call intel_execlists_retire_requests.
>>
>> With some workloads, for example gem_close_race, that never
>> happens, causing the shared object VMA list to grow to epic
>> proportions, which in turn causes retirement call sites to
>> spend linearly more and more time walking the obj->vma_list.
>>
>> The end result is the above-mentioned test case taking ten
>> minutes to complete and using up more than a GiB of RAM just
>> for the VMA objects.
>>
>> If we instead trigger the execlist housekeeping a bit more
>> often, obj->vma_list will be kept in check by virtue of
>> context cleanup running and zapping the inactive VMAs.
>>
>> This makes the test case an order of magnitude faster and brings
>> memory use back to normal.
>>
>> This also makes the code more self-contained since the
>> intel_execlists_retire_requests call-site is now in a more
>> appropriate place and implementation leakage is somewhat
>> reduced.
>
> However, this then causes a perf regression since we unpin the contexts
> too frequently and do not have any mitigation in place yet.

I suppose that is possible. What takes the most time, page table clears
on VMA unbinds? It is just that this looks so bad at the moment. :(
Luckily it is just IGT.
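
For illustration only, the growth described in the commit message can be
modelled in userspace roughly as below. This is a sketch under stated
assumptions, not the driver code: struct vma, struct object,
walk_vma_list(), zap_inactive() and simulate() are all hypothetical
names, every request is assumed to add one VMA to a single shared
object, and "housekeeping" simply frees inactive VMAs every N requests.
It only counts list nodes visited, so it says nothing about the context
unpin cost raised in the reply.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VMA linked on an object's vma_list. */
struct vma {
	struct vma *next;
	int active;	/* non-zero while the VMA is still in use */
};

/* Hypothetical stand-in for a GEM object shared across contexts. */
struct object {
	struct vma *vma_list;
	size_t nr_vmas;
};

/* Walk the list as a retirement call site would; returns nodes visited. */
static size_t walk_vma_list(const struct object *obj)
{
	size_t visited = 0;

	for (const struct vma *v = obj->vma_list; v; v = v->next)
		visited++;
	return visited;
}

/* Free every inactive VMA, modelling context cleanup. */
static void zap_inactive(struct object *obj)
{
	struct vma **p = &obj->vma_list;

	while (*p) {
		struct vma *v = *p;

		if (!v->active) {
			*p = v->next;
			free(v);
			obj->nr_vmas--;
		} else {
			p = &v->next;
		}
	}
}

/*
 * Submit nr_requests against one shared object. Each request adds one
 * VMA that is immediately inactive (a simplification), then the list is
 * walked once. A non-zero housekeeping_interval zaps inactive VMAs every
 * that many requests; zero models the retire worker never running.
 * Returns the total number of list nodes visited.
 */
static size_t simulate(size_t nr_requests, size_t housekeeping_interval)
{
	struct object obj = { NULL, 0 };
	size_t total = 0;

	for (size_t i = 1; i <= nr_requests; i++) {
		struct vma *v = malloc(sizeof(*v));

		if (!v)
			break;
		v->active = 0;
		v->next = obj.vma_list;
		obj.vma_list = v;
		obj.nr_vmas++;

		total += walk_vma_list(&obj);

		if (housekeeping_interval && i % housekeeping_interval == 0)
			zap_inactive(&obj);
	}

	zap_inactive(&obj);	/* release whatever is left */
	return total;
}
```

In this model simulate(1000, 0) visits 500500 nodes, while
simulate(1000, 10) visits 5500, which is the quadratic versus linear
difference in total walking cost the commit message is describing.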

Regards,

Tvrtko