[PATCH 0/9] Visible VRAM Management Improvements

Fri Jun 23 23:16:27 UTC 2017

On Fri, Jun 23, 2017 at 05:02:58PM -0400, Felix Kuehling wrote:
> Hi John,
> 
> I haven't read your patches. Just a question based on the cover letter.
> 
> I understand that visible VRAM is the biggest pain point. But could the
> same reasoning make sense for invisible VRAM? That is, doing all the
> migrations to VRAM in a workqueue?
> 
> Regards,
>   Felix
> 

I don't see why not. In theory, all non-essential buffer moves could be done
this way, and it would be relatively trivial to extend it to that.

But I wanted to limit the scope of my changes, at least for this series.
Testing takes a long time and I wanted to focus those testing efforts as much
as possible, produce something well-tested (I hope), and get feedback on this
limited application of the concept before expanding its reach.

John

> 
> On 17-06-23 01:39 PM, John Brooks wrote:
> > This patch series is intended to improve performance when limited CPU-visible
> > VRAM is under pressure.
> >
> > Moving BOs into visible VRAM is essentially a housekeeping task. It's faster to
> > access them in VRAM than GTT, but it isn't a hard requirement for them to be in
> > VRAM. As such, it is unnecessary to spend valuable time blocking on this in the
> > page fault handler or during command submission. Doing so translates directly
> > into a longer frame time (ergo stalls and stuttering).
> >
> > The problem worsens when attempting to move BOs into visible VRAM when it is
> > full. This takes much longer than a simple move because other BOs have to be
> > evicted, which involves finding and then moving potentially hundreds of other
> > BOs, which is very time consuming. In the case of limited visible VRAM, it's
> > important to do this sometime to keep the contents of visible VRAM fresh, but
> > it does not need to be a blocking operation. If visible VRAM is full, the BO
> > can be read from GTT in the meantime and the BO can be moved to VRAM later.
> >
> > Thus, I have made it so that neither the command submission code nor page fault
> > handler spends time evicting BOs from visible VRAM, and instead this is
> > deferred to a workqueue function that's queued when CS requests BOs flagged
> > AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED.
> >
> > Speaking of CPU_ACCESS_REQUIRED, I've changed the handling of that flag so that
> > the kernel driver can clear it later even if it was set by userspace. This is
> > because the userspace graphics library can't know whether the application
> > really needs it to be CPU_ACCESS_REQUIRED forever. The kernel driver can't know
> > either, but it does know when page faults occur, and if a BO doesn't appear to
> > have any page faults when it's moved somewhere inaccessible, the flag can be
> > removed and it doesn't have to take up space in CPU-visible memory anymore.
> > This change was based on IRC discussions with Michel.
> >
> > Patch 7 fixes a problem with BO moverate throttling that causes visible VRAM
> > moves to not be throttled if total VRAM isn't full enough.
> >
> > I've also added a vis_vramlimit module parameter for debugging purposes. It's
> > similar to the vramlimit parameter except it limits only visible VRAM.
> >
> > I have tested this patch set with the two games I know to be affected by
> > visible VRAM pressure: DiRT Rally and Dying Light. It practically eliminates
> > eviction-related stuttering in DiRT Rally as well as very low performance if
> > visible VRAM is limited to 64MB. It also fixes severely low framerates that
> > occurred in some areas of Dying Light. All my testing was done with an R9 290
> > with 4GB of visible VRAM with an Intel i7 4790.
> >
> > --
> > John Brooks (Frogging101)
> >
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>