[PATCH] fixes to drm-next - TTM DMA code (v1)

Konrad Rzeszutek Wilk konrad.wilk at oracle.com
Tue Dec 20 09:35:52 PST 2011

On Mon, Dec 19, 2011 at 08:51:15PM +0100, Thomas Hellstrom wrote:
> On 12/13/2011 05:40 PM, Konrad Rzeszutek Wilk wrote:
> >On Tue, Dec 13, 2011 at 05:23:30PM +0100, Thomas Hellstrom wrote:
> >>On 12/13/2011 05:07 PM, Jerome Glisse wrote:
> >>>On Mon, Dec 12, 2011 at 03:09:26PM -0500, Konrad Rzeszutek Wilk wrote:
> >>>>Jerome pointed me to some accounting error in the DMA API debugging code and
> >>>>while I can't figure it out yet, I did notice some extreme slowness - which
> >>>>is due to the nouveau driver calling the unpopulate (now that unbind +
> >>>>unpopulate are squashed) quite a lot (this is using Gnome Shell - I think GNOME2
> >>>>did not have those issues but I can't recall).
> >>>>
> >>>>Anyhow these patches fix the 50% perf regression I saw and also some minor bugs
> >>>>that I noticed.
> >>>>
> >>>Gonna review those today and test them.
> >>>
> >>>Cheers,
> >>>Jerome
> >>Hi!
> >>
> >>I'm not sure whether any drivers are still using the AGP backend?
> >Uh, probably they do if the cards are AGP?
> >The problem I encountered was with a PCIe Nvidia card:
> >
> >01:00.0 VGA compatible controller: nVidia Corporation G84 [GeForce 8600 GT] (rev a1)
> >
> >>Calling unpopulate / (previous clear) each time unbind is done
> >>should be quite
> >>inefficient with that one, as AGP sets up its own data structures
> >>and copies page tables
> >>on each populate. That should really be avoided unless there is a
> >>good reason to have it.
> >nouveau_bo_rd32 and nv50_crtc_cursor_set showed up as the callers that
> >were causing the unpopulate calls. It did happen _a lot_ when I moved the
> >cursor madly.
> Konrad, Jerome
> Was there a resolution to this? If the ttm_tt rewrite results in
> unpopulate being called more often than before,
> and that causes performance regressions, that must be fixed as soon
> as possible.

I am waiting for Jerome to look over my patches. Next week I was
going to set up a test bed with a Radeon and an Nvidia AGP card
and boot two kernels: one without the ttm_bind/ttm_populate
squashing and one with it.

Then run some benchmark code. Though I am not sure what kind of benchmark
code I should run. Any thoughts? The regression I observed was just from
moving the mouse around, but that is not very "scientific".

openarena? alien? etuxracer? Do any of them offer a timed demo mode
to get an idea of the performance impact?
