[Nouveau] low memory

Xavier Chantry chantry.xavier at gmail.com
Tue Feb 9 11:49:45 PST 2010

12:08 < curro_> shining: hmm, it seems, darktama didn't quite finish
the additional reloc checking he started to code
12:11 < curro_> shining: that would have solved your problem, poke him
when he's back from vacations :)
12:16 < shining> curro_: hmm I really dont get it, it looks like
domain can have both set, and flags can also have both set
12:16 < shining> I want to look at the reloc checking, what made you
say he didnt finish ?
12:23 < curro_> shining: when you pin a BO, it can't end up in several
locations at the same time :P
12:23 < curro_> he implemented the necessary stuff to track available
aperture space from userspace
12:23 < curro_> but he didn't make the reloc functions check if the
buffers would actually fit

/me pokes darktama :)
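
To make curro's point concrete, here is a minimal sketch of the kind of check he describes: account for the aperture space that is still free, and refuse a reloc whose buffer would not fit. It is not the attached patch and not the libdrm_nouveau API. The names `struct aperture`, `aperture_reserve`, `DOMAIN_VRAM` and `DOMAIN_GART` are hypothetical, and trying VRAM before GART is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory domains a buffer may be allowed in (bit flags). */
enum { DOMAIN_VRAM = 1 << 0, DOMAIN_GART = 1 << 1 };

/* Hypothetical userspace accounting of the aperture space still free. */
struct aperture {
    uint64_t vram_free;
    uint64_t gart_free;
};

/*
 * Reserve space for one buffer referenced by a reloc.
 * A buffer may allow several domains, but once pinned it lives in
 * exactly one, so pick the first allowed domain with enough room
 * (VRAM first: an assumption). Returns the chosen domain, or 0 when
 * the buffer fits nowhere, so the caller can flush or fall back
 * before queueing the reloc.
 */
int aperture_reserve(struct aperture *ap, uint64_t size, int allowed)
{
    assert(ap != NULL);
    if ((allowed & DOMAIN_VRAM) && size <= ap->vram_free) {
        ap->vram_free -= size;
        return DOMAIN_VRAM;
    }
    if ((allowed & DOMAIN_GART) && size <= ap->gart_free) {
        ap->gart_free -= size;
        return DOMAIN_GART;
    }
    return 0;
}
```

For scale: a 3500x2500 pixmap at an assumed 4 bytes per pixel is 35,000,000 bytes, already more than half of 64 MB, so a second buffer of that size cannot be pinned in VRAM alongside the first.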

Let me remind you of my wonderful test case: loading a 3500x2500
pixmap in Firefox with 64 MB of VRAM.

After talking a bit more with curro, I started to write a patch. I
don't know how bad or wrong it is; there are still so many things I
don't understand.
It seems to work to some extent, meaning OUT_RELOC -> emit_reloc now
fails before FIRE_RING -> pushbuf_flush does.
But ENOMEM failures during pushbuf_flush still happen. And worse, what
happens after an OUT_RELOC failure is awful:
1) on nv25, the system freezes for 5 seconds, and afterwards the lower
part (a rectangle) of the picture seems to have a wrong offset, or
2) on nv84 (hacked to force 64 MB of VRAM), X crashes because of a bug
in nouveau_wfb.c. After fixing that, the pixmap is displayed correctly,
but only *after* the system freezes for between 1min30 and 2min.

(There are several options for fixing the imprecision bug in the fast
divide in nouveau_wfb.c, but I would like to be able to run this code
in a normal situation, without crazy system freezes and extreme
slowness, so that I can hopefully do proper benchmarking of the
different options :) )
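
As a generic illustration of how a "fast divide" can lose precision, the sketch below replaces n / d by a multiplication with a precomputed fixed-point reciprocal and a shift. It is not the code from nouveau_wfb.c, and it does not claim to reproduce the actual bug there. The names `recip_floor`, `recip_ceil`, `fast_div` and the 16-bit `SHIFT` are all hypothetical, picked so that the error is easy to trigger.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-point precision; deliberately small for this illustration. */
#define SHIFT 16

/* Reciprocal rounded down: underestimates 1/d unless d is a power of two. */
uint32_t recip_floor(uint32_t d)
{
    assert(d != 0);
    return (uint32_t)((1ull << SHIFT) / d);
}

/* Reciprocal rounded up: exact as long as n * (m*d - 2^SHIFT) < 2^SHIFT. */
uint32_t recip_ceil(uint32_t d)
{
    assert(d != 0);
    return (uint32_t)(((1ull << SHIFT) + d - 1) / d);
}

/* Approximate n / d as (n * m) >> SHIFT, where m is a reciprocal of d. */
uint32_t fast_div(uint32_t n, uint32_t m)
{
    return (uint32_t)(((uint64_t)n * m) >> SHIFT);
}
```

With d = 3, the rounded-down reciprocal is 21845, and 3 * 21845 = 65535 falls just short of 2^16, so fast_div(3, recip_floor(3)) gives 0 instead of 1. The rounded-up reciprocal 21846 gives the exact quotient for every n below 32768. Which trade-off (rounding up, a wider shift, or a correction step) is fastest in practice is exactly what benchmarking would have to show.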

I ran oprofile on nv25 in these two configurations:
1) the previous workaround of making nouveau_exa_create_pixmap always
fail: performance still acceptable (early fallback)
2) runtime OUT_RELOC failure and fallback: turtle speed (late fallback)

The commit that implemented workaround 1 for 32 MB of VRAM says:
   exa: force the use of sysmem pixmaps on low-mem cards
   Very similar effect to forcing MigrationHeuristic "greedy" on classic
   EXA.  Far better than the migration ping-pong that'd occur otherwise

I suppose that arch/x86/mm/pageattr.c showing up in the profile and
pixman_blt_mmx taking ages are both consequences of that migration
ping-pong?
But I still don't understand what is going on, which migrations are
made, or how to limit them.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-check-memory-for-relocs.patch
Type: text/x-patch
Size: 3178 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/nouveau/attachments/20100209/5aafbd93/attachment-0001.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: early-fallback
Type: application/octet-stream
Size: 16688 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/nouveau/attachments/20100209/5aafbd93/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: late-fallback
Type: application/octet-stream
Size: 16725 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/nouveau/attachments/20100209/5aafbd93/attachment-0003.obj 
