[Nouveau] Constbuf uploads on G80 and GF100+

Mon Jun 29 14:26:56 PDT 2015

Hello,

It seems that NVIDIA GPUs, at least starting with G80, have an
optimized path for the sequence

draw;
update consts;
draw;
update consts;
etc

whereby the GPU will start processing draw2 before draw1 is done. To
do this, it appears there's some magic constbuf cache on the chip which
buffers the updates for the right draw, eventually writing them all
out to memory as if everything had been done serially.

In order to make it into this magic constbuf cache, there are special
constbuf upload entrypoints. On GF100 they are method 0x2390 and the
associated methods that come right before it.

However, in order for it to all work out as one might hope, the CB
settings that were in place when the CB was bound (via method 0x2410)
have to match the ones used for upload, specifically the address. So
if you have a CB at address 0x1000 of size 0x1000, and you decide to
update its data at offset 0x800, it appears that you have to use that
same initial 0x1000 as the base and 0x800 as the offset. If you use an
address of 0x1800 instead, it won't notice that the CB is bound.
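
As a rough sketch of the bookkeeping this implies on the driver side
(the struct and function names are made up for illustration, they are
not actual nouveau code, and nothing here touches the hardware
methods), the upload path would translate an absolute address back
into the base the CB was bound with plus an offset:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of how a constbuf was bound; not a real nouveau struct. */
struct cb_binding {
    uint64_t base;  /* address supplied when the CB was bound */
    uint32_t size;  /* size supplied when the CB was bound */
};

/*
 * Express an update of `len` bytes at absolute address `addr` as
 * (bound base, offset). Returns false if the range does not lie
 * entirely inside the binding.
 */
static bool cb_upload_target(const struct cb_binding *cb,
                             uint64_t addr, uint32_t len,
                             uint64_t *base, uint32_t *offset)
{
    assert(cb && base && offset);

    if (addr < cb->base || addr - cb->base > cb->size)
        return false;
    if (len > cb->size - (addr - cb->base))
        return false;

    *base = cb->base;                       /* same address as at bind time */
    *offset = (uint32_t)(addr - cb->base);  /* position within the CB */
    return true;
}
```

With the numbers above, a binding of { 0x1000, 0x1000 } and an update
at absolute address 0x1800 comes out as base 0x1000, offset 0x800.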

This is easy enough to handle. But what do you do when some genius
wants to have two overlapping buffers, and updates the overlapping
area? For example:

glBindBuffer(GL_UNIFORM_BUFFER, buf);
glBufferData(GL_UNIFORM_BUFFER, 0x1000, NULL, GL_DYNAMIC_DRAW);
glBindBufferRange(GL_UNIFORM_BUFFER, 1, buf, 0, 0x200);
glBindBufferRange(GL_UNIFORM_BUFFER, 2, buf, 0x100, 0x200);

and then tries to map and write a range that covers the overlap, say
bytes 0x100 through 0x1000 of buf via glMapBufferRange, or something.
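
To make the overlap concrete, here is a small sketch of how a driver
might work out which binding points a written range touches. The
names and the 32-binding limit are assumptions for illustration only.
It just identifies the candidates and says nothing about which one the
hardware wants the upload to go through.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-binding-point range within one buffer object. */
struct ubo_range {
    uint32_t offset;
    uint32_t size;   /* 0 means nothing is bound at this point */
};

/*
 * Return a bitmask of binding points whose range intersects the
 * written range [wr_offset, wr_offset + wr_size).
 */
static uint32_t overlapping_bindings(const struct ubo_range *b,
                                     unsigned count,
                                     uint32_t wr_offset, uint32_t wr_size)
{
    uint32_t mask = 0;

    assert(count <= 32);  /* mask is 32 bits wide */

    for (unsigned i = 0; i < count; i++) {
        if (b[i].size == 0)
            continue;
        /* Half-open intervals overlap iff each starts before the other ends. */
        if (b[i].offset < (uint64_t)wr_offset + wr_size &&
            wr_offset < (uint64_t)b[i].offset + b[i].size)
            mask |= 1u << i;
    }
    return mask;
}
```

For the two bindings above, a write covering 0x100 through 0x1000
yields a mask with both bits 1 and 2 set, which is exactly the
ambiguous case.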

Is there a way to handle this while playing nice with the CB update
buffer mechanism, or do you have to give up and do a serialize (method
0x110) followed by a memory barrier (0x21c)? Or can you just pick
whichever binding you like, as long as at least one of them was bound?

Also, on G80-era GPUs the constbuf upload process is a bit different:
it wants the uploads to go to a specific binding point. How should the
overlapping situation be handled there?

Thanks for any info on this!

  -ilia