[Mesa-dev] Gallium proposal: add a user pointer in pipe_resource

Mon Jan 31 13:09:33 PST 2011

On 31.01.2011 21:17, José Fonseca wrote:
> On Mon, 2011-01-31 at 11:48 -0800, Christoph Bumiller wrote:
>> On 31.01.2011 19:46, Marek Olšák wrote:
>>> With this manager, the drivers don't have to deal with user buffers when they are bound as vertex buffers. They only get real hardware buffers.
>> Please do *not* take away my user buffers and put user vertex arrays at the mercy of a state tracker!
>> In the DrawArrays case I usually use util/translate to interleave them, letting it write directly into my command buffer for immediate mode vertex data submission.
> Christoph,
>
> Is there any reason for not wanting to do the same optimization for
> non-user buffers?
>
For non-user buffers this is not an optimization.
Immediate mode (sending vertices through the command buffer) bypasses
the vertex cache, which is perfectly fine for user buffers, which I
cannot cache anyway since they might change at any time.
Also, non-user buffers are already accessible by the GPU, so VFETCH can
go right ahead and do all the work.

> If the buffers are small and used only once, wouldn't you still want to
> write them directly into the command buffer?
>
As I said, they're already in GPU-accessible system memory or even VRAM,
so there is no reason to have the CPU move the data around before the
GPU reads it.
Unless you're suggesting lazy transfers / "real" buffer allocations?

> Because eliminating user buffers does not imply eliminating these
> optimization opportunities -- the driver can still know how big a buffer
> is, and the state tracker can set a flag such as PIPE_USAGE_ONCE to help
> the pipe driver figure out this is a fire and forget buffer. Perhaps we
>
The case where user buffers + immediate mode are a real win right now is
when the application asks you to pull e.g. vertices 0, 16, 25, and 8999
from the user memory.
You do not know at transfer time that it will do that, and if you copy
everything from min index to max index into your scratch buffer, you
copy 8996 vertices too many instead of just extracting these 4 directly
from the source and putting them into the command buffer.
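
To make the difference concrete, here is a rough sketch in plain C. It
is not actual driver or util/translate code: copy_index_range() and
gather_indexed() are hypothetical helpers, and a fixed vertex stride in
bytes is assumed. The first one copies the whole min..max range into a
scratch buffer, the second pulls only the referenced vertices straight
into the destination (think: command buffer).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy every vertex in [min_index, max_index] from
 * user memory into a scratch buffer. Returns the number of vertices
 * copied, whether or not the draw call references them. */
static size_t copy_index_range(uint8_t *dst, const uint8_t *user,
                               size_t stride,
                               uint32_t min_index, uint32_t max_index)
{
    size_t count = (size_t)max_index - min_index + 1;

    memcpy(dst, user + (size_t)min_index * stride, count * stride);
    return count;
}

/* Hypothetical helper: copy only the vertices named by the index list,
 * packing them back to back in dst. Returns the number of vertices
 * copied, which equals num_indices. */
static size_t gather_indexed(uint8_t *dst, const uint8_t *user,
                             size_t stride,
                             const uint32_t *indices, size_t num_indices)
{
    for (size_t i = 0; i < num_indices; i++)
        memcpy(dst + i * stride,
               user + (size_t)indices[i] * stride, stride);
    return num_indices;
}
```

With indices { 0, 16, 25, 8999 }, the range copy moves 9000 vertices
while the gather moves 4.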

> can have a PIPE_CAP for distinguishing the drivers that can inline small
> buffers, vs those who can't and prefer them batched up in big vbos.
>
> And let's not forget the user arrays are a deprecated feature of GL.
> Applications will have to create a vbo even if all they want to do is
> draw a textured quad, therefore small vbos are worthwhile to optimize
> regardless.
>
That's true, but it doesn't automatically make all the old OpenGL
applications use VBOs. I still want those to run as fast as possible.
Gallium's task shouldn't be to patronize me wherever possible, even if
it really enjoys doing that from time to time.

> I'm not saying we must get rid of user buffers now, but I can't help
> feeling that it is odd that while recent versions of GL/DX APIs are
> eliminating index/vertex buffers in user memory, Gallium is optimizing
> for them...
>
Gallium is not. The pipe driver is optimizing the case where they are
practical.

Christoph.