[Mesa-dev] Gallium proposal: add a user pointer in pipe_resource

Tue Feb 1 16:28:07 PST 2011

On 01.02.2011 18:55, Keith Whitwell wrote:
>> In theory, doing user buffer uploads at the state tracker side using
>> inline transfers might work and should remove some burden from
>> drivers. 
> This would be an alternate approach -- the state-tracker could itself
> figure out min/max_index, and upload that data into a real hardware
> buffer -- basically the same task that the driver is doing in both
> examples above.
>> In practice, inline transfers may have a very large overhead compared
>> to how things work now. In transfer_inline_write, you usually want to
>> map the buffer, do a memcpy, and unmap it. The map/unmap overhead can
>> be really significant. There are applications that use glDrawElements
>> to draw one triangle at a time, and they draw hundreds of triangles
>> with user buffers in this way (yes, applications really do this). We
>> can't afford doing any more work than is absolutely necessary. When
>> you get 10000 or more draw_vbo calls per second, everything matters.
>>
>> Currently, the radeon drivers have one upload buffer for vertices and
>> it stays mapped until the command stream is flushed. When they get a
>> user buffer, they do one memcpy and that's all. They don't touch
>> winsys unless the upload buffer is full.
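
A minimal sketch of the upload path described above, assuming a single buffer that stays mapped until flush. All names here (upload_buf, upload_init, upload_data, winsys_calls) are hypothetical and are not the radeon driver's code; alignment and real command-stream flushing are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upload buffer that stays mapped between draws. */
struct upload_buf {
    uint8_t *map;          /* CPU pointer, valid until the next flush */
    size_t size;           /* total capacity in bytes */
    size_t offset;         /* next free byte */
    unsigned winsys_calls; /* counts simulated winsys round trips */
};

static void upload_init(struct upload_buf *u, uint8_t *storage, size_t size)
{
    u->map = storage;
    u->size = size;
    u->offset = 0;
    u->winsys_calls = 1;   /* the one initial map */
}

/* Copy user data into the mapped buffer and return its offset.
 * Only when the buffer is full do we simulate a flush and a fresh
 * buffer, which is the only point where the winsys is touched. */
static size_t upload_data(struct upload_buf *u, const void *user_ptr, size_t len)
{
    size_t at;

    assert(len <= u->size);
    if (u->offset + len > u->size) {
        u->winsys_calls++; /* assumed cost: flush + map a new buffer */
        u->offset = 0;
    }
    at = u->offset;
    memcpy(u->map + at, user_ptr, len); /* the single memcpy per user buffer */
    u->offset += len;
    return at;
}
```

In this sketch, each call to upload_data costs one memcpy, and winsys_calls only increases when an upload no longer fits.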
> So the optimization we're really talking about here is saving the
> map/unmap overhead on the upload buffer?
>
> And if the state tracker could do the uploads without incurring the
> map/unmap overhead, would that be sufficient for you to feel comfortable
> moving this functionality up a level?
In practice there are situations where (max_index - min_index) is much
larger than vertex_count, and in those cases it is far more practical to
pull the few referenced vertices directly from user memory and put them
into my command stream.
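
To illustrate the difference, here is a rough sketch comparing the two approaches. The helper names (range_upload_size, gather_vertices) and the 16-bit index type are assumptions made for the example, not Gallium API; a real driver would write into its command stream rather than a plain byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: bytes copied when uploading the whole
 * [min_index, max_index] range into a hardware buffer. */
static size_t range_upload_size(uint32_t min_index, uint32_t max_index,
                                size_t stride)
{
    assert(max_index >= min_index);
    return ((size_t)(max_index - min_index) + 1) * stride;
}

/* Hypothetical: copy only the vertices the index list references,
 * straight from user memory into dst. Returns bytes written. */
static size_t gather_vertices(uint8_t *dst, const uint8_t *user_ptr,
                              size_t stride, size_t vertex_size,
                              const uint16_t *indices, size_t count)
{
    size_t written = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        memcpy(dst + written,
               user_ptr + (size_t)indices[i] * stride,
               vertex_size);
        written += vertex_size;
    }
    return written;
}
```

For a single triangle with indices {0, 500, 1000} and 12-byte vertices, the range upload copies 1001 * 12 = 12012 bytes, while the gather copies 3 * 12 = 36 bytes.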

Constant vertex attributes use stride 0 user buffers, and since they
have a specialized path, I do not want real buffer allocations for them.
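
As a small illustration of why stride 0 means "constant": the fetch address no longer depends on the vertex index, so the value can be read once from user memory. The helpers below (attrib_address, attrib_is_constant) are hypothetical names for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of an attribute for a given vertex index. */
static const void *attrib_address(const void *base, size_t stride,
                                  uint32_t index)
{
    return (const uint8_t *)base + (size_t)index * stride;
}

/* With stride 0 every index resolves to the same address, so a
 * driver can treat the attribute as a constant instead of
 * allocating a buffer for it. */
static int attrib_is_constant(size_t stride)
{
    return stride == 0;
}
```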

GL uniforms / constants also get uploaded with a dedicated method and,
in the case of nvfx, it's shader-dependent. Removing user buffers would
add back the memcpy that was recently removed.

Sorry, I don't trust the state tracker to become smart enough to do the
best thing for my hardware.