No subject

Mon Jan 31 10:13:25 PST 2011

etc. at the gallium boundary, it&#39;s preferable if all actions are 
interposable - ie. all actions are mediated by a function call of some 
sort into the gallium interface. Giving a component a direct memory<br>
access into buffer contents would tend to defeat that and make 
record/replay of that action difficult. </blockquote><div> Indeed, record/replay would be difficult but not impossible. FWIW I think the interface shouldn&#39;t be specifically designed for record/replay. Instead, record/replay should be made to work with whatever interface there is.

</div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
 
Is it possible to get a description of what you&#39;re doing at a slightly<br>
higher level to try and understand if there&#39;s a solution without these<br>
drawbacks? </blockquote></div> 
I am quite content with the current interface for user buffers, but it 
will need a few changes to be more efficient. Below is my plan
 for further improving Gallium and its interactions with user buffers in
 general.

1) New vertex buffer manager 
 
This is how I&#39;d like to move the 
burden of uploading user buffers out of the drivers. I&#39;ve made a new 
vertex buffer manager. It can be found here: 
<a href="http://cgit.freedesktop.org/~mareko/mesa/commit/?h=vbuf-mgr&amp;id=94a53b672dd238e6a50bb6b252614dc2e9f30ddf">http://cgit.freedesktop.org/~mareko/mesa/commit/?h=vbuf-mgr&amp;id=94a53b672dd238e6a50bb6b252614dc2e9f30ddf</a>

<br>
And the corresponding branch is here:<br>
<a href="http://cgit.freedesktop.org/~mareko/mesa/log/?h=vbuf-mgr">http://cgit.freedesktop.org/~mareko/mesa/log/?h=vbuf-mgr</a><br>
<br>It&#39;s a module that drivers can use and it does 2 things:<br>
- uploads user buffers<br>
- takes care of converting unsupported vertex formats and unaligned 
vertex layouts to supported ones (there are vertex fetcher capability 
bits, see struct u_vbuf_caps)<br>
<br>
Besides some typos in a few commits, this work is already done.<br>
<br>
With this manager, the drivers don&#39;t have to deal with user buffers when 
they are bound as vertex buffers. They only get real hardware buffers. 
The drivers don&#39;t even have to deal with unsupported vertex formats. 
Moreover, this code already proved to be fast while it was maturing 
in r300g and is specifically optimized for low overhead. Its integration
 into a driver is easy and straightforward. But I need the 
pipe_resource::user_ptr variable to be able to access the user buffer 
memory in util (efficiently).<br>
<br>
<br>

2) Optimizing vertex array state changes in st/mesa 
 
Because we 
don&#39;t need to know the sizes of user vertex buffers (as I said, we can 
use pipe_draw_info::min_index and max_index instead, as is done in the 
vertex buffer manager), we can remove the calculation of the buffer 
sizes from st_draw_vbo. We can also remove pipe_vertex_buffer::max_index
 (again, the information provided by pipe_draw_info is sufficient) and 
the related code from st/mesa. Not only will this simplify st_draw_vbo, 
it will also allow us to bind vertex buffers and a vertex elements state only
 when needed, i.e. when either the _NEW_ARRAY or _NEW_PROGRAM flag is set. 
It makes this use case a lot faster:

<br>

for (i = 0; i &lt; 1000; i++) glDrawElements(...);<br>

<br>
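To illustrate why the loop above benefits, here is a minimal sketch of binding vertex state only when a relevant dirty flag is set. It is not the st/mesa code: the SKETCH_* flag values, struct sketch_context and sketch_draw are hypothetical stand-ins, and the counter merely stands in for the real work of binding vertex buffers and a vertex elements state.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only. The real _NEW_ARRAY and
 * _NEW_PROGRAM bits are defined by Mesa and are not reproduced here. */
#define SKETCH_NEW_ARRAY   (1u << 0)
#define SKETCH_NEW_PROGRAM (1u << 1)
#define SKETCH_NEW_OTHER   (1u << 2)

/* Hypothetical, simplified context: just the dirty bits and two counters. */
struct sketch_context {
    unsigned dirty;        /* accumulated state-change flags */
    unsigned vertex_binds; /* how often vertex state was (re)bound */
    unsigned draws;        /* how many draws were issued */
};

/* Issue one draw. Vertex state is rebound only if the array or program
 * state changed since the previous draw. */
static void sketch_draw(struct sketch_context *ctx)
{
    const unsigned relevant = SKETCH_NEW_ARRAY | SKETCH_NEW_PROGRAM;

    if (ctx->dirty & relevant) {
        /* Stands in for binding vertex buffers and vertex elements. */
        ctx->vertex_binds++;
        ctx->dirty &= ~relevant;
    }
    ctx->draws++;
}

/* Issue 'count' back-to-back draws without any state change in between. */
static void sketch_draw_loop(struct sketch_context *ctx, unsigned count)
{
    unsigned i;
    for (i = 0; i < count; i++)
        sketch_draw(ctx);
}
```

With this structure, 1000 consecutive draws after a single array change cost one bind rather than 1000, which is the effect the plan above is after.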

This work is *almost* complete and can be found here:<br>

<a href="http://cgit.freedesktop.org/~mareko/mesa/log/?h=gallium-varrays-optim">http://cgit.freedesktop.org/~mareko/mesa/log/?h=gallium-varrays-optim</a><br>

<br>

The framerate in the Torcs game goes from 14 to 20 fps with this code (in one particular scenario I was optimizing for).<br>

<br>

Since I no longer compute the sizes of user buffers, I have to pass ~0 as 
the &#39;size&#39; parameter of user_buffer_create, and I expect drivers not to 
use it. r300g, r600g, nv50, nvc0, softpipe, and llvmpipe should be ready
 for this. I&#39;m not sure about the other drivers, but it&#39;s fixable.
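As a rough sketch of why the buffer size is not needed, the byte range to upload can be derived from the draw&#39;s min_index and max_index alone. The code below is not taken from the vertex buffer manager: struct sketch_vertex_buffer, struct sketch_range, sketch_upload_range and sketch_upload are hypothetical simplifications, and the plain memcpy destination stands in for a mapped hardware buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in; not the real Gallium struct. */
struct sketch_vertex_buffer {
    const uint8_t *user_ptr; /* start of the application's vertex data */
    size_t buffer_offset;    /* byte offset of vertex 0 within user_ptr */
    size_t stride;           /* bytes between consecutive vertices */
};

struct sketch_range {
    size_t first; /* first byte to upload, relative to user_ptr */
    size_t size;  /* number of bytes to upload */
};

/* Compute the bytes touched by vertices [min_index, max_index].
 * vertex_size is the size of one vertex's attributes: the last vertex
 * needs only vertex_size bytes, not a full stride. */
static struct sketch_range
sketch_upload_range(const struct sketch_vertex_buffer *vb,
                    unsigned min_index, unsigned max_index,
                    size_t vertex_size)
{
    struct sketch_range r;

    assert(max_index >= min_index);
    r.first = vb->buffer_offset + vb->stride * min_index;
    r.size = vb->stride * (max_index - min_index) + vertex_size;
    return r;
}

/* Copy only the referenced range into dst. Returns the number of bytes
 * copied, or 0 if dst is too small. The total size of the user buffer is
 * never consulted. */
static size_t
sketch_upload(const struct sketch_vertex_buffer *vb,
              unsigned min_index, unsigned max_index,
              size_t vertex_size, uint8_t *dst, size_t dst_capacity)
{
    struct sketch_range r =
        sketch_upload_range(vb, min_index, max_index, vertex_size);

    if (r.size > dst_capacity)
        return 0;
    memcpy(dst, vb->user_ptr + r.first, r.size);
    return r.size;
}
```

Since only the index range and the vertex layout enter the computation, a caller that passes ~0 as the size loses nothing, provided the driver follows this pattern.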

<br>

--<br>

So this is my current plan to simplify hardware drivers a bit and 
add some nice optimizations. Another option would be to move the new 
vertex buffer manager or something equivalent to the state tracker and remove user buffers from 
the Gallium interface, but that would be additional work with uncertain performance 
implications, so I decided not to take this path (for now).

<br>Best regards<br>Marek<br>
