[Mesa-dev] draw: Replace varray and vcache by vsplit
olvaffe at gmail.com
Fri Aug 13 08:09:22 PDT 2010
On Fri, Aug 13, 2010 at 10:51 PM, Keith Whitwell <keithw at vmware.com> wrote:
> On Fri, 2010-08-13 at 07:46 -0700, Chia-I Wu wrote:
>> On Fri, Aug 13, 2010 at 10:14 PM, Keith Whitwell <keithw at vmware.com> wrote:
>> > On Fri, 2010-08-13 at 07:04 -0700, Chia-I Wu wrote:
>> >> Hi,
>> >> There are two primitive transformations in the gallium draw module. In
>> >> varray, primitives are "split". When a primitive has more vertices
>> >> than the middle end can handle, varray splits the primitive and calls
>> >> the middle end multiple times.
>> >> In vcache, primitives are "decomposed". More advanced primitives are
>> >> decomposed into one of point, line(_adj), or triangle(_adj).
>> >> Similarly, vcache may call the middle end multiple times to flush its
>> >> internal buffer. In some cases, vcache passes the primitives through
>> >> without decomposing or splitting, as can be seen in vcache_check_run.
>> >> The issue with vcache is that it has to decompose a primitive
>> >> differently depending on the provoking convention, as explained in
>> >> http://lists.freedesktop.org/archives/mesa-dev/2010-August/001797.html
>> >> It becomes a problem when GS is active.
>> >> My proposal is to make vcache split instead of decompose. Because
>> >> varray only splits and vcache has a pass-through path, the rest of the
>> >> workflow already has to support all primitive types. Switching from
>> >> decompose to split does not require a big change to the rest of the
>> >> workflow.
>> >> But then vcache will look a lot like varray, only with indexed
>> >> primitive support. It leads me to a new frontend that replaces both
>> >> varray and vcache: vsplit
>> >> http://cgit.freedesktop.org/~olv/mesa/log/?h=draw-vsplit
>> >> vsplit is based on varray. It uses some code from vcache to support
>> >> indexed primitives. When vcache decomposes, flags are set to
>> >> indicate whether the stipple counter should be reset or whether some
>> >> edge of a triangle should be omitted in unfilled mode. The segments
>> >> of a split primitive have flags for similar purposes too:
>> >>   DRAW_SPLIT_AFTER    More segments to come after this one
>> >>   DRAW_SPLIT_BEFORE   There are preceding segments
>> >> These flags are set by vsplit and the middle ends pass them to the
>> >> other stages. Therefore, the run methods of middle ends are augmented
>> >> to take the flags.
>> >> To summarize, vsplit
>> >> - fixes GS when (flatshade && flatshade_first) is on
>> >> - never sends more vertices than the middle end claims to handle
>> >> - is faster than vcache: split instead of decompose, no get_elt
>> >> calls
>> >> - no longer uses the higher bits of draw_elts for stipple/edge flags
>> >> Suggestions?
>> > Hi - I haven't looked at the patches yet, but a couple of questions:
>> > How does this interact with the draw_pipe_* code - which requires
>> > decomposed primitives?
>> draw_pipe.c decomposes the primitives. That code was already there because
>> it has to support varray and vcache_check_run, which do not decompose.
>> > How does this cope with indexed rendering where the vertex buffers
>> > themselves are too large (for hardware or some other entity)? Eg.
>> > imagine the hardware could cope with up to 64k vertices, and you have a
>> > drawelements call randomly referencing vertices in range 0..128k ?
>> Vertex fetching happens in the middle end, so the range of the indices
>> is not a problem. vsplit does guarantee, though, that it never calls the
>> middle end with more vertices than the middle end claims to support
>> (as returned by draw_pt_middle_end::prepare). The limit is usually
>> decided by the size of the buffer for vertex emitting.
> I guess I'm wondering how it does this. If the middle end says it
> supports 64k vertices, and the vertex element looks like
> [0, 128k, 64k, 32k, 96k, 16k, 1, ... ]
> what gets sent? (Sorry, I still haven't looked at the code, you could
> well have addressed this).
I see. The frontend would set
fetch_elts = [0, 128k, 64k, 32k, 96k, 16k, 1, ... ]
draw_elts = [0, 1, 2, 3, 4, 5, 6, ...]
fetch_elts is processed by the middle end and it will fetch the given
vertices. draw_elts will be passed to draw_emit or the pipeline. It
is the new index buffer, which indexes into the fetched vertices.
It is actually the same as vcache. So when the incoming indices are
[0, 128k, 64k, 64k, 128k, 16k, ...],
draw_elts would be set to
[0, 1, 2, 2, 1, 3, ...]
The number of elements to fetch (and shade) is minimized.
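To make the remapping concrete, here is a rough sketch of the idea in C. It
is not the code in the branch: remap_elts and its parameters are names made
up for this example, it uses a linear search where the real frontend would
use a cache, and it ignores primitive boundaries when the fetch buffer fills
up. It only shows how repeated indices collapse into one fetch while
draw_elts points at the fetched slot.

```c
#include <stddef.h>

/*
 * Hypothetical helper (illustration only, not part of Mesa).
 *
 * Walks the incoming indices and builds:
 *   fetch_elts - the distinct vertex indices the middle end should fetch
 *   draw_elts  - for each incoming index, its slot in fetch_elts
 *
 * Assumes draw_elts has room for `count` entries and that max_fetch
 * fits in an unsigned short.
 *
 * Returns the number of incoming indices consumed.  It stops early when
 * a new vertex would exceed max_fetch, the limit the middle end is
 * assumed to have reported.
 */
static size_t
remap_elts(const unsigned *elts, size_t count,
           unsigned *fetch_elts, size_t max_fetch,
           unsigned short *draw_elts, size_t *fetch_count)
{
   size_t nr_fetch = 0;
   size_t i, j;

   for (i = 0; i < count; i++) {
      /* Look for a vertex that is already scheduled for fetching. */
      for (j = 0; j < nr_fetch; j++) {
         if (fetch_elts[j] == elts[i])
            break;
      }

      if (j == nr_fetch) {
         /* New vertex: stop if the fetch buffer is already full. */
         if (nr_fetch == max_fetch)
            break;
         fetch_elts[nr_fetch++] = elts[i];
      }

      /* The new index refers to the slot of the fetched vertex. */
      draw_elts[i] = (unsigned short) j;
   }

   *fetch_count = nr_fetch;
   return i;
}
```

With the indices from the second example and a large enough max_fetch, this
fetches four vertices (0, 128k, 64k, 16k) and yields the draw_elts shown
above. When the function returns fewer than count, the caller would have to
flush the segment and continue with the remaining indices; doing that on a
valid primitive boundary is what the frontend is responsible for and is left
out here.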
olv at LunarG.com