[Mesa-dev] draw: Replace varray and vcache by vsplit

Fri Aug 13 08:24:02 PDT 2010

On Fri, Aug 13, 2010 at 11:09 PM, Chia-I Wu <olvaffe at gmail.com> wrote:
> On Fri, Aug 13, 2010 at 10:51 PM, Keith Whitwell <keithw at vmware.com> wrote:
>> On Fri, 2010-08-13 at 07:46 -0700, Chia-I Wu wrote:
>>> On Fri, Aug 13, 2010 at 10:14 PM, Keith Whitwell <keithw at vmware.com> wrote:
>>> > On Fri, 2010-08-13 at 07:04 -0700, Chia-I Wu wrote:
>>> >> Hi,
>>> >>
>>> >> There are two primitive transformations in gallium draw module.  In
>>> >> varray, primitives are split.  When a primitive has more vertices
>>> >> than the middle end can handle, varray splits the primitive and calls
>>> >> the middle end multiple times.
>>> >>
>>> >> In vcache, primitives are "decompose"d.  More advanced primitives are
>>> >> decomposed into one of point, line(_adj), or triangle(_adj).
>>> >> Similarly, vcache may call the middle end multiple times to flush its
>>> >> internal buffer.  In some cases, vcache passes the primitives through
>>> >> without decomposing or splitting, as can be seen in vcache_check_run.
>>> >>
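To illustrate the difference between the two transformations, here is a
minimal sketch in Python (not the draw module's C).  The function names
are hypothetical and the vertex orderings are an assumption about one
reasonable convention, not a copy of what vcache or varray do.
Decomposing a triangle strip yields independent triangles whose vertex
order depends on the provoking convention; splitting keeps the strip and
only cuts it into overlapping segments.

```python
def decompose_tristrip(n, pv_first):
    """Turn a strip of n vertices into independent triangles.

    Odd triangles must be reordered to keep the winding, and the
    reordering differs depending on which vertex is the provoking one.
    """
    tris = []
    for i in range(n - 2):
        if i % 2 == 0:
            tris.append((i, i + 1, i + 2))
        elif pv_first:
            # keep vertex i first so it stays the provoking vertex
            tris.append((i, i + 2, i + 1))
        else:
            # keep vertex i + 2 last so it stays the provoking vertex
            tris.append((i + 1, i, i + 2))
    return tris


def split_tristrip(n, max_verts):
    """Cut a strip of n vertices into (start, count) segments.

    Each segment is still a triangle strip.  Consecutive segments share
    two vertices, and the step between segments is kept even so every
    segment starts on an even triangle and the winding is preserved.
    """
    seg_max = max_verts - (max_verts % 2)
    if seg_max < 4:
        raise ValueError("max_verts too small to split a triangle strip")
    segments = []
    start = 0
    while n - start > seg_max:
        segments.append((start, seg_max))
        start += seg_max - 2
    segments.append((start, n - start))
    return segments


print(decompose_tristrip(5, pv_first=False))
print(decompose_tristrip(5, pv_first=True))
print(split_tristrip(10, 6))
```

The decomposed output differs between the two conventions, while the
split output does not depend on the convention at all.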
>>> >> The issue with vcache is that it has to decompose a primitive
>>> >> differently depending on the provoking convention, as explained in
>>> >>
>>> >>   http://lists.freedesktop.org/archives/mesa-dev/2010-August/001797.html
>>> >>
>>> >> It becomes a problem when GS is active.
>>> >>
>>> >> My proposal is to make vcache split instead of decompose.  Because
>>> >> varray only splits and vcache has a pass-through path, the rest of the
>>> >> workflow already has to support all primitive types.  Switching from
>>> >> decompose to split does not require a big change to the rest of the
>>> >> workflow.
>>> >>
>>> >> But then vcache will look a lot like varray, only with indexed
>>> >> primitive support.  It leads me to a new frontend that replaces both
>>> >> varray and vcache: vsplit
>>> >>
>>> >>  http://cgit.freedesktop.org/~olv/mesa/log/?h=draw-vsplit
>>> >>
>>> >> vsplit is based on varray.  It uses some code from vcache to support
>>> >> indexed primitives.  When vcache decomposes, flags are set to
>>> >> indicate whether the stipple counter should be reset or whether
>>> >> some edge of a triangle should be omitted in unfilled mode.  The
>>> >> segments of a split primitive have flags for similar purposes too:
>>> >>
>>> >>   DRAW_SPLIT_AFTER   More segments to come after this one
>>> >>   DRAW_SPLIT_BEFORE  There are preceding segments
>>> >>
>>> >> These flags are set by vsplit and the middle ends pass them to the
>>> >> other stages.  Therefore, the run methods of middle ends are augmented
>>> >> to take the flags.
>>> >>
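A rough sketch of how the two flags could be attached to the segments of
a split line strip, in Python for brevity.  The numeric flag values, the
function name, and the (start, count, flags) tuple layout are assumptions
made for this example; only the flag names and their meaning come from
the description above.

```python
# Flag values are assumed for illustration.
DRAW_SPLIT_BEFORE = 0x1  # there are preceding segments
DRAW_SPLIT_AFTER = 0x2   # more segments to come after this one


def split_line_strip(count, max_verts):
    """Split a line strip of `count` vertices into flagged segments.

    Returns a list of (start, seg_count, flags).  Consecutive segments
    share one vertex so that no line is lost at a segment boundary.
    """
    if max_verts < 2:
        raise ValueError("a line strip segment needs at least 2 vertices")
    segments = []
    start = 0
    while True:
        remaining = count - start
        seg_count = min(remaining, max_verts)
        flags = 0
        if start > 0:
            flags |= DRAW_SPLIT_BEFORE
        if remaining > max_verts:
            flags |= DRAW_SPLIT_AFTER
        segments.append((start, seg_count, flags))
        if remaining <= max_verts:
            break
        start += seg_count - 1  # reuse the last vertex of this segment
    return segments


for seg in split_line_strip(10, 4):
    print(seg)
```

A later stage that sees DRAW_SPLIT_BEFORE on a segment knows it is
continuing a primitive, so it could, for example, leave the stipple
counter alone instead of resetting it.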
>>> >> To summarize, vsplit
>>> >>
>>> >>  - fixes GS when (flatshade && flatshade_first) is on
>>> >>  - never sends more vertices than the middle end claims to handle
>>> >>  - is faster than vcache: split instead of decompose, no get_elt
>>> >>    calls
>>> >>  - no longer uses the higher bits of draw_elts for stipple/edge flags
>>> >>
>>> >> Suggestions?
>>> >
>>> >
>>> > Hi - I haven't looked at the patches yet, but a couple of questions:
>>> >
>>> > How does this interact with the draw_pipe_* code - which requires
>>> > decomposed primitives?
>>> draw_pipe.c decomposes the primitives.  That code was already there
>>> because it has to support varray and vcache_check_run, which do not
>>> decompose.
>>
>> OK.
>>
>>> > How does this cope with indexed rendering where the vertex buffers
>>> > themselves are too large (for hardware or some other entity)?  Eg.
>>> > imagine the hardware could cope with up to 64k vertices, and you have a
>>> > drawelements call randomly referencing vertices in range 0..128k ?
>>> Vertex fetching happens in the middle end so the range of the indices
>>> is not a problem.  vsplit does guarantee, though, that it never calls
>>> the middle end with more vertices than the middle end claims to
>>> support (as returned by draw_pt_middle_end::prepare).  The limit is
>>> usually decided by the size of the buffer for vertex emitting.
>>
>> I guess I'm wondering how it does this.  If the middle end says it
>> supports 64k vertices, and the vertex element looks like
>>
>>  [0, 128k, 64k, 32k, 96k, 16k, 1, ... ]
>>
>> what gets sent?  (Sorry, I still haven't looked at the code, you could
>> well have addressed this).
> I see.  The frontend would set
>
>   fetch_elts = [0, 128k, 64k, 32k, 96k, 16k, 1, ... ]
>   draw_elts = [0, 1, 2, 3, 4, 5, 6, ...]
>
> fetch_elts is processed by the middle end and it will fetch the given
> vertices.  draw_elts will be passed to draw_emit or the pipeline.  It
> is the new index buffer, which indexes into the fetched vertices.
>
> It is actually the same as vcache.  So when fetch_elts is
Should be:  So when the index buffer looks like
>   [0, 128k, 64k, 64k, 128k, 16k, ...],

fetch_elts would be set to

  [0, 128k, 64k, 16k, ...] and
> draw_elts would be set to
>
>   [0, 1, 2, 2, 1, 3, ...]
>
> The number of elements to fetch (and shade) is minimized.
>
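The remapping in the example above can be sketched as follows, in Python
for brevity.  The function name is hypothetical, and the unbounded
dictionary is a simplification: a real front end has to work with a
bounded internal buffer and flush it when full, which this sketch does
not model.

```python
def build_fetch_and_draw_elts(indices):
    """Remap an index buffer into (fetch_elts, draw_elts).

    fetch_elts lists each referenced vertex once, in order of first use.
    draw_elts is the new index buffer into the fetched vertices.
    """
    fetch_elts = []
    draw_elts = []
    slot_of = {}  # original index -> position in fetch_elts
    for idx in indices:
        if idx not in slot_of:
            slot_of[idx] = len(fetch_elts)
            fetch_elts.append(idx)
        draw_elts.append(slot_of[idx])
    return fetch_elts, draw_elts


K = 1024
fetch_elts, draw_elts = build_fetch_and_draw_elts(
    [0, 128 * K, 64 * K, 64 * K, 128 * K, 16 * K])
print(fetch_elts)
print(draw_elts)
```

Only four vertices are fetched for the six indices, and every value in
draw_elts stays below len(fetch_elts) no matter how large the original
indices are.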
> --
> olv at LunarG.com
>


-- 
olv at LunarG.com