[Mesa-dev] [PATCH 1/3] gallium: add basic support for ARB_sample_shading

Tue Apr 1 07:25:42 PDT 2014

Am 01.04.2014 02:13, schrieb Ilia Mirkin:
> On Mon, Mar 31, 2014 at 7:42 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> I've got some ideas but I'm not sure the design would be really better -
>> I'd like to hear some other opinions.
>>
>> Am 31.03.2014 02:52, schrieb Ilia Mirkin:
>>> ---
>>>  src/gallium/auxiliary/tgsi/tgsi_strings.c  |  5 ++++-
>>>  src/gallium/docs/source/context.rst        |  3 +++
>>>  src/gallium/docs/source/screen.rst         |  3 +++
>>>  src/gallium/docs/source/tgsi.rst           | 20 ++++++++++++++++++++
>>>  src/gallium/include/pipe/p_context.h       |  3 +++
>>>  src/gallium/include/pipe/p_defines.h       |  3 ++-
>>>  src/gallium/include/pipe/p_shader_tokens.h |  5 ++++-
>>>  7 files changed, 39 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c b/src/gallium/auxiliary/tgsi/tgsi_strings.c
>>> index b0ba3ef..2be726c 100644
>>> --- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
>>> +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
>>> @@ -81,7 +81,10 @@ const char *tgsi_semantic_names[TGSI_SEMANTIC_COUNT] =
>>>     "PCOORD",
>>>     "VIEWPORT_INDEX",
>>>     "LAYER",
>>> -   "CULLDIST"
>>> +   "CULLDIST",
>>> +   "SAMPLEID",
>>> +   "SAMPLEPOS",
>>> +   "SAMPLEMASK"
>>>  };
>>>
>>>  const char *tgsi_texture_names[TGSI_TEXTURE_COUNT] =
>>> diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
>>> index 8e14522..505995f 100644
>>> --- a/src/gallium/docs/source/context.rst
>>> +++ b/src/gallium/docs/source/context.rst
>>> @@ -67,6 +67,9 @@ objects. They all follow simple, one-method binding calls, e.g.
>>>    which are used as comparison values in stencil test.
>>>  * ``set_blend_color``
>>>  * ``set_sample_mask``
>>> +* ``set_sample_shading`` sets the minimum percentage of samples that must be
>>> +  run. 0 indicates per-fragment shading, while 1 should guarantee full
>>> +  per-sample shading.
>>>  * ``set_clip_state``
>>>  * ``set_polygon_stipple``
>>>  * ``set_scissor_states`` sets the bounds for the scissor test, which culls
>>> diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
>>> index 09f809d..8530eb4 100644
>>> --- a/src/gallium/docs/source/screen.rst
>>> +++ b/src/gallium/docs/source/screen.rst
>>> @@ -191,6 +191,9 @@ The integer capabilities:
>>>  * ``PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT``: Whether
>>>    PIPE_TRANSFER_PERSISTENT and PIPE_TRANSFER_COHERENT are supported
>>>    for buffers.
>>> +* ``PIPE_CAP_SAMPLE_SHADING``: Whether there is support for per-sample
>>> +  shading. The context->set_sample_shading function will be expected to be
>>> +  implemented.
>>>
>>>
>>>  .. _pipe_capf:
>>> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
>>> index e184b2f..fa30d08 100644
>>> --- a/src/gallium/docs/source/tgsi.rst
>>> +++ b/src/gallium/docs/source/tgsi.rst
>>> @@ -2600,6 +2600,26 @@ distances and by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT
>>>  which specifies the maximum number of registers which can be
>>>  annotated with those semantics.
>>>
>>> +TGSI_SEMANTIC_SAMPLEID
>>> +""""""""""""""""""""""
>>> +
>>> +For fragment shaders, this semantic label indicates that a system value
>>> +contains the current sample id (i.e. gl_SampleID). Only the X value is used.
>>> +
>>> +TGSI_SEMANTIC_SAMPLEPOS
>>> +"""""""""""""""""""""""
>>> +
>>> +For fragment shaders, this semantic label indicates that a system value
>>> +contains the current sample's position (i.e. gl_SamplePosition). Only the X
>>> +and Y values are used.
>>> +
>>> +TGSI_SEMANTIC_SAMPLEMASK
>>> +""""""""""""""""""""""""
>>> +
>>> +For fragment shaders, this semantic label indicates that an output contains
>>> +the sample mask used to disable further sample processing
>>> +(i.e. gl_SampleMask). Only the X value is used, up to 32x MS.
>>> +
>>>
>>>  Declaration Interpolate
>>>  ^^^^^^^^^^^^^^^^^^^^^^^
>>> diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
>>> index fe3045a..b753def 100644
>>> --- a/src/gallium/include/pipe/p_context.h
>>> +++ b/src/gallium/include/pipe/p_context.h
>>> @@ -190,6 +190,9 @@ struct pipe_context {
>>>     void (*set_sample_mask)( struct pipe_context *,
>>>                              unsigned sample_mask );
>>>
>>> +   void (*set_sample_shading)( struct pipe_context *,
>>> +                               float samples );
>> I don't like this being called "samples" since you're saying it's a
>> value between 0.0 and 1.0. Something like shade_frequency or some such
>> would be more appropriate imho. That said, since we're actually
>> translating float sample coverage values away should probably do the
>> same here too and just translate to a value between 0 (or 1 if you want)
>> and the number of samples in the fbo. Your hw certainly doesn't allow to
>> set a percentage number...
> 
> The downside is that this value then depends on fb state. The
> dependency has to come from somewhere, of course, but this way the API
> should get called less often. OTOH it would make my life a little
> easier to just pass the min number of samples in, instead of the
> percentage. (And I wouldn't have to refactor that _mesa_min_fragments
> function.)
Yes, but that's really the same story as with update_sample_mask which
converts coverage percentage to sample po

> 
>> I'm not entirely happy with "wasting" a whole function for a fringe
>> feature but I don't see any good alternatives (it's not exactly
>> rasterizer state, theoretically it's a shader property but in this case
>> you'd have to use a percentage value again since otherwise it's
>> obviously going to be dependent on fbo's number of samples - a shader
>> property though would probably allow this to be merged easier with the
>> code dealing with the "sample" input qualifier which also cause you to
>> run at higher frequency - though I still don't know how these
>> interactions work...).
>>
>>> +
>>>     void (*set_clip_state)( struct pipe_context *,
>>>                              const struct pipe_clip_state * );
>>>
>>> diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
>>> index a220de0..eb3bb98 100644
>>> --- a/src/gallium/include/pipe/p_defines.h
>>> +++ b/src/gallium/include/pipe/p_defines.h
>>> @@ -551,7 +551,8 @@ enum pipe_cap {
>>>     PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS = 89,
>>>     PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS = 90,
>>>     PIPE_CAP_TEXTURE_GATHER_SM5 = 91,
>>> -   PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT = 92
>>> +   PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT = 92,
>>> +   PIPE_CAP_SAMPLE_SHADING = 93
>>>  };
>>>
>>>  #define PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_NV50 (1 << 0)
>>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
>>> index 8fa6a0a..2ce8b96 100644
>>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>>> @@ -167,7 +167,10 @@ struct tgsi_declaration_interp
>>>  #define TGSI_SEMANTIC_VIEWPORT_INDEX 21 /**< viewport index */
>>>  #define TGSI_SEMANTIC_LAYER      22 /**< layer (rendertarget index) */
>>>  #define TGSI_SEMANTIC_CULLDIST   23
>>> -#define TGSI_SEMANTIC_COUNT      24 /**< number of semantic values */
>>> +#define TGSI_SEMANTIC_SAMPLEID   24
>>> +#define TGSI_SEMANTIC_SAMPLEPOS  25
>> Hmm I'm not sure this is the right way to go. Note that d3d10.1+
>> actually has an instruction for getting sample positions (you just feed
>> it the sample index from the sample_id value if you want the current
>> sample position, but of course this also allows you to query OTHER
>> sample positions, something which this does not). From a quick look it
>> seems your nv50 hardware operates that way anyway, so if that's the case
>> for other hw too could just as well use that for gallium.
> 
> nv50 hardware has no way of retrieving the sample position. But we
> know the position that a particular sample id is at, and can look it
> up in a constbuf prepared for just such an occasion. I'm not 100% sure
> if it's the same situation on nvc0+, but from a quick glance at the
> ISA, it seems that way. The sample id is stored in a special register
> on nv50, and is retrieved with a special instruction on nvc0+ (which
> additionally takes some unknown-to-me argument, some resource id of
> some sort it seems?).
> 
>> (There are instructions TGSI_OPCODE_SAMPLEINFO and TGSI_OPCODE_SAMPLEPOS
>> for such purposes already but they are rather unimplemented. I'm not
>> sure you could use them easily, since d3d10.1+ actually uses them to
>> query the properties of resources too not just the rasterizer, you
>> either give it a "rasterizer reg" or a resource reg, but resource regs
>> aren't used in GL because you always use sampler regs, and so far
>> rasterizer regs aren't implemented in gallium, so I guess there'd be
>> some work to do, I'm not sure it's the right direction neither.)
> 
> AFAIK fbo state isn't passed into TGSI shaders at all right now, which
> would have to change for this to happen.
Yeah maybe it's not a very good idea.

> It seems like a larger
> undertaking, potentially outside my abilities without a lot of
> guidance. Also FWIW, gl_NumSamples gets converted to a constbuf lookup
> somewhere upstream of mesa/st (I was a bit surprised by that). Perhaps
> the sample position should be as well? Or is nvidia hardware unusual
> in not having an instruction to look up sample position?
I don't know the latter but I suspect the gl_NumSamples being a constbuf
lookup is a bit a limited solution (since in d3d10 you can query
resources that way, and they can of course have different number of
samples - I'd not be surprised if some hw could get that number directly
neither. Though for d3d10 needs we could always do something completely
different for asking sample position of rasterizer vs. asking sample
position of resources.

> 
>>
>>> +#define TGSI_SEMANTIC_SAMPLEMASK 26
>>> +#define TGSI_SEMANTIC_COUNT      27 /**< number of semantic values */
>>>
>>>  struct tgsi_declaration_semantic
>>>  {
>>>
>>
>> Otherwise seems reasonable
> 
> Great, thanks for your comments!
> 
>   -ilia
>