[Mesa-dev] RFC: ARB_sample_shading in gallium

Fri Mar 28 16:57:08 PDT 2014

On Fri, Mar 28, 2014 at 7:43 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> Am 28.03.2014 23:57, schrieb Ilia Mirkin:
>> On Fri, Mar 28, 2014 at 6:41 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>> Am 28.03.2014 22:56, schrieb Ilia Mirkin:
>>>> On Fri, Mar 28, 2014 at 5:47 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>>>> Am 28.03.2014 22:18, schrieb Ilia Mirkin:
>>>>>> Hey guys,
>>>>>>
>>>>>> I was thinking of taking a shot at implementing ARB_sample_shading for
>>>>>> nv50 (well, nva3-nva8) this weekend. One of the issues is that it's
>>>>>> not implemented in gallium at all right now, so I need to pipe it
>>>>>> through somehow. I believe that the only piece of data that needs to
>>>>>> be piped through is the value returned by
>>>>>> _mesa_get_min_invocations_per_fragment, which is a function of the fp,
>>>>>> the drawbuffer, and the MS state. When that value is > 1, sample
>>>>>> shading is effectively enabled. (I guess even when it's == 1, things
>>>>>> like gl_SampleID still need to work, perhaps it's worth adding a
>>>>>> separate enabled bit too.)
>>>>>>
>>>>>> Should this single integer get its own set_* callback, similar to
>>>>>> set_sample_mask, or should it be included somewhere, e.g.
>>>>>> pipe_framebuffer_state? Or even added to the set_sample_mask call?
>>>>>>
>>>>>
>>>>> Would something like in d3d10.1 work where you simply say that inputs
>>>>> are interpolated at sample frequency? That way you can also have some
>>>>> inputs which are not interpolated at sample frequency (I thought there's
>>>>> opengl functionality for this too somewhere - even if not I'd really
>>>>> like to have that functionality in gallium). It would just need new
>>>>> interpolation mode enums.
>>>>> Though I guess this does not fully cover ARB_sample_shading - this
>>>>> extension allows you for instance to have msaa 4x, but run fs at 2x (I
>>>>> could be wrong but I don't think you can do that in d3d, I don't know if
>>>>> hw can do it; presumably some can, otherwise it wouldn't be in the
>>>>> extension, though it is definitely worded in a way that makes it
>>>>> possible to just run at full sample frequency).
>>>>
>>>> I have 0 familiarity with d3d, but it does indeed seem like part of
>>>> the point of ARB_sample_shading is to run on less than 100% of the
>>>> samples. This appears to be supported by NVA3+ hardware based on our
>>>> current docs in rnndb, although the current piglit tests don't really
>>>> exercise all the functionality. [I haven't checked, but I assume NVC0+
>>>> as well.] Although only 1/2/4/8 are supported, based on those docs
>>>> (e.g. you can't tell it to run on 5 samples).
>>>>
>>>> An alternative to passing in the result of
>>>> _mesa_get_min_invocations_per_fragment is to just pass the percentage
>>>> (which, I guess for D3D10.1 would either be 0 or 100?),
>>> Yes I guess it would be just 0 or 100.
>>>
>>>> and redoing
>>>> the calculation inside of gallium based on the same criteria.
>>>
>>> That would be doable too indeed.
>>> Though indeed OpenGL also allows a "sample" interpolation qualifier, so it
>>> looks like we're going to need this anyway (ARB_shading_language_420pack
>>> for instance). Don't ask me though how this is supposed to work if
>>> simply enabling ARB_sample_shading already causes all inputs to be
>>> interpolated per sample anyway?
>>> The gl spec (4.4 core, end of chapter 14.3.1 and 14.3.1.1) has some
>>> explanation how it could work - so if there's at least one "sample"
>>> qualifier in the fs inputs, that causes those inputs to be evaluated per
>>> sample (which implies running the fragment shader at sample frequency).
>>> The interactions with SAMPLE_SHADING are not resolved, though, and imho
>>> anything but obvious.
>>>
>>> So if the ability to run the fragment shader at something else than
>>> per-pixel or per-sample frequency is useful, then something is needed to
>>> set this value one way or another. Otherwise new interpolation modes
>>> should do just fine and make things easier.
>>>
>>> Roland
>>
>> I believe the use-case for the partial thing is in issue #3 of the
>> ARB_sample_shading spec (although I'm not 100% sure what they're
>> talking about, they do seem to be talking about a gl_Sample*-less
>> shader). Based on the _mesa_get_min_invocations_per_fragment impl, as
>> soon as gl_Sample* gets used by the shader, it flips into per-sample
>> mode (which wasn't at all my reading of the spec, but I assume this
>> was done by people who understand things). Presumably there's some
>> benefit to doing the per-some-sample mode, otherwise the spec wouldn't
>> have introduced the MinSampleShadingARB call.
> Note that ARB_sample_shading is _older_ than the sample input qualifier
> (I think that first came with ARB_gpu_shader5), that issue #3 is solved
> with per-sample frequency just as well of course, though obviously it
> should be cheaper (and less quality) to run at some frequency between 1
> and max samples rather than max sample.
>
>> Although it'd be
>> entirely within the spec (if not efficient) to ignore it entirely and
>> just assume that it's always 0 or 1.
>>
>> I think I'm going to start by adding a set_sample_shading() call that
>> takes a [0,1] float, and see where that takes me. In any case, it
>> should be fairly simple to change should it be decided that a
>> different thing is needed.
>>
>
> IMHO it would make more sense to start with the interpolation mode
> qualifiers (because this is something that definitely needs to be done,
> as far as I can tell you could implement an (admittedly limited though
> fully correct) version of ARB_sample_shading easily on top of it).
> Though I guess best thing would probably be to figure out how the
> interactions with sample qualifiers and SAMPLE_SHADING will work in the
> end so the interface makes sense (ultimately both are needed in some
> form for GL 4.0, things like if all interpolation is done at sample
> locations if you have both sample qualifiers and SAMPLE_SHADING
> enabled). But maybe sample shading is easier to start with (doesn't
> require any glsl functionality), I certainly won't stop you :-).

Well, part of the issue is that my knowledge of d3d is 0, and knowledge
of gl is epsilon :) I'm not entirely clear on what an interpolation
mode is -- these things just kinda work in nouveau and I haven't
touched them. Something to do with how varyings are read in, but...
what? :) I really don't think I'm the right person to figure out how
to properly integrate sample qualifiers, nor do I know whether they
would be supported on the hw (420pack is supported by the nvidia blob
drivers on nva8 but ARB_gs5 isn't, in case that's any indicator). It
seemed like ARB_sample_shading would be a quick and easy "win",
without my having to learn everything about everything first.

It should hopefully be relatively easy to replace the API I'd
introduce here (presumably any replacement would still have to
implement the functionality originally provided). OTOH if you're on
the cusp of doing this yourself, we can avoid the churn and I can go
work on something else for a while.
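
For reference, here is a rough sketch of how a [0,1] value handed to a
set_sample_shading()-style call might be turned into a per-pixel
invocation count on the driver side. Both helper names are hypothetical
and exist in neither gallium nor Mesa. The round-up-and-clamp rule is my
reading of the ARB_sample_shading wording, and the power-of-two step
assumes hardware limited to 1/2/4/8 as mentioned above.

```c
/* Hypothetical helpers for illustration only; neither exists in gallium
 * or Mesa. */

/* Convert a [0,1] minimum sample shading fraction and the framebuffer
 * sample count into a minimum number of shader invocations per pixel. */
static unsigned
min_invocations_from_fraction(float fraction, unsigned samples)
{
   float scaled;
   unsigned n;

   if (samples <= 1)
      return 1;               /* no multisampling: one invocation per pixel */

   /* Clamp the input to [0,1]. */
   if (fraction < 0.0f)
      fraction = 0.0f;
   if (fraction > 1.0f)
      fraction = 1.0f;

   /* Integer ceil(fraction * samples) without needing libm. */
   scaled = fraction * (float)samples;
   n = (unsigned)scaled;
   if ((float)n < scaled)
      n++;

   return n < 1 ? 1 : n;      /* always run at least once per pixel */
}

/* Assumption: the hardware only accepts power-of-two counts (1/2/4/8) and
 * the sample count itself is a power of two. Round the request up to the
 * next such count, never exceeding the sample count. */
static unsigned
round_up_to_hw_count(unsigned n, unsigned samples)
{
   unsigned count = 1;

   while (count < n && count < samples)
      count <<= 1;

   return count;
}
```

With 8x MSAA and a fraction of 0.6, the first helper asks for 5
invocations and the second rounds that up to 8. Rounding up rather than
down keeps the result at or above the requested minimum.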

  -ilia