[Mesa-dev] [PATCH 1/7] gallium: add pipe_blend_state::srgb_enable and the CAP

Wed Jun 14 20:21:12 UTC 2017

On Wed, Jun 14, 2017 at 10:13 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
> On 14/06/17 21:07, Marek Olšák wrote:
>>
>> On Wed, Jun 14, 2017 at 9:45 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
>>>
>>> On 14/06/17 17:12, Marek Olšák wrote:
>>>>
>>>>
>>>> On Tue, Jun 13, 2017 at 3:43 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>
>>>>>
>>>>> On Tue, Jun 13, 2017 at 1:40 PM, Jose Fonseca <jfonseca at vmware.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On 12/06/17 22:56, Marek Olšák wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 10:43 PM, Jose Fonseca <jfonseca at vmware.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/06/17 21:25, Marek Olšák wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jun 12, 2017 at 9:51 PM, Jose Fonseca <jfonseca at vmware.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> How does this help exactly?
>>>>>>>>>>
>>>>>>>>>> Are applications actually rendering to the same FBO w/ and w/o SRGB
>>>>>>>>>> decoding?
>>>>>>>>>>
>>>>>>>>>> Or is the problem here GL_SRGB_WRITE state getting spuriously
>>>>>>>>>> dirtied by the application?
>>>>>>>>>>
>>>>>>>>>> And even if they do, why is toggling surface views in framebuffer
>>>>>>>>>> state so expensive?
>>>>>>>>>>
>>>>>>>>>> I don't object per se, but it looks like an unusual thing to
>>>>>>>>>> optimize for.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> set_framebuffer_state is basically a memory barrier. We have
>>>>>>>>> different caches between FB and textures and we have to flush them
>>>>>>>>> when a texture is unbound from the framebuffer and set as a sampler
>>>>>>>>> view. To keep things simple, set_framebuffer_state is the barrier.
>>>>>>>>> When we change the blend state, the barrier is avoided. Note that
>>>>>>>>> the barrier makes set_framebuffer_state a function that is always
>>>>>>>>> GPU-bound.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I see.
>>>>>>>>
>>>>>>>> And you're sure that the incoming set_framebuffer_state calls are not
>>>>>>>> spurious?
>>>>>>>>
>>>>>>>> I know cso_context always eliminates redundant
>>>>>>>> pipe_context::set_framebuffer_state calls, but is it perhaps possible
>>>>>>>> that the Mesa state tracker is resetting the framebuffer state with
>>>>>>>> different surface views that in practice are exactly the same as the
>>>>>>>> previous ones?
>>>>>>>>
>>>>>>>> Like I said, it seems odd that apps are doing this: it doesn't make
>>>>>>>> much sense to me to change the colorspace of the fragments between
>>>>>>>> draws. (Unless some of the assets are already in sRGB and the app is
>>>>>>>> trying to be too smart for its own good to avoid the sRGB->RGB->sRGB
>>>>>>>> round trip.)  It seems much more likely that these framebuffer state
>>>>>>>> changes are self-inflicted somewhere in our stack than something
>>>>>>>> truly demanded by the app.
>>>>>>>>
>>>>>>>> And if that's the case and we can fix it, then it would be a better
>>>>>>>> solution all around.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yeah, the funny part, and the reason, is that we have a microbenchmark
>>>>>>> in piglit (drawoverhead) changing this state between draw calls. :)
>>>>>>>
>>>>>>> Marek
>>>>>>>
>>>>>>
>>>>>> I couldn't find that piglit microbenchmark.  mesademos has
>>>>>> src/perf/drawoverhead.c but it doesn't set GL_SRGB_WRITE.  So if the
>>>>>> FBO is changing internally, then it's a perf bug in the Mesa state
>>>>>> tracker.
>>>>>>
>>>>>> Unless it's mimicking something that real apps do, it's probably
>>>>>> better to fix the microbenchmark to use a more realistic test.
>>>>>
>>>>>
>>>>>
>>>>> If you build piglit, it's in bin/drawoverhead.
>>>>>
>>>>> You're right that this subtest (switching GL_FRAMEBUFFER_SRGB) is
>>>>> rather artificial and fairly unlikely to occur with real apps.
>>>>
>>>>
>>>>
>>>> FYI, I'm dropping this series and I don't have it in my repo anymore.
>>>> piglit/drawoverhead will be updated not to test this state change.
>>>>
>>>> Marek
>>>
>>>
>>>
>>> Great.
>>>
>>> BTW, I'm not sure what's a good state to change in such a microbenchmark.
>>>
>>> There is, of course, a myriad of states to pick, but they are not all the
>>> same: performance can vary wildly depending on the choice.  I'm not sure
>>> what's a good representative state change in such circumstances. Perhaps
>>> toggling between two texture objects? Or some sampler state?
>>
>>
>> If you've ever run the microbenchmark, you know there are plenty of
>> state changes tested. I think there are like 15 state changes tested
>> in about 60 subtests at the moment. I'm adding more tests into it.
>> Currently I have 100 subtests in there locally. At the moment the
>> missing subtests are mostly just shader resources: immutable textures
>> (mutable textures, i.e. not TexStorage-based, are already tested), TBOs,
>> images, image buffers, SSBOs (maybe), atomic counters (maybe). The
>> methodology is 1 state change followed by 1 draw call in a loop,
>> measuring the number of draw calls per second for that case, and
>> comparing with the baseline draw rate (which is without the state
>> change).
>>
>> Marek
>>
>
> I just ran it.  Pretty neat!  I didn't know we were adding benchmarks to
> piglit.

That's because piglit has a very convenient window system integration
framework that I refuse to re-invent elsewhere.

Marek
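
For readers following along, the methodology quoted above (one state change
followed by one draw call in a loop, measuring draw calls per second and
comparing against a baseline without the state change) can be sketched
roughly as below. This is only an illustration, not the piglit code:
FakeContext, toggle_srgb, draws_per_second and relative_rate are made-up
names, and the "draw" is a stand-in that does no GL work at all.

```python
import time


class FakeContext:
    """Hypothetical stand-in for a GL context; it only counts calls."""

    def __init__(self):
        self.srgb = False
        self.draws = 0
        self.state_changes = 0

    def draw(self):
        # A real benchmark would issue a draw call here.
        self.draws += 1

    def toggle_srgb(self, i):
        # A real benchmark would flip a piece of GL state here.
        self.srgb = (i % 2 == 0)
        self.state_changes += 1


def draws_per_second(draw, state_change=None, duration=0.2):
    """Loop for roughly `duration` seconds and return draws per second.

    If `state_change` is given, it is called once before every draw.
    """
    count = 0
    start = time.perf_counter()
    deadline = start + duration
    while time.perf_counter() < deadline:
        if state_change is not None:
            state_change(count)
        draw()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


def relative_rate(draw, state_change, duration=0.2):
    """Return (baseline rate, rate with state change, their ratio)."""
    baseline = draws_per_second(draw, None, duration)
    with_change = draws_per_second(draw, state_change, duration)
    return baseline, with_change, with_change / baseline


if __name__ == "__main__":
    ctx = FakeContext()
    base, changed, ratio = relative_rate(ctx.draw, ctx.toggle_srgb)
    print(f"baseline: {base:.0f} draws/s")
    print(f"with state change: {changed:.0f} draws/s ({ratio:.1%} of baseline)")
```

The ratio is the interesting number: each subtest is reported relative to
the baseline draw rate, so a state change that forces something expensive
in the driver shows up as a small fraction of the baseline.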