[Mesa-dev] [PATCH 1/7] gallium: add pipe_blend_state::srgb_enable and the CAP

Wed Jun 14 20:38:36 UTC 2017

On 14/06/17 21:21, Marek Olšák wrote:
> On Wed, Jun 14, 2017 at 10:13 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
>> On 14/06/17 21:07, Marek Olšák wrote:
>>>
>>> On Wed, Jun 14, 2017 at 9:45 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
>>>>
>>>> On 14/06/17 17:12, Marek Olšák wrote:
>>>>>
>>>>>
>>>>> On Tue, Jun 13, 2017 at 3:43 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 13, 2017 at 1:40 PM, Jose Fonseca <jfonseca at vmware.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 12/06/17 22:56, Marek Olšák wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 12, 2017 at 10:43 PM, Jose Fonseca <jfonseca at vmware.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/06/17 21:25, Marek Olšák wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 12, 2017 at 9:51 PM, Jose Fonseca <jfonseca at vmware.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> How does this help exactly?
>>>>>>>>>>>
>>>>>>>>>>> Are applications actually rendering to the same FBO w/ and w/o SRGB
>>>>>>>>>>> decoding?
>>>>>>>>>>>
>>>>>>>>>>> Or is the problem here GL_SRGB_WRITE state getting spuriously dirtied
>>>>>>>>>>> by the application?
>>>>>>>>>>>
>>>>>>>>>>> And even if they do, why is toggling surface views in framebuffer state
>>>>>>>>>>> so expensive?
>>>>>>>>>>>
>>>>>>>>>>> I don't object per se, but it looks like an unusual thing to optimize
>>>>>>>>>>> for.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> set_framebuffer_state is basically a memory barrier. We have different
>>>>>>>>>> caches between FB and textures and we have to flush them when a
>>>>>>>>>> texture is unbound from the framebuffer and set as a sampler view. To
>>>>>>>>>> keep things simple, set_framebuffer_state is the barrier. When we
>>>>>>>>>> change the blend state, the barrier is avoided. Note that the barrier
>>>>>>>>>> makes set_framebuffer_state a function that is always GPU-bound.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I see.
>>>>>>>>>
>>>>>>>>> And you're sure that the incoming set_framebuffer_state calls are not
>>>>>>>>> spurious?
>>>>>>>>>
>>>>>>>>> I know cso_context always eliminates redundant
>>>>>>>>> pipe_context::set_framebuffer_state calls, but it is perhaps possible
>>>>>>>>> that the Mesa state tracker is resetting the framebuffer state with
>>>>>>>>> different surface views that in practice are exactly the same as the
>>>>>>>>> previous ones?
>>>>>>>>>
>>>>>>>>> Like I said, it seems odd that apps are doing this: it doesn't make much
>>>>>>>>> sense to me to change the colorspace of the fragments between draws.
>>>>>>>>> (Unless some of the assets are already in sRGB and the app is trying to
>>>>>>>>> be too smart for its own good to avoid the sRGB->RGB->sRGB.)  It seems
>>>>>>>>> much more likely that these framebuffer state changes are self-inflicted
>>>>>>>>> somewhere in our stack than something truly demanded by the app.
>>>>>>>>>
>>>>>>>>> And if that's the case and we can fix it, then it would be a better
>>>>>>>>> solution all around.
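
The distinction drawn above between redundant calls (which get filtered) and
calls that differ only in the surface view (which do not) can be sketched as
follows. This is a minimal model, not the Gallium or cso_context API: the
class names, the tuple used as framebuffer state, and the format strings are
all hypothetical and chosen only for illustration.

```python
class FakePipeContext:
    """Hypothetical stand-in for a driver context (not the real Gallium API)."""

    def __init__(self):
        self.barriers = 0        # how many times the costly path was taken
        self.framebuffer = None

    def set_framebuffer_state(self, fb):
        # Assumption for this sketch: every call acts as a cache-flush barrier.
        self.barriers += 1
        self.framebuffer = fb


class RedundancyFilter:
    """Hypothetical cache that drops calls whose state equals the current one."""

    def __init__(self, pipe):
        self.pipe = pipe
        self.current = None

    def set_framebuffer(self, fb):
        if fb == self.current:
            return False         # identical state: the driver is not called
        self.current = fb
        self.pipe.set_framebuffer_state(fb)
        return True              # state differed: the barrier was paid


pipe = FakePipeContext()
cache = RedundancyFilter(pipe)

# Framebuffer state modelled as (surface, view format); both are made-up names.
cache.set_framebuffer(("surf0", "RGBA8_SRGB"))   # first bind, reaches driver
cache.set_framebuffer(("surf0", "RGBA8_SRGB"))   # redundant, filtered out
cache.set_framebuffer(("surf0", "RGBA8_UNORM"))  # same surface, new view
print(pipe.barriers)
```

In this model, a filter that compares whole states cannot help when only the
view format of the same surface changes, which is the case the thread is
discussing.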
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Yeah, the funny part, and the reason, is that we have a microbenchmark
>>>>>>>> in piglit (drawoverhead) changing this state between draw calls. :)
>>>>>>>>
>>>>>>>> Marek
>>>>>>>>
>>>>>>>
>>>>>>> I couldn't find that piglit microbenchmark.  mesademos has
>>>>>>> src/perf/drawoverhead.c but it doesn't set GL_SRGB_WRITE.  So if the
>>>>>>> FBO is changing internally, then it's a perf bug in the Mesa state
>>>>>>> tracker.
>>>>>>>
>>>>>>> Unless it's mimicking something that real apps do, it's probably better
>>>>>>> to fix the microbenchmark to use a more realistic test.
>>>>>>
>>>>>>
>>>>>>
>>>>>> If you build piglit, it's in bin/drawoverhead.
>>>>>>
>>>>>> You're right that this subtest (switching GL_FRAMEBUFFER_SRGB) is
>>>>>> rather artificial and fairly unlikely to occur with real apps.
>>>>>
>>>>>
>>>>>
>>>>> FYI, I'm dropping this series and I don't have it in my repo anymore.
>>>>> piglit/drawoverhead will be updated not to test this state change.
>>>>>
>>>>> Marek
>>>>
>>>>
>>>>
>>>> Great.
>>>>
>>>> BTW, I'm not sure what's a good state to change in such a microbenchmark.
>>>>
>>>> There is, of course, a myriad of states to pick, but they are not all the
>>>> same: performance can vary wildly depending on the choice.  I'm not sure
>>>> what's a good representative state change in such circumstances. Perhaps
>>>> toggling between two texture objects? Or some sampler state?
>>>
>>>
>>> If you've ever run the microbenchmark, you know there are plenty of
>>> state changes tested. I think there are like 15 state changes tested
>>> in about 60 subtests at the moment. I'm adding more tests into it.
>>> Currently I have 100 subtests in there locally. At the moment the
>>> missing subtests are mostly just shader resources: immutable textures
>>> (mutable textures i.e. not TexStorage-based are already tested), TBOs,
>>> images, image buffers, SSBOs (maybe), atomic counters (maybe). The
>>> methodology is 1 state change followed by 1 draw call in a loop,
>>> measuring the number of draw calls per second for that case, and
>>> comparing with the baseline draw rate (which is without the state
>>> change).
>>>
>>> Marek
>>>
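
The methodology described above (one state change followed by one draw call
in a loop, with the resulting draw rate compared against a baseline without
the state change) can be sketched roughly as below. This is not the piglit
drawoverhead code: `measure_rate`, `relative_rate`, and the dummy
`draw`/`state_change` callables are hypothetical placeholders standing in for
real GL calls.

```python
import time


def measure_rate(draw, state_change=None, duration=0.05):
    """Return draw calls per second over roughly `duration` seconds.

    If `state_change` is given, it is called once before every draw.
    """
    calls = 0
    start = time.perf_counter()
    deadline = start + duration
    while time.perf_counter() < deadline:
        if state_change is not None:
            state_change()   # exactly one state change ...
        draw()               # ... followed by exactly one draw call
        calls += 1
    elapsed = time.perf_counter() - start
    return calls / elapsed


def relative_rate(draw, state_change, duration=0.05):
    """Draw rate with the state change, as a fraction of the baseline rate."""
    baseline = measure_rate(draw, None, duration)
    with_change = measure_rate(draw, state_change, duration)
    return with_change / baseline


# Placeholder workloads; a real benchmark would issue GL calls here.
def draw():
    sum(range(50))


def state_change():
    sum(range(200))


print(f"{relative_rate(draw, state_change):.2f}x of baseline draw rate")
```

Reporting the result as a fraction of the baseline keeps subtests comparable
across machines, since the absolute draw rate depends on the CPU and driver.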
>>
>> I just ran it.  Pretty neat!  I didn't know we were adding benchmarks to
>> piglit.
> 
> That's because piglit has a very convenient window system integration
> framework that I refuse to re-invent elsewhere.

Ah, makes sense.

Which reminds me: do people think we should transition mesademos off 
glut to glfw or waffle? Or do you think we should just strive to migrate 
the stuff there to piglit?

Jose