[Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

Marek Olšák maraeo at gmail.com
Wed Aug 28 03:17:22 PDT 2013


Yeah, st/mesa also compiles shaders on the first use, so we've got 3
places to fix: Wine, st/mesa, the driver.

Marek

On Wed, Aug 28, 2013 at 2:07 AM, Vadim Girlin <vadimgirlin at gmail.com> wrote:
> On 08/28/2013 02:59 AM, Marek Olšák wrote:
>>
>> First, you won't really see any significant continual difference in
>> frame rate no matter how many shader variants you have unless you are
>> very CPU-bound. The problem is shader compilation on the first use,
>> that's where you get a big hiccup. Try Skyrim for example: You have to
>> first look around and see every object that's around you and get
>> unpleasant stuttering before you can actually go on and play the game.
>> Yes, this also Wine's fault that it compiles shaders on the first use
>> too, but we don't have to be as bad as Wine, do we? Valve also
>> reported shader recompilations on the first use being a serious issue
>> with open source drivers.
>
>
> I perfectly understand that deferred compilation is exactly the problem that
> makes the games freeze due to shader compilation on first use when something
> new appears on the screen, but I don't think we can solve this problem in
> the *driver* by trying to compile early, because AFAICS currently the
> shaders are passed to the driver too late anyway, and this happens not only
> with wine. E.g. when I run Heaven in a window with "MESA_GLSL=dump
> R600_DEBUG=ps,vs", so that I can see Heaven's window and console output at
> the same time, what I see is that most of GL dumps happen while Heaven shows
> splash screen with loading progress, but most of the driver's dumps appear
> on the first frame and few more times during benchmark. It looks like
> compilation is deferred somewhere in the stack before the driver, or am I
> missing something?
>
> Vadim
>
>
>
>>
>> Marek
>>
>> On Tue, Aug 27, 2013 at 11:52 PM, Vadim Girlin <vadimgirlin at gmail.com>
>> wrote:
>>>
>>> On 08/28/2013 12:43 AM, Marek Olšák wrote:
>>>>
>>>>
>>>> Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
>>>> with a Mesa driver that likes to compile shader variants on first use?
>>>> It's HORRIBLE.
>>>
>>>
>>>
>>> I don't think that shader variants are bad, but it's definitely bad when
>>> we
>>> are compiling variants that are never used. Currently glxgears compiles
>>> 18
>>> ps/vs shaders. In my branch with initial GS support [1] I switched
>>> handling
>>> of the shaders to deferred compilation, that is, shaders are compiled
>>> only
>>> before the actual draw. I found later that it's not really required for
>>> GS,
>>> but IIRC this change results in only 5 shaders being compiled for
>>> glxgears
>>> instead of 18. It seems most of the useless variants are results of state
>>> changes between creation of the shader state (initial compilation) and
>>> actual draw call.
>>>
>>> I had some concerns about increased overhead with those changes, and it's
>>> actually noticeable with drawoverhead demo, but I didn't see any
>>> regressions
>>> with a few real apps that I tested, e.g. glxgears even showed slightly
>>> better performance with these changes. Probably I also implemented it in
>>> a
>>> not very optimal way (I was mostly concentrated on GS support) and the
>>> overhead can be reduced.
>>>
>>> One more thing is duplicate shaders, I've analyzed shader dumps from
>>> Unigine
>>> Heaven 3.0 some time ago and found that from about 320 compiled shaders,
>>> only about 180 (50%) were unique, others were duplicates (detected by
>>> comparing the bytecode dumps for them in an automated way), maybe they
>>> had
>>> different shader keys (which still resulted in the same bytecode), but I
>>> suspect duplicate pipe shaders were also involved. Unfortunately I didn't
>>> have a time to investigate it more thoroughly since then.
>>>
>>> So my point is that we don't really need to eliminate shader variants,
>>> first
>>> we need to eliminate compilation of unused variants and duplicate
>>> shaders.
>>> Also we might want to consider offloading of the compilation to separate
>>> thread(s) and caching of shader binaries between runs.
>>>
>>> Vadim
>>>
>>>   [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders
>>>
>>>
>>>>
>>>> What the patch does is probably the right solution. At least
>>>> alpha-test state changes don't cause shader recompilation and
>>>> re-binding, which also negatively affects performance. Ideally we
>>>> shouldn't depend on the framebuffer state at all, but we need to
>>>> emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
>>>> should always be fine with key.nr_cbufs forced to 8 for any shader
>>>> without that property. I expect app developers to do the right thing
>>>> and not write outputs they don't need.
>>>>
>>>> Marek
>>>>
>>>> On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger <sroland at vmware.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> Not that I'm qualified to review r600 code, but couldn't you create
>>>>> different shader variants depending on whether you need alpha test? At
>>>>> least I would assume shader exports aren't free.
>>>>>
>>>>> Roland
>>>>>
>>>>> Am 27.08.2013 19:56, schrieb Vadim Girlin:
>>>>>>
>>>>>>
>>>>>> We need to export at least one color if the shader writes it,
>>>>>> even when nr_cbufs==0.
>>>>>>
>>>>>> Signed-off-by: Vadim Girlin <vadimgirlin at gmail.com>
>>>>>> ---
>>>>>>
>>>>>> Tested on evergreen with multiple combinations of backends - no
>>>>>> regressions,
>>>>>> fixes some tests:
>>>>>>
>>>>>>     default    - fixes fb-alphatest-nocolor and
>>>>>> fb_alphatest-nocolor-ff
>>>>>>     default+sb - fixes fb-alphatest-nocolor and
>>>>>> fb_alphatest-nocolor-ff
>>>>>>     llvm       - fixes about 25 tests related to depth/stencil
>>>>>>     llvm+sb    - fixes about 300 tests (llvm's depth/stencil issues
>>>>>> and
>>>>>>                  regressions cased by reordering of exports in sb)
>>>>>>
>>>>>> With this patch, there are no regressions with default+sb vs default.
>>>>>> There is one regression with llvm+sb vs llvm -
>>>>>> fs-texturegrad-miplevels,
>>>>>> AFAICS it's a problem with llvm backend uncovered by sb -
>>>>>> SET_GRADIENTS_V/H
>>>>>> instructions are not placed in the same TEX clause with corresponding
>>>>>> SAMPLE_G.
>>>>>>
>>>>>>    src/gallium/drivers/r600/r600_shader.c | 7 ++++---
>>>>>>    1 file changed, 4 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/src/gallium/drivers/r600/r600_shader.c
>>>>>> b/src/gallium/drivers/r600/r600_shader.c
>>>>>> index 300b5c4..f7eab76 100644
>>>>>> --- a/src/gallium/drivers/r600/r600_shader.c
>>>>>> +++ b/src/gallium/drivers/r600/r600_shader.c
>>>>>> @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct
>>>>>> r600_screen
>>>>>> *rscreen,
>>>>>>         unsigned opcode;
>>>>>>         int i, j, k, r = 0;
>>>>>>         int next_pos_base = 60, next_param_base = 0;
>>>>>> +     int max_color_exports = MAX2(key.nr_cbufs, 1);
>>>>>>         /* Declarations used by llvm code */
>>>>>>         bool use_llvm = false;
>>>>>>         bool indirect_gprs;
>>>>>> @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct
>>>>>> r600_screen *rscreen,
>>>>>>                 radeon_llvm_ctx.face_gpr = ctx.face_gpr;
>>>>>>                 radeon_llvm_ctx.r600_inputs = ctx.shader->input;
>>>>>>                 radeon_llvm_ctx.r600_outputs = ctx.shader->output;
>>>>>> -             radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs ,
>>>>>> 1);
>>>>>> +             radeon_llvm_ctx.color_buffer_count = max_color_exports;
>>>>>>                 radeon_llvm_ctx.chip_class = ctx.bc->chip_class;
>>>>>>                 radeon_llvm_ctx.fs_color_all = shader->fs_write_all &&
>>>>>> (rscreen->chip_class >= EVERGREEN);
>>>>>>                 radeon_llvm_ctx.stream_outputs = &so;
>>>>>> @@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct
>>>>>> r600_screen *rscreen,
>>>>>>                 case TGSI_PROCESSOR_FRAGMENT:
>>>>>>                         if (shader->output[i].name ==
>>>>>> TGSI_SEMANTIC_COLOR) {
>>>>>>                                 /* never export more colors than the
>>>>>> number of CBs */
>>>>>> -                             if (shader->output[i].sid >=
>>>>>> key.nr_cbufs)
>>>>>> {
>>>>>> +                             if (shader->output[i].sid >=
>>>>>> max_color_exports) {
>>>>>>                                         /* skip export */
>>>>>>                                         j--;
>>>>>>                                         continue;
>>>>>> @@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct
>>>>>> r600_screen *rscreen,
>>>>>>                                 output[j].type =
>>>>>> V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
>>>>>>                                 shader->nr_ps_color_exports++;
>>>>>>                                 if (shader->fs_write_all &&
>>>>>> (rscreen->chip_class >= EVERGREEN)) {
>>>>>> -                                     for (k = 1; k < key.nr_cbufs;
>>>>>> k++)
>>>>>> {
>>>>>> +                                     for (k = 1; k <
>>>>>> max_color_exports;
>>>>>> k++) {
>>>>>>                                                 j++;
>>>>>>                                                 memset(&output[j], 0,
>>>>>> sizeof(struct r600_bytecode_output));
>>>>>>                                                 output[j].gpr =
>>>>>> shader->output[i].gpr;
>>>>>>
>>>>> _______________________________________________
>>>>> mesa-dev mailing list
>>>>> mesa-dev at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>


More information about the mesa-dev mailing list