[Mesa-dev] [PATCH] radeonsi: don't use fast color clear for small images even on APUs

Bas Nieuwenhuizen bas at basnieuwenhuizen.nl
Thu Dec 28 15:02:45 UTC 2017


On Thu, Dec 28, 2017 at 3:54 PM, Marek Olšák <maraeo at gmail.com> wrote:
> On Thu, Dec 28, 2017 at 12:29 PM, Konstantin Kharlamov
> <Hi-Angel at yandex.ru> wrote:
>> I'm wondering, how is r600g different in that regard? I tried wiring up the code into evergreen_do_fast_color_clear(), both as posted and with a 256*256 limit; however, my FPS always stays around the same 1420.
>>
>> That said, I'm seeing a lot of CPU used by Xorg, glxgears, and compton. Could a CPU bottleneck be the reason?
>
> r600g might benefit in the same way. glxgears requires the limit to be
> at least 300*300.

As was discussed on #radeon, his default window was much larger due to
a tiling window manager (683x768) and hence his changes did not
trigger.
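
For illustration, here is a minimal standalone sketch of the two size
checks from the patch quoted below. It is not the driver code: the helper
names too_small_old() and too_small_new() are made up for this example,
and the texture fields are passed as plain parameters instead of being
read from tex->resource.b.b.

```c
#include <stdbool.h>

/* Check removed by the patch: only dGPUs (has_dedicated_vram) could take
 * the slow-clear path, and both dimensions had to be <= 256. */
static bool too_small_old(bool has_dedicated_vram, unsigned nr_samples,
                          unsigned width, unsigned height)
{
    return has_dedicated_vram &&
           nr_samples <= 1 &&
           width <= 256 &&
           height <= 256;
}

/* Check added by the patch: applies to APUs and dGPUs alike and compares
 * the pixel count against 512 * 512, so non-square surfaces are judged
 * by area rather than by each dimension separately. */
static bool too_small_new(unsigned nr_samples, unsigned width,
                          unsigned height)
{
    return nr_samples <= 1 &&
           width * height <= 512 * 512;
}
```

With these helpers, a 300x300 surface (the limit Marek mentions for
glxgears) is not "too small" under the old check but is under the new
one, while a 683x768 window (524544 pixels) exceeds 512 * 512 = 262144
and so still gets the fast clear either way.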

- Bas
>
> Marek
>
>>
>> On Wednesday, 13 December 2017, 2:53:12 MSK, Marek Olšák wrote:
>>> From: Marek Olšák <marek.olsak at amd.com>
>>>
>>> Increase the limit and handle non-square images better.
>>>
>>> This makes glxgears 20% faster on APUs, and a little more on dGPUs.
>>> We all use and love glxgears.
>>> ---
>>>  src/gallium/drivers/radeonsi/si_clear.c | 9 ++++-----
>>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/radeonsi/si_clear.c b/src/gallium/drivers/radeonsi/si_clear.c
>>> index 0ac83f4..464b9d7 100644
>>> --- a/src/gallium/drivers/radeonsi/si_clear.c
>>> +++ b/src/gallium/drivers/radeonsi/si_clear.c
>>> @@ -418,26 +418,25 @@ static void si_do_fast_color_clear(struct si_context *sctx,
>>>                           sctx->b.family == CHIP_STONEY)
>>>                               tex->num_slow_clears++;
>>>               }
>>>
>>>               bool need_decompress_pass = false;
>>>
>>>               /* Use a slow clear for small surfaces where the cost of
>>>                * the eliminate pass can be higher than the benefit of fast
>>>                * clear. The closed driver does this, but the numbers may differ.
>>>                *
>>> -              * Always use fast clear on APUs.
>>> +              * This helps on both dGPUs and APUs, even small APUs like Mullins.
>>>                */
>>> -             bool too_small = sctx->screen->info.has_dedicated_vram &&
>>> -                              tex->resource.b.b.nr_samples <= 1 &&
>>> -                              tex->resource.b.b.width0 <= 256 &&
>>> -                              tex->resource.b.b.height0 <= 256;
>>> +             bool too_small = tex->resource.b.b.nr_samples <= 1 &&
>>> +                              tex->resource.b.b.width0 *
>>> +                              tex->resource.b.b.height0 <= 512 * 512;
>>>
>>>               /* Try to clear DCC first, otherwise try CMASK. */
>>>               if (vi_dcc_enabled(tex, 0)) {
>>>                       uint32_t reset_value;
>>>                       bool clear_words_needed;
>>>
>>>                       if (sctx->screen->debug_flags & DBG(NO_DCC_CLEAR))
>>>                               continue;
>>>
>>>                       /* This can only occur with MSAA. */
>>>
>>
>>
>>
>>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

