[Mesa-dev] Mesa (master): 29 new commits

Michel Dänzer michel at daenzer.net
Thu Apr 28 01:54:10 UTC 2016


On 23.04.2016 07:24, Marek Olšák wrote:
> On Fri, Apr 22, 2016 at 11:28 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>> On 22.04.2016 12:29, Nicolai Hähnle wrote:
>>> On 20.04.2016 23:02, Michel Dänzer wrote:
>>>> On 21.04.2016 02:42, Marek Olšák wrote:
>>>>> On Thu, Apr 14, 2016 at 9:29 AM, Michel Dänzer <michel at daenzer.net>
>>>>> wrote:
>>>>>>
>>>>>> Also, with the code modified to use GTT only for everything but
>>>>>> (potential) scanout buffers, the performance of Unigine Valley and the
>>>>>> Unreal Engine 4 Elemental demo is reduced by about 30%. So the premise
>>>>>> that GTT is about as fast as VRAM doesn't seem to hold true in practice
>>>>>> (at least with Kaveri and presumably other (pre-)CIK APUs; maybe it's
>>>>>> better with Carrizo and newer), which means that this change may cause
>>>>>> performance of long-running processes to drop significantly over time.
>>>>>>
>>>>>> Given all these issues, I'm afraid it may be better to revert this
>>>>>> change for now, until we have a better plan for dealing with this.
>>>>>
>>>>> Assuming you use the radeon kernel driver and you are not busy, would
>>>>> you please check whether the performance is lower on amdgpu as well?
>>>>
>>>> I am using the radeon driver, but also quite busy. Nicolai, can you try
>>>> it on your Carrizo?
>>>
>>> I don't see any difference on Unigine Valley with my Carrizo (512MB of
>>> VRAM).
>>
>> I have learned an important lesson today: the Phoronix Test Suite runner
>> eats my environment variables (and possibly babies?). So my earlier tests
>> were for nothing.
>>
>> In reality, Unigine Valley gains about 30% frame rate with the VRAM_GTT
>> placement.
> 
> It looks like we do need Kaveri results on amdgpu to know if CIK or
> radeon is the issue. Then, we can either turn it off for radeon or
> CIK.

Actually, it sounds like Nicolai and I may have compared different
things. I compared the current code (using VRAM_GTT) with the patch
below, which uses GTT only for everything but (potential) scanout
buffers. The idea is to measure the performance of GTT vs. VRAM
(assuming that most BOs end up in VRAM with VRAM_GTT). The GTT-only
case represents the long-term performance of long-running apps,
because BOs get evicted from VRAM to GTT and are never moved back to
VRAM after that.
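
For anyone who wants to reason about the two placement policies without
building Mesa, here is a minimal standalone sketch of what is being
compared. It is not the driver code: the enum, its values, and the names
place_current(), place_test_patch() and check_policies() are made up for
illustration, and the RADEON_FLAG_GTT_WC handling from the patch is left
out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the winsys domain bits. The values are
 * illustrative and not copied from the Mesa headers. */
enum domain {
    DOMAIN_GTT      = 1 << 0,
    DOMAIN_VRAM     = 1 << 1,
    DOMAIN_VRAM_GTT = DOMAIN_VRAM | DOMAIN_GTT
};

/* Current behaviour: without dedicated VRAM (APUs), a VRAM-only request
 * is widened to VRAM_GTT so the kernel may place the BO in either. */
static unsigned place_current(bool has_dedicated_vram, unsigned requested)
{
    if (!has_dedicated_vram && requested == DOMAIN_VRAM)
        return DOMAIN_VRAM_GTT;
    return requested;
}

/* Test patch: same condition, but only (potential) scanout buffers keep
 * VRAM_GTT; everything else is forced to GTT only. */
static unsigned place_test_patch(bool has_dedicated_vram, unsigned requested,
                                 bool scanout)
{
    if (!has_dedicated_vram && requested == DOMAIN_VRAM)
        return scanout ? DOMAIN_VRAM_GTT : DOMAIN_GTT;
    return requested;
}

/* Small self-check covering the cases discussed in this thread. */
static void check_policies(void)
{
    /* APU, ordinary buffer: VRAM_GTT today, GTT only with the patch. */
    assert(place_current(false, DOMAIN_VRAM) == DOMAIN_VRAM_GTT);
    assert(place_test_patch(false, DOMAIN_VRAM, false) == DOMAIN_GTT);

    /* APU, scanout buffer: both policies agree. */
    assert(place_test_patch(false, DOMAIN_VRAM, true) == DOMAIN_VRAM_GTT);

    /* Dedicated VRAM: neither policy touches the request. */
    assert(place_current(true, DOMAIN_VRAM) == DOMAIN_VRAM);
    assert(place_test_patch(true, DOMAIN_VRAM, false) == DOMAIN_VRAM);
}
```

Calling check_policies() from a test main() exercises all three cases.
The only difference between the two policies is the non-scanout APU
case, which is the case the benchmark numbers are meant to isolate.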

This patch decreased performance by about 30% on my Kaveri with the
radeon driver. I'll compare with amdgpu as well when I get a chance, but
it might be a while, so anybody feel free to beat me to it.


diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index 47514e9..75545d1 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -169,8 +169,13 @@ bool r600_init_resource(struct r600_common_screen *rscreen,
         * it will stay there.
         */
        if (!rscreen->info.has_dedicated_vram &&
-           res->domains == RADEON_DOMAIN_VRAM)
-               res->domains = RADEON_DOMAIN_VRAM_GTT;
+           res->domains == RADEON_DOMAIN_VRAM) {
+               if (res->b.b.bind & PIPE_BIND_SCANOUT)
+                       res->domains = RADEON_DOMAIN_VRAM_GTT;
+               else
+                       res->domains = RADEON_DOMAIN_GTT;
+               flags |= RADEON_FLAG_GTT_WC;
+       }

        if (rscreen->debug_flags & DBG_NO_WC)
                flags &= ~RADEON_FLAG_GTT_WC;



-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

