[PATCH 9/9] drm/amdgpu: WIP add IOCTL interface for per VM BOs
Christian König
deathsimple at vodafone.de
Wed Aug 30 14:58:59 UTC 2017
That was a good hint. glmark2 sees a really nice 5% improvement with
this change.
Christian.
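
For reference, the Xonotic averages quoted further down in this thread can be re-derived from the per-run results. The following is only a minimal sketch: the variable and function names are made up for illustration, and the inputs are the six frame rates copied from the quoted Phoronix Test Suite output.

```python
from statistics import mean

# Per-run frame rates copied from the two Xonotic result blocks quoted below.
without_per_vm_bos = [187.436577, 189.514724, 190.9605812]
with_per_vm_bos = [203.0471676, 199.6622532, 197.0954183]


def relative_gain(baseline, candidate):
    """Relative change of the candidate mean over the baseline mean."""
    return mean(candidate) / mean(baseline) - 1.0


avg_without = mean(without_per_vm_bos)
avg_with = mean(with_per_vm_bos)
gain = relative_gain(without_per_vm_bos, with_per_vm_bos)

print(f"without per VM BOs: {avg_without:.2f} fps")
print(f"with per VM BOs:    {avg_with:.2f} fps")
print(f"relative gain:      {gain * 100:.1f}%")
```

With only three runs per configuration this is a rough indication, not a statistically rigorous comparison.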
On 30.08.2017 at 02:27, Marek Olšák wrote:
> It might be interesting to try glmark2.
>
> Marek
>
> On Tue, Aug 29, 2017 at 3:59 PM, Christian König
> <deathsimple at vodafone.de> wrote:
>> Ok, found something that works. Xonotic in lowest resolution, lowest effects
>> quality (i.e. totally CPU bound):
>>
>> Without per process BOs:
>>
>> Xonotic 0.8:
>> pts/xonotic-1.4.0 [Resolution: 800 x 600 - Effects Quality: Low]
>> Test 1 of 1
>> Estimated Trial Run Count: 3
>> Estimated Time To Completion: 3 Minutes
>> Started Run 1 @ 21:13:50
>> Started Run 2 @ 21:14:57
>> Started Run 3 @ 21:16:03 [Std. Dev: 0.94%]
>>
>> Test Results:
>> 187.436577
>> 189.514724
>> 190.9605812
>>
>> Average: 189.30 Frames Per Second
>> Minimum: 131
>> Maximum: 355
>>
>> With per process BOs:
>>
>> Xonotic 0.8:
>> pts/xonotic-1.4.0 [Resolution: 800 x 600 - Effects Quality: Low]
>> Test 1 of 1
>> Estimated Trial Run Count: 3
>> Estimated Time To Completion: 3 Minutes
>> Started Run 1 @ 21:20:05
>> Started Run 2 @ 21:21:07
>> Started Run 3 @ 21:22:10 [Std. Dev: 1.49%]
>>
>> Test Results:
>> 203.0471676
>> 199.6622532
>> 197.0954183
>>
>> Average: 199.93 Frames Per Second
>> Minimum: 132
>> Maximum: 349
>>
>> Well that looks like some improvement.
>>
>> Regards,
>> Christian.
>>
>>
>> On 28.08.2017 at 14:59, Zhou, David(ChunMing) wrote:
>>
>> I will push our Vulkan guys to test it; their BO lists are very long.
>>
>> Sent from my Nut Pro
>>
>> On Aug 28, 2017 at 7:55 PM, Christian König <deathsimple at vodafone.de> wrote:
>>
>> On 28.08.2017 at 06:21, zhoucm1 wrote:
>>>
>>> On 2017-08-27 18:03, Christian König wrote:
>>>> On 25.08.2017 at 21:19, Christian König wrote:
>>>>> On 25.08.2017 at 18:22, Marek Olšák wrote:
>>>>>> On Fri, Aug 25, 2017 at 3:00 PM, Christian König
>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>> On 25.08.2017 at 12:32, zhoucm1 wrote:
>>>>>>>>
>>>>>>>> On 2017-08-25 17:38, Christian König wrote:
>>>>>>>>> From: Christian König <christian.koenig at amd.com>
>>>>>>>>>
>>>>>>>>> Add the IOCTL interface so that applications can allocate per VM
>>>>>>>>> BOs.
>>>>>>>>>
>>>>>>>>> Still WIP since not all corner cases are tested yet, but this reduces
>>>>>>>>> average CS overhead for 10K BOs from 21ms down to 48us.
>>>>>>>> Wow, cheers. You finally got the per VM BOs onto the same reservation
>>>>>>>> object as the PD/PTs, which indeed saves a lot on the BO list.
>>>>>>> Don't cheer too loud yet, that is a completely constructed test case.
>>>>>>>
>>>>>>> So far I wasn't able to achieve any improvements with any real game
>>>>>>> on this with Mesa.
>>> Thinking about it more: with too many BOs sharing one reservation
>>> object, the reservation lock could often be busy. If evictions or
>>> destroys also happen often in the meantime, that could affect VM
>>> updates and CS submission as well.
>> That's exactly the reason why I've added code to the BO destroy path to
>> avoid at least some of the problems. But yeah, that's only the tip of
>> the iceberg of problems with that approach.
>>
>>> Anyway, this is a very good start at reducing CS overhead, especially
>>> since we've seen it "reduces average CS overhead for 10K BOs from
>>> 21ms down to 48us".
>> Actually, it's not that good. See, this is a completely constructed test
>> case on a kernel with lockdep and KASAN enabled.
>>
>> In reality we usually don't have so many BOs and so far I wasn't able to
>> find much of an improvement in any real world testing.
>>
>> Regards,
>> Christian.
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>>