[PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT
Christian König
deathsimple at vodafone.de
Mon Jul 21 01:07:14 PDT 2014
On 19.07.2014 03:15, Michel Dänzer wrote:
> On 19.07.2014 00:47, Christian König wrote:
>> On 18.07.2014 05:07, Michel Dänzer wrote:
>>>>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI
>>>> I'm still not very keen on this change since I still don't understand
>>>> the reason why it's faster than with GTT. Definitely needs more testing
>>>> on a wider range of systems.
>>> Sure. If anyone wants to give this patch a spin and see if they can
>>> measure any performance difference, good or bad, that would be
>>> interesting.
>>>
>>>> Maybe limit it to APUs for now?
>>> But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an even
>>> bigger win with dedicated GPUs than with the Kaveri built-in GPU on my
>>> system. I suspect it may depend on the bandwidth available for PCIe vs.
>>> system memory though.
>> I've made a few tests today with the kernel part of the patches running
>> Xonotic on Ultra in 1920 x 1080.
>>
>> Without any patches I get ~47.0fps on average with my dedicated
>> HD7870.
>>
>> Adding only "drm/radeon: Use write-combined CPU mappings of rings and
>> IBs on >= SI" and that goes down to ~45.3fps.
>>
>> Adding on top of that "drm/radeon: Use VRAM for indirect buffers on >=
>> SI" and the frame rate goes down to ~27.74fps.
> Hmm, looks like I'll need to do more benchmarking of 3D workloads as well.
>
> Alex, given those numbers, it's probably best if you remove the "Use
> write-combined CPU mappings of rings and IBs on >= SI" change from your
> tree as well for now.
I wouldn't go as far as reverting the patch. It just needs a bit more
fine-tuning, and that can happen in the 3.17-rc cycle.
My tests clearly show that we can still use USWC for the ring buffer on
SI and probably earlier chips as well. The performance drop comes from
the CPU reading the IB content back for command stream validation on SI;
reads from USWC memory are uncached and therefore slow.
Putting the IB into VRAM still doesn't seem to be a good idea on
dedicated cards, but it might actually make sense on APUs.
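For scale, the relative drops implied by the Xonotic numbers quoted above
can be worked out with a small Python sketch. Only the three fps values
come from the tests; the helper name relative_drop and the labels are made
up for illustration.

```python
def relative_drop(baseline_fps, patched_fps):
    """Return the frame-rate loss relative to the baseline, in percent."""
    return (baseline_fps - patched_fps) / baseline_fps * 100.0


# Average fps on the dedicated HD7870, as quoted in this thread.
baseline = 47.0
results = {
    "WC rings and IBs": 45.3,
    "WC rings and IBs + IBs in VRAM": 27.74,
}

# Hypothetical labels mapped to their percentage loss against the baseline.
drops = {name: relative_drop(baseline, fps) for name, fps in results.items()}

for name, drop in drops.items():
    print(f"{name}: {drop:.1f}% slower than baseline")
```

So the write-combined mapping alone costs a few percent, while additionally
placing the IBs in VRAM costs roughly two fifths of the frame rate on this
card.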
Regards,
Christian.