[PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT

Michel Dänzer michel at daenzer.net
Thu Jul 17 20:07:56 PDT 2014


On 17.07.2014 19:09, Christian König wrote:
> On 17.07.2014 12:01, Michel Dänzer wrote:
>> In order to try and improve X(Shm)PutImage performance with glamor, I
>> implemented support for write-combined CPU mappings of BOs in GTT.
>>
>> This did provide a nice speedup, but to my surprise, using VRAM instead
>> of write-combined GTT turned out to be even faster in general on my
>> Kaveri machine, both for the internal GPU and for discrete GPUs.
>>
>> However, I've kept the changes from GTT to VRAM separated, in case this
>> turns out to be a loss on other setups.
>>
>> Kernel patches:
>>
>> [PATCH 1/5] drm/radeon: Remove radeon_gart_restore()
>> [PATCH 2/5] drm/radeon: Pass GART page flags to
>> [PATCH 3/5] drm/radeon: Allow write-combined CPU mappings of BOs in
>> [PATCH 4/5] drm/radeon: Use write-combined CPU mappings of rings and
> 
> Those four are Reviewed-by: Christian König <christian.koenig at amd.com>

Thanks!


>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI
> 
> I'm still not very keen on this change since I still don't understand
> the reason why it's faster than with GTT. Definitely needs more testing
> on a wider range of systems.

Sure. If anyone wants to give this patch a spin and see if they can
measure any performance difference, good or bad, that would be interesting.

> Maybe limit it to APUs for now?

But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an even
bigger win with dedicated GPUs than with the Kaveri built-in GPU on my
system. I suspect it may depend on the bandwidth available for PCIe vs.
system memory though.


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
