Use of pci_map_page in nouveau, radeon TTM.

Tue Oct 1 03:34:13 PDT 2013

On Tuesday, 01.10.2013, at 12:16 +0200, Thomas Hellstrom wrote:
> Jerome, Konrad
> 
> Forgive an ignorant question, but it appears that both Nouveau and
> Radeon may use pci_map_page() when populating TTMs on
> pages obtained from the ordinary (not the DMA) pool. These pages will, if I
> understand things correctly, not be pages allocated with
> dma_alloc_coherent().
> 
> From what I understand, at least for the corresponding dma_map_page()
> it's illegal for the CPU to access these pages without calling
> dma_sync_xx_for_cpu(). And before the device is allowed to access them 
> again, you need to call dma_sync_xx_for_device().
> So mapping for PCI really invalidates the TTM interleaved CPU / device 
> access model.
> 
That's right. The API says you need to sync for the device or the CPU, but
on x86 you can get away with not doing so, as on x86 the calls end up just
being WB buffer flushes.

For ARM or similar non-coherent architectures, you absolutely have to do
the syncs, or you'll end up with different contents in the cache and in
system RAM. For my nouveau on ARM work I introduced some simple helpers to
do the right thing. Doing the syncs at the right points in time really
isn't hard: sync for CPU when handling a cpu_prep ioctl, and sync for
device when validating a buffer for GPU use.
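
To illustrate why the syncs matter, here is a minimal userspace sketch in C.
It is a toy model, not kernel code and not the nouveau helpers mentioned
above: every name in it (struct toy_buf, toy_sync_for_cpu(),
toy_sync_for_device(), and so on) is hypothetical. It assumes a single
buffer with a write-back CPU cache that the device cannot see, and the two
sync functions only stand in for what dma_sync_xx_for_cpu() and
dma_sync_xx_for_device() have to achieve on a non-coherent machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 16

/* Hypothetical toy model of one buffer on a non-coherent machine. */
struct toy_buf {
    unsigned char sysram[TOY_BUF_SIZE]; /* what the device sees */
    unsigned char cache[TOY_BUF_SIZE];  /* what the CPU sees while valid */
    bool cache_valid;
    bool cache_dirty;
};

static void toy_init(struct toy_buf *b)
{
    memset(b, 0, sizeof(*b));
}

/* CPU read: served from the cache, which is filled from RAM on a miss. */
static unsigned char toy_cpu_read(struct toy_buf *b, size_t i)
{
    if (!b->cache_valid) {
        memcpy(b->cache, b->sysram, TOY_BUF_SIZE);
        b->cache_valid = true;
        b->cache_dirty = false;
    }
    return b->cache[i];
}

/* CPU write: lands in the cache only (write-back behaviour). */
static void toy_cpu_write(struct toy_buf *b, size_t i, unsigned char v)
{
    (void)toy_cpu_read(b, i); /* make sure the cache is filled first */
    b->cache[i] = v;
    b->cache_dirty = true;
}

/* Device accesses go straight to RAM and never see the CPU cache. */
static unsigned char toy_dev_read(const struct toy_buf *b, size_t i)
{
    return b->sysram[i];
}

static void toy_dev_write(struct toy_buf *b, size_t i, unsigned char v)
{
    b->sysram[i] = v;
}

/* Stand-in for a sync-for-device: write dirty cache contents back to RAM. */
static void toy_sync_for_device(struct toy_buf *b)
{
    if (b->cache_valid && b->cache_dirty) {
        memcpy(b->sysram, b->cache, TOY_BUF_SIZE);
        b->cache_dirty = false;
    }
}

/* Stand-in for a sync-for-CPU: drop the cache so the next read hits RAM. */
static void toy_sync_for_cpu(struct toy_buf *b)
{
    b->cache_valid = false;
    b->cache_dirty = false;
}
```

In this model a CPU write stays invisible to the device until
toy_sync_for_device() runs (the "validate a buffer for GPU use" point), and
a device write stays invisible to the CPU until toy_sync_for_cpu() runs (the
"cpu_prep ioctl" point). Skipping either call leaves cache and RAM with
different contents, which is the failure described above.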

Regards,
Lucas
-- 
Pengutronix e.K.                           | Lucas Stach                 |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |