[PATCH v10 07/11] drm/etnaviv: Add support for the dma coherent device

Wed Jun 21 15:54:46 UTC 2023

Hi,

On 2023/6/21 23:33, Lucas Stach wrote:
> Am Mittwoch, dem 21.06.2023 um 23:00 +0800 schrieb Sui Jingfeng:
>> On 2023/6/21 18:00, Lucas Stach wrote:
>>>>    static inline enum dma_data_direction etnaviv_op_to_dma_dir(u32 op)
>>>> @@ -369,6 +381,7 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
>>>>    {
>>>>    	struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
>>>>    	struct drm_device *dev = obj->dev;
>>>> +	struct etnaviv_drm_private *priv = dev->dev_private;
>>>>    	bool write = !!(op & ETNA_PREP_WRITE);
>>>>    	int ret;
>>>>    
>>>> @@ -395,7 +408,7 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
>>>>    			return ret == 0 ? -ETIMEDOUT : ret;
>>>>    	}
>>>>    
>>>> -	if (etnaviv_obj->flags & ETNA_BO_CACHED) {
>>>> +	if (!priv->dma_coherent && etnaviv_obj->flags & ETNA_BO_CACHED) {
>>> Why do you need this? Isn't dma_sync_sgtable_for_cpu a no-op on your
>>> platform when the device is coherent?
>>>
>> I need this to show that our hardware is truly dma-coherent!
>>
>> I have tested that the driver still works like a charm without adding
>> this code '!priv->dma_coherent'.
>>
>>
>> But I'm expressing the idea that a truly dma-coherent device just doesn't
>> need this.
>>
>> I don't care if it is a no-op.
>>
>> It is now, but it may not be in the future.
> And that's exactly the point. If it ever turns into something more than
> a no-op on your platform, then that's probably for a good reason and a
> driver should not assume that it knows better than the DMA API
> implementation what is or is not required on a specific platform to
> make DMA work.
>
>> Even if it is, the overhead of the function call itself is still there.
>>
> cpu_prep/fini aren't total fast paths, you already synchronized with
> the GPU here, potentially waiting for jobs to finish, etc. If your
> platform no-ops this then the function call will be in the noise.
>   
>> Also, we want to try flushing the write buffer with the CPU manually.
>>
>>
>> Currently, we want the absolute correctness in the concept,
>>
>> not only the rendering results.
> And if you want absolute correctness then calling dma_sync_sgtable_* is
> the right thing to do, as it can do much more than just manage caches.

On our hardware, cached mappings don't need a call to dma_sync_sgtable_*.

This is the right thing to do. The hardware already guarantees it for
us.

We may only want to call it for WC-mapped BOs; please don't tangle all
of this together.

We simply want to do the right thing.
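
To make the condition concrete, here is a minimal user-space sketch of the
check from the hunk quoted above. It is not driver code: the helper name
demo_needs_cpu_sync() and the DEMO_* flag values are stand-ins invented for
this illustration, and only the shape of the condition is taken from the
patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-ins for ETNA_BO_CACHED and ETNA_BO_WC. The values are assumptions
 * chosen for this sketch, not copied from the uapi header.
 */
#define DEMO_BO_CACHED 0x00010000u
#define DEMO_BO_WC     0x00020000u

/*
 * Hypothetical helper mirroring the patched condition:
 *   !priv->dma_coherent && (etnaviv_obj->flags & ETNA_BO_CACHED)
 * Returns true when the sketch would call dma_sync_sgtable_for_cpu().
 */
static bool demo_needs_cpu_sync(bool dma_coherent, uint32_t bo_flags)
{
	return !dma_coherent && (bo_flags & DEMO_BO_CACHED);
}
```

With this condition, a cached BO is synced only on a non-coherent device.
A WC BO is not synced in either case, so syncing WC BOs would need a
separate change.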

> Right now it also provides SWIOTLB translation if needed.

SWIOTLB introduces a bounce buffer, which hurts performance.

We don't need it. It should be avoided.

I know you know everything. No sugar-coated bullets, please.

>
> Regards,
> Lucas

-- 
Jingfeng