[PATCH 3/3] drm/panthor: Prevent potential overwrite of buffer objects

Mon Nov 4 12:49:56 UTC 2024

On 11/4/24 11:16, Boris Brezillon wrote:
> Hi Akash,
> 
> On Thu, 31 Oct 2024 21:42:27 +0000
> Akash Goel <akash.goel at arm.com> wrote:
> 
>> I assume you also reckon that there is a potential problem here for arm64.
> 
> It impacts any system that's not IO-coherent I would say, and this
> comment seems to prove this is a known issue [3].
> 

Thanks for confirming.

Actually I had tried to check with Daniel Vetter about [3], as it was 
not clear to me that how that code exactly helped in x86 case.
As far as I understand, [3] updates the attribute of direct kernel 
mapping of the shmem pages to WC, so as to be consistent with the 
Userspace mapping of the pages or their vmapping inside the kernel.
But didn't get how that alignment actually helped in cleaning the dirty 
cache lines.

>>
>> Fully agree with your suggestion that the handling needs to be at the
>> drm_gem_shmem level. I was not sure if we really need to do anything, as
>> I didn't observe any overwrite issue during the testing. So thought
>> better to limit the change to Panthor and get some feedback.
> 
> Actually, I wonder if PowerVR isn't papering over the same issue with
> [1], so it looks like at least two drivers would benefit from a fix at
> the drm_gem_shmem level.
> 

Thanks for giving the reference of PowerVR code.
It is unconditionally calling dma_sync_sgtable after acquiring the 
pages, so could be papering over the issue as you suspected.

>>
>> shmem calls 'flush_dcache_folio()' after clearing the pages but that
>> just clears the 'PG_dcache_clean' bit and CPU cache is not cleaned
>> immediately.
>>
>> I realize that this patch is not foolproof, as Userspace can try to
>> populate the BO from CPU side before mapping it on the GPU side.
>>
>> Not sure if we also need to consider the case when shmem pages are
>> swapped out. Don't know if there could be a similar situation of dirty
>> cachelines after the swap in.
> 
> I think we do. We basically need to flush CPU caches any time
> pages are [re]allocated, because the shmem layer will either zero-out
> (first allocation) or populate (swap-in) in that path, and in both
> cases, it involves a CPU copy to a cached mapping.
> 

Thanks for confirming.

I think we may have to do cache flush page by page.
Not all pages might get swapped out and the initial allocation of all 
pages may not happen at the same time.
Please correct me if my understanding is wrong.

>>
>> Also not sure if dma_sync_sgtable_for_device() can be called from
>> drm_gem_shmem_get_pages() as the sg_table won't be available at that point.
> 
> Okay, that's indeed an issue. Maybe we should tie the sgt allocation to
> the pages allocation, as I can't think of a case where we would
> allocate pages without needing the sg table that goes with it. And if
> there are driver that want the sgt to be lazily allocated, we can
> always add a drm_gem_shmem_object::lazy_sgt_alloc flag.
> 

Many thanks for the suggestion.

Will try to see how we can progress this work.

Best regards
Akash

> Regards,
> 
> Boris
> 
> [1]https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/imagination/pvr_gem.c#L363
> [2]https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L177
> [3]https://elixir.bootlin.com/linux/v6.11.6/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L185