[Intel-gfx] [Linaro-mm-sig] Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

Christian König ckoenig.leichtzumerken at gmail.com
Wed Nov 23 09:39:19 UTC 2022


Am 23.11.22 um 10:30 schrieb Daniel Vetter:
> On Wed, 23 Nov 2022 at 10:06, Christian König <christian.koenig at amd.com> wrote:
>> Am 22.11.22 um 20:50 schrieb Daniel Vetter:
>>> On Tue, 22 Nov 2022 at 20:34, Jason Gunthorpe <jgg at ziepe.ca> wrote:
>>>> On Tue, Nov 22, 2022 at 08:29:05PM +0100, Daniel Vetter wrote:
>>>>> You nuke all the ptes. Drivers that move have slightly more than a
>>>>> bare struct file, they also have a struct address_space so that
>>>>> invalidate_mapping_range() works.
>>>> Okay, this is one of the ways that this can be made to work correctly,
>>>> as long as you never allow GUP/GUP_fast to succeed on the PTEs. (this
>>>> was the DAX mistake)
>>> Hence this patch, to enforce that no dma-buf exporter gets this wrong.
>>> Which some did, and then blamed bug reporters for the resulting splats
>>> :-) One of the things we've reverted was the ttm huge pte support,
>>> since that doesn't have the pmd_special flag (yet) and so would let
>>> gup_fast through.
>> The problem is not only gup, a lot of people seem to assume that when
>> you are able to grab a reference to a page that the ptes pointing to
>> that page can't change any more. And that's obviously incorrect.
>>
>> I witnessed tons of discussions about that already. Some customers even
>> modified our code assuming that and then wondered why the heck they ran
>> into data corruption.
>>
>> It's gotten so bad that I've even proposed intentionally mangling the
>> page reference count on TTM allocated pages:
>> https://patchwork.kernel.org/project/dri-devel/patch/20220927143529.135689-1-christian.koenig@amd.com/
> Yeah maybe something like this could be applied after we land this
> patch here.

I think both should land ASAP. We don't have any other way than to 
clearly point out incorrect approaches if we want to prevent the 
resulting data corruption.

> Well maybe should have the same check in gem mmap code to
> make sure no driver

Reads like the sentence is somehow cut of?

>
>> I think it would be better that instead of having special flags in the
>> ptes and vmas that you can't follow them to a page structure we would
>> add something to the page indicating that you can't grab a reference to
>> it. But this might break some use cases as well.
> Afaik the problem with that is that there's no free page bits left for
> these debug checks. Plus the pte+vma flags are the flags to make this
> clear already.

Maybe a GFP flag to set the page reference count to zero or something 
like this?

Christian.

> -Daniel



More information about the Intel-gfx mailing list