User ptr horror show

David Herrmann dh.herrmann at gmail.com
Mon Jun 30 12:25:10 PDT 2014


Hi

On Mon, Jun 30, 2014 at 9:04 PM, Jerome Glisse <j.glisse at gmail.com> wrote:
> On Mon, Jun 30, 2014 at 08:47:31PM +0200, David Herrmann wrote:
>> Additionally to what AIO and Direct-IO do, intel userptr adds the
>> range_start callback to release pinned pages whenever the pages are
>> unmapped. However, anyone who truncates inode pages, schedules
>> writeback, etc., has to lock the page. Thus, any following GUP-fast
>> from userptr will fail and the slowpath will wait on mmap_sem. So I'd
>> really prefer if you could elaborate on your race?
>
> Some writback code path (and other cpu page table modificiation) will not
> call range_start but only invalidate_page. More over once the range_start
> is call a GUP that is done before range_end is call will return what ever
> it sees inside the cpu page table at the time which might be new pages or
> old pages.

range_start/end are usually called by unmap_*() functions, which
normally are called under page-lock. Therefore, _no_ new PTEs can be
established as long as the page is locked. This means, once range_end
is called, you're guaranteed any racing GUP-fast will fail and any
racing GUP will wait on the page-lock.
However, I wonder why i915 uses range_start instead of range_end. The
PTEs are still in place when range_start is called, therefore a racing
GUP-fast will still succeed.

Regarding "invalidate_page": This is called _after_ the PTE has been
removed (see rmap and try_unmap_page()), also with the page-locked.
Same scenario as range_end.
However, I also wonder why i915 doesn't have a callback for that? They
definitely need to, afaics.

Writeback code races with parallel GUP writes and keeps pages in
place. They rely on GUP-writers to call SetDirty() once they're done.
I've never liked that, but I don't see any code protecting writeback
against GUP writes, do you?

> Thus you can imagine i915 trying to use an userptr object right after a
> range_start but before a range_end, the i915 will read the page table (GUP
> is not serializing anything here) and will assume that whatever it got from
> there is current while it might just be soon to be discarded/replaced pages.
> Hence you can not garanty that pages you use are the same backing the user
> space range. Note that the mmap_sem does not protect page migration or thing
> like that. It only protect vma modifications.
>
> As i said for the gpu only accessing those buffer in read mode is fine but
> i am sure userspace will start relying on the gpu writting to those page
> and being able to read back what the gpu wrote through the user space
> mapping. This will work often but this can only work because you are lucky
> and there is no single way to make it work reliably.

i915 userptr is a 3.16 feature, right? So we can still revert it if
it's really broken.

However, I'd still like to see an example _call-trace_ with a *real
race*. I cannot see any writeback code that replaces pages despite
elevated page-refs. page-migration is quite strict about page-refs and
fails in those cases. truncate() is the only place I can see, but this
should be fixed by using range_end() instead of range_start(), right?

Thanks
David


More information about the dri-devel mailing list