[Intel-gfx] [CI, v3, 1/2] drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set

Chris Wilson chris at chris-wilson.co.uk
Thu Feb 28 22:18:32 UTC 2019


Quoting Guenter Roeck (2019-02-28 22:11:51)
> On Thu, Feb 28, 2019 at 10:01:45PM +0000, Chris Wilson wrote:
> > Quoting Guenter Roeck (2019-02-28 21:57:03)
> > > On Thu, Feb 28, 2019 at 01:32:41PM -0800, Guenter Roeck wrote:
> > > > On Thu, Feb 28, 2019 at 11:12:49AM -0800, Guenter Roeck wrote:
> > > > > Hi,
> > > > > 
> > > > > On Thu, Feb 07, 2019 at 10:54:53AM +0200, Joonas Lahtinen wrote:
> > > > > > Make sure the underlying VMA in the process address space is the
> > > > > > same as it was during vm_mmap to avoid applying WC to wrong VMA.
> > > > > > 
> > > > > > A more long-term solution would be to have vm_mmap_locked variant
> > > > > > in linux/mmap.h for when caller wants to hold mmap_sem for an
> > > > > > extended duration.
> > > > > > 
> > > > > 
> > > > > It seems like we may have a regression due to this patch. I am still
> > > > > debugging, but I have a question; please see below.
> > > > > 
> > > > > Thanks,
> > > > > Guenter
> > > > > 
> > > > > > v2:
> > > > > > - Refactor the compare function
> > > > > > 
> > > > > > Fixes: 1816f9236303 ("drm/i915: Support creation of unbound wc user mappings for objects")
> > > > > > Reported-by: Adam Zabrocki <adamza at microsoft.com>
> > > > > > Suggested-by: Linus Torvalds <torvalds at linux-foundation.org>
> > > > > > Signed-off-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > > > > > Cc: <stable at vger.kernel.org> # v4.0+
> > > > > > Cc: Akash Goel <akash.goel at intel.com>
> > > > > > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> > > > > > Cc: Adam Zabrocki <adamza at microsoft.com>
> > > > > > Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > > > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com> #v1
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/i915_gem.c | 12 +++++++++++-
> > > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > > > index 05ce9176ac4e..52639f749908 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > > > @@ -1681,6 +1681,16 @@ i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
> > > > > >   return 0;
> > > > > >  }
> > > > > >  
> > > > > > +static inline bool
> > > > > > +__vma_matches(struct vm_area_struct *vma, struct file *filp,
> > > > > > +       unsigned long addr, unsigned long size)
> > > > > > +{
> > > > > > + if (vma->vm_file != filp)
> > > > > > +         return false;
> > > > > > +
> > > > > > + return vma->vm_start == addr && (vma->vm_end - vma->vm_start) == size;
> > > > > 
> > > > > Shouldn't this be:
> > > > >     return vma->vm_start == addr && (vma->vm_end - vma->vm_start + 1) == size;
> > > > > instead ?
> > > > > 
> > > > 
> > > > Answer is no .. because vm_end points to the first byte after the
> > > > end address.
> > > > 
> > > > The actual values are:
> > > > 
> > > > start=7d288f7f9000 end=7d288f84d000 end-start=54000 size=53400
> > > > 
> > > > meaning the size field passed in the ioctl is smaller than the total length
> > > > of the area.
> > > > 
> > > > Question is now: Is the request/ioctl indeed invalid, ie does the requested
> > > > size have to match the vma size ? This used to work until this patch was
> > > > applied, and the change causes our test code to fail (and possibly minigbm,
> > > > which is used by the test code). That doesn't mean that our code is correct
> > > > (I see some related local changes in our version of minigbm), but it is
> > > > annoying, and I am being asked to revert this patch as regression
> > > > from our kernel releases.
> > > > 
> > > 
> > > In i915_gem_create():
> > > 
> > >         size = roundup(size, PAGE_SIZE);
> > >         if (size == 0)
> > >                 return -EINVAL;
> > > 
> > > This suggests to me that the requested size can be smaller than the
> > 
> > Not really, the ABI has never handled less than page-sized requests.
> > It's a mistake from the very beginning that it was not rejected as being
> > the invalid size it was.
> > 
> > > allocated size, which in turn suggests that the check
> > >         (vma->vm_end - vma->vm_start) == size;
> > > is wrong. Either it should be
> > >         (vma->vm_end - vma->vm_start) >= size;
> > > or possibly
> > >         (vma->vm_end - vma->vm_start) == roundup(size, PAGE_SIZE);
> > > 
> > > Any comments/feedback/thoughts ?
> > 
> > It's a violation of mmap(2).
> > 
> > Is probably what we will have to do if you ring the regression bell loud
> > enough, and do not see the folly of your ways. :-p
> 
> I won't ring any bells; I don't play such games. I'll make a local change
> in our kernel to fix the problem, quoting your statement that less than
> page-sized requests were never supposed to be supported, and add a note
> that we'll have to handle this with a local patch going forward.

If you have userspace that is broken, we need to fix it. We can get away
with quietly changing ABI only so long as nobody notices. It sounds like
you have some userspace that will break if you updated the kernel; ergo
we have a problem.

We just need that as a clear statement so that we have the user impact
recorded.
-Chris


More information about the Intel-gfx mailing list