[Mesa-dev] [PATCH] vc4: Mark BOs as purgeable when they enter the BO cache

Wed Sep 27 20:03:15 UTC 2017

On Wed, 27 Sep 2017 12:41:52 -0700
Eric Anholt <eric at anholt.net> wrote:

> Boris Brezillon <boris.brezillon at free-electrons.com> writes:
> 
> > On Wed, 27 Sep 2017 10:15:23 -0700
> > Eric Anholt <eric at anholt.net> wrote:
> >  
> >> Boris Brezillon <boris.brezillon at free-electrons.com> writes:
> >>   
> >> > On Wed, 27 Sep 2017 15:24:16 +0100
> >> > Chris Wilson <chris at chris-wilson.co.uk> wrote:
> >> >    
> >> >> Quoting Boris Brezillon (2017-09-27 15:06:53)    
> >> >> > On Wed, 27 Sep 2017 14:50:10 +0100
> >> >> > Chris Wilson <chris at chris-wilson.co.uk> wrote:
> >> >> >       
> >> >> > > Quoting Boris Brezillon (2017-09-27 14:45:17)      
> >> >> > > >  static struct vc4_bo *
> >> >> > > >  vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
> >> >> > > >  {
> >> >> > > > @@ -111,6 +121,11 @@ vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
> >> >> > > >                          return NULL;
> >> >> > > >                  }
> >> >> > > >  
> >> >> > > > +                if (vc4_bo_purgeable(bo, false)) {
> >> >> > > > +                        mtx_unlock(&cache->lock);
> >> >> > > > +                        return NULL;        
> >> >> > > 
> >> >> > > So this would just mean that the bo was purged in the meantime. Why not
> >> >> > > just try to use the next one in the cache or allocate afresh?      
> >> >> > 
> >> >> > No, this means the BO was purged and the kernel failed to allocate the
> >> >> > memory back. We don't care about the retained status here, because we
> >> >> > don't need to restore BO's content, that's why we're not checking
> >> >> > arg.retained in vc4_bo_purgeable(). Allocating a fresh BO is likely to
> >> >> > fail with the same ENOMEM error because both path use the CMA mem.      
> >> >> 
> >> >> Hmm, you don't treat purging as permanent. But you do track the lose of
> >> >> contents, so retained is false?    
> >> >
> >> > vc4_bo_purgeable() is not reporting the retained status, it just
> >> > reports whether the BO can be used or not. I can change
> >> > vc4_bo_purgeable() semantic to return 1 if the BO content was retained,
> >> > 0 if it was purged and -1 if you the ioctl returned an error (ENOMEM)
> >> > if you prefer, but in the end, all I'll check here is
> >> > 'vc4_bo_purgeable() >= 0' because I don't don't care about the retained
> >> > status in this specific use case, all I care about is whether the BO can
> >> > be re-used or not (IOW, is there a valid CMA region attached to the BO).
> >> >    
> >> >> 
> >> >> I took a harder line, and said that userspace should recreate the object
> >> >> from scratch after it was purged. I thought that would be easier
> >> >> overall. But maybe not.:)    
> >> >
> >> > Well, maybe I'm wrong in how I implemented this
> >> > DRM_IOCTL_VC4_GEM_MADVISE ioctl, but right now, when the BO has been
> >> > purged and someone marks it back as unpurgeable I'm trying to
> >> > re-allocate BO's buffer in the ioctl path, and if the CMA allocation
> >> > fails I return -ENOMEM. I could move the allocation in the fault
> >> > handler, but this would result in pretty much the same behavior except
> >> > it would require an extra page-fault to realize the memory is not
> >> > available or force us to check the retained status and decide to
> >> > release the BO object from the BO cache.    
> >> 
> >> Hmm.  The downside I see to this plan is if we eventually decide to have
> >> the purge operation not clear all the BOs, then we would probably rather
> >> have userspace freeing objects that had been purged until it finds one
> >> in the cache that hadn't been purged, rather than forcing reallocation
> >> of this BO now (and possibly then purging something from elsewhere in
> >> the cache).  
> >
> > Okay, that's a good reason to move dma_alloc_wc() in the page-fault
> > path. I need to change a bit the implementation to check cma_gem->vaddr
> > value instead of checking bo->madv != __VC4_MADV_PURGED, otherwise we
> > might pass a non-allocated BO to the GPU/Display-Engine.  
> 
> Huh, allocation in the page-fault path?  We would need the storage to be
> definitely be available at the point that we've set it back to WILLNEED.
> Otherwise I'll "allocate" the BO from the cache, go to fill it through
> my mapping, and sigbus when CMA says we're out of memory.

Yep, I find that weird too, but that's unfortunately the only way we can
achieve what you want to do.

The only solution to know the ->retained status is by asking the the DRM
driver to put the BO in WILLNEED or DONTNEED state. If you send ->madv
= DONTNEED, and the kernel returns ->retained = true, this ->retained
state may not be valid anymore when you get back to the application,
because someone else may have triggered a purge. If you send ->madv =
WILLNEED then the ->retained state is guaranteed to be valid until you
explicitly switch back to DONTNEED, but that also means the driver has
already allocated the memory if ->retained is false, so it's already
too late to do what you were suggesting (evict the BO from the
userspace cache to avoid purging other purgeable BOs).