[Intel-gfx] [PATCH 03/15] dma-buf & drm/amdgpu: remove dma_resv workaround

Zack Rusin zackr at vmware.com
Wed Apr 20 18:41:32 UTC 2022


On Wed, 2022-04-20 at 19:40 +0200, Christian König wrote:
> 
> On 20.04.22 at 19:38, Zack Rusin wrote:
> > On Wed, 2022-04-20 at 09:37 +0200, Christian König wrote:
> > > Hi Zack,
> > > 
> > > On 20.04.22 at 05:56, Zack Rusin wrote:
> > > > On Thu, 2022-04-07 at 10:59 +0200, Christian König wrote:
> > > > > Rework the internals of the dma_resv object to allow adding more
> > > > > than one write fence and remember for each fence what purpose it
> > > > > had.
> > > > > 
> > > > > This allows removing the workaround from amdgpu which used a
> > > > > container for this instead.
> > > > > 
> > > > > Signed-off-by: Christian König <christian.koenig at amd.com>
> > > > > Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > > > Cc: amd-gfx at lists.freedesktop.org
> > > > afaict this change broke vmwgfx, which now oopses right after
> > > > boot. I haven't had the time to look into it yet, so I'm not sure
> > > > what the problem is. I'll look at this tomorrow, but just in case
> > > > you have some clues, the backtrace follows:
> > > that's a known issue and should already be fixed with:
> > > 
> > > commit d72dcbe9fce505228dae43bef9da8f2b707d1b3d
> > > Author: Christian König <christian.koenig at amd.com>
> > > Date:   Mon Apr 11 15:21:59 2022 +0200
> > Unfortunately that doesn't seem to be it. The backtrace is from the
> > current (as of the time of sending this email) drm-misc-next, which
> > has this change, so it's something else.
> 
> Ok, that's strange. In this case I need to investigate further.
> 
> Maybe VMWGFX is adding more than one fence and we actually need to
> reserve multiple slots.

This might be an issue in the helper code when CONFIG_DEBUG_MUTEXES is set. With
that config, dma_resv_reset_max_fences does:
   fences->max_fences = fences->num_fences;
For some objects num_fences is 0, so afterwards max_fences and num_fences are both 0,
and then BUG_ON(num_fences >= max_fences) is triggered.
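
To make the sequence concrete, here is a rough user-space model of the two
counters. This is only a sketch, not the kernel code: the struct and all
model_* names are made up for illustration, and the reserve step is an
assumption about what a caller would need to do before adding a fence.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fence list; only the two counters matter. */
struct resv_list_model {
	unsigned int num_fences;	/* slots currently in use */
	unsigned int max_fences;	/* slots currently reserved */
};

/* Models the debug-only reset: shrink the reservation to what is in use. */
static void model_reset_max_fences(struct resv_list_model *l)
{
	l->max_fences = l->num_fences;
}

/* Models reserving room for n more fences (assumed caller responsibility). */
static void model_reserve_fences(struct resv_list_model *l, unsigned int n)
{
	if (l->num_fences + n > l->max_fences)
		l->max_fences = l->num_fences + n;
}

/* True when the condition checked by BUG_ON(num_fences >= max_fences) holds. */
static bool model_add_would_bug(const struct resv_list_model *l)
{
	return l->num_fences >= l->max_fences;
}

/* Adds one fence if a slot is free; returns false where the BUG_ON would fire. */
static bool model_add_fence(struct resv_list_model *l)
{
	if (model_add_would_bug(l))
		return false;
	l->num_fences++;
	/* Invariant of the model: never more fences than reserved slots. */
	assert(l->num_fences <= l->max_fences);
	return true;
}
```

Starting from an object with num_fences == 0, calling model_reset_max_fences()
leaves both counters at 0, so model_add_would_bug() is true until
model_reserve_fences() is called again. That matches the oops if some path adds
a fence after the reset without reserving a slot first.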

z


