[PATCH 2/2] drm/xe: Pick correct userptr VMA to repin on REMAP op failure

Matthew Brost matthew.brost at intel.com
Thu Feb 1 19:26:18 UTC 2024


On Thu, Feb 01, 2024 at 08:18:52PM +0100, Maarten Lankhorst wrote:
> 
> 
> On 2024-02-01 01:48, Matthew Brost wrote:
> > A REMAP op is composed of 3 VMA's - unmap, prev map, and next map. When
> > op_execute fails with -EAGAIN we need to update the local VMA pointer to
> > the current op state and then repin the VMA if it is a userptr.
> > 
> > Fixes a failure seen in xe_vm.munmap-style-unbind-userptr-one-partial.
> > 
> > Fixes: b06d47be7c83 ("drm/xe: Port Xe to GPUVA")
> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_vm.c | 22 +++++++++++++++++-----
> >   1 file changed, 17 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index e55161136490..2ab863fe7d0a 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -2506,13 +2506,25 @@ static int __xe_vma_op_execute(struct xe_vm *vm, struct xe_vma *vma,
> >   	}
> >   	drm_exec_fini(&exec);
> > -	if (err == -EAGAIN && xe_vma_is_userptr(vma)) {
> > +	if (err == -EAGAIN) {
> >   		lockdep_assert_held_write(&vm->lock);
> > -		err = xe_vma_userptr_pin_pages(vma);
> > -		if (!err)
> > -			goto retry_userptr;
> > -		trace_xe_vma_fail(vma);
> > +		if (op->base.op == DRM_GPUVA_OP_REMAP) {
> > +			if (!op->remap.unmap_done)
> > +				vma = gpuva_to_vma(op->base.remap.unmap->va);
> > +			else if (op->remap.prev)
> > +				vma = op->remap.prev;
> > +			else
> > +				vma = op->remap.next;
> > +		}
> I see this same vma picking in handling of DRM_GPUVA_OP_REMAP.
> 
> Could the switch in xe_vma_op_execute() be moved to a separate pick_vma
> function instead, called from this place too?
> 
> It might make the code slightly more readable.
> 

I would agree if this code wasn't going to get rewritten shortly in [1]. We
are transitioning to 1 job per VM bind IOCTL in [1]. I am currently
rebasing that code and found a few bugs in the current code. I want to
stabilize the code quickly so I can reliably test my larger changes.

Would it help if I added a comment here saying this code is temporary?
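
For reference, a helper along the lines suggested above could look roughly
like the sketch below. This is only an illustration: the sketch_* types and
sketch_pick_vma() are hypothetical stand-ins, not the real xe structures or
any function in xe_vm.c. The selection order simply mirrors the hunk quoted
above (unmap first, then prev, then next).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for illustration only; the real xe structures differ. */
struct sketch_vma {
	int id;
};

enum sketch_op_type {
	SKETCH_OP_MAP,
	SKETCH_OP_REMAP,
};

struct sketch_vma_op {
	enum sketch_op_type type;
	struct sketch_vma *unmap;	/* VMA being unmapped by the REMAP */
	struct sketch_vma *prev;	/* new mapping before the hole, may be NULL */
	struct sketch_vma *next;	/* new mapping after the hole, may be NULL */
	bool unmap_done;		/* true once the unmap step has completed */
};

/*
 * Return the VMA the op is currently working on. For a REMAP this is the
 * unmap VMA until the unmap step is done, then prev if present, else next.
 * For any other op type the caller's VMA is returned unchanged.
 */
static struct sketch_vma *sketch_pick_vma(const struct sketch_vma_op *op,
					  struct sketch_vma *vma)
{
	assert(op != NULL);

	if (op->type != SKETCH_OP_REMAP)
		return vma;
	if (!op->unmap_done)
		return op->unmap;
	if (op->prev)
		return op->prev;
	return op->next;
}
```

The -EAGAIN path would then call the helper once and repin the result only if
it is a userptr, which keeps the selection logic in a single place.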

Matt

[1] https://patchwork.freedesktop.org/series/125608/ 

> Cheers,
> ~Maarten

