[Intel-gfx] [PATCH v3 1/2] vfio: Replace the DMA unmapping notifier with a callback

Cornelia Huck cohuck at redhat.com
Wed Jul 20 07:47:12 UTC 2022


On Tue, Jul 19 2022, Jason Gunthorpe <jgg at nvidia.com> wrote:

> On Thu, Jul 07, 2022 at 03:37:16PM -0600, Alex Williamson wrote:
>> On Mon,  4 Jul 2022 21:59:03 -0300
>> Jason Gunthorpe <jgg at nvidia.com> wrote:
>> > diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
>> > index b49e2e9db2dc6f..09e0ce7b72324c 100644
>> > --- a/drivers/s390/cio/vfio_ccw_ops.c
>> > +++ b/drivers/s390/cio/vfio_ccw_ops.c
>> > @@ -44,31 +44,19 @@ static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
>> >  	return ret;
>> >  }
>> >  
>> > -static int vfio_ccw_mdev_notifier(struct notifier_block *nb,
>> > -				  unsigned long action,
>> > -				  void *data)
>> > +static void vfio_ccw_dma_unmap(struct vfio_device *vdev, u64 iova, u64 length)
>> >  {
>> >  	struct vfio_ccw_private *private =
>> > -		container_of(nb, struct vfio_ccw_private, nb);
>> > -
>> > -	/*
>> > -	 * Vendor drivers MUST unpin pages in response to an
>> > -	 * invalidation.
>> > -	 */
>> > -	if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
>> > -		struct vfio_iommu_type1_dma_unmap *unmap = data;
>> > -
>> > -		if (!cp_iova_pinned(&private->cp, unmap->iova))
>> > -			return NOTIFY_OK;
>> > +		container_of(vdev, struct vfio_ccw_private, vdev);
>> >  
>> > -		if (vfio_ccw_mdev_reset(private))
>> > -			return NOTIFY_BAD;
>> > +	/* Drivers MUST unpin pages in response to an invalidation. */
>> > +	if (!cp_iova_pinned(&private->cp, iova))
>> > +		return;
>> >  
>> > -		cp_free(&private->cp);
>> > -		return NOTIFY_OK;
>> > -	}
>> > +	if (vfio_ccw_mdev_reset(private))
>> > +		return;
>> >  
>> > -	return NOTIFY_DONE;
>> > +	cp_free(&private->cp);
>> >  }
>> 
>> 
>> The cp_free() call is gone here with [1], so I think this function now
>> just ends with:
>> 
>> 	...
>> 	vfio_ccw_mdev_reset(private);
>> }
>> 
>> There are also minor contextual differences elsewhere from that series,
>> so a quick respin to record the changes on list would be appreciated.
>> 
>> However, the above highlights the NOTIFY_BAD that silently gets
>> dropped here.  I realize we weren't testing the return value of the
>> notifier call chain, and we never really intended that notifiers could
>> return a failure here, but does this warrant some logging or suggest
>> future work to allow a device to go offline here?  Thanks.
>
> It looks like no.
>
> If the FSM is trapped in a bad state here, such as
> VFIO_CCW_STATE_NOT_OPER, then it should have already unpinned the
> pages, and this is considered a success for this purpose.

A rather pathological case would be a subchannel that cannot be
quiesced and does not end up being non-operational; in theory, the
hardware could still try to access the buffers we provided for I/O. I'd
say that is extremely unlikely; we might log it, but we really cannot do
anything else.
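
To make the "we might log it" option concrete, here is a minimal userspace
sketch of a void unmap callback that reports a failed reset instead of
dropping it silently. It is an illustration only, not a proposed patch: the
struct layout, the mock_* helpers, the warnings counter and the fprintf()
call are all hypothetical stand-ins (in the driver the check would be
cp_iova_pinned(), the reset would be vfio_ccw_mdev_reset(), and the message
would presumably go through something like dev_warn()).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the real kernel definitions. */
struct vfio_device {
	int unused;
};

struct vfio_ccw_private {
	struct vfio_device vdev;
	uint64_t pinned_iova;	/* start of the mock pinned range */
	uint64_t pinned_len;	/* length of the mock pinned range */
	int reset_result;	/* value the mock reset returns */
	int warnings;		/* number of failures reported so far */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock of cp_iova_pinned(): is iova inside the pinned range? */
static int mock_iova_pinned(const struct vfio_ccw_private *private,
			    uint64_t iova)
{
	return iova >= private->pinned_iova &&
	       iova - private->pinned_iova < private->pinned_len;
}

/* Mock of vfio_ccw_mdev_reset(): 0 on success, negative on failure. */
static int mock_reset(struct vfio_ccw_private *private)
{
	return private->reset_result;
}

/* Void callback: a reset failure cannot be returned, only reported. */
static void sketch_dma_unmap(struct vfio_device *vdev, uint64_t iova,
			     uint64_t length)
{
	struct vfio_ccw_private *private =
		container_of(vdev, struct vfio_ccw_private, vdev);
	int ret;

	(void)length;	/* like the quoted callback, only iova is checked */

	if (!mock_iova_pinned(private, iova))
		return;

	ret = mock_reset(private);
	if (ret) {
		/* Nothing else can be done here; just leave a trace. */
		fprintf(stderr,
			"dma_unmap: reset failed (%d) for iova %#llx\n",
			ret, (unsigned long long)iova);
		private->warnings++;
	}
}
```

The callback still returns nothing to the caller; the only behavioural
difference from silently ignoring the failure is the log line.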

>
> The return code here exists only to return to userspace so it can
> detect during a VFIO_DEVICE_RESET that the device has crashed
> irrecoverably.

Does it imply only that ("it's dead, Jim"), or can it also imply a
runaway device? Not that userspace can do much in any case.

>
> Thus just continuing to silently ignore it seems like the best thing.
>
> Jason


