[Intel-gfx] [PATCH v3 1/2] vfio: Replace the DMA unmapping notifier with a callback
Cornelia Huck
cohuck at redhat.com
Wed Jul 20 07:47:12 UTC 2022
On Tue, Jul 19 2022, Jason Gunthorpe <jgg at nvidia.com> wrote:
> On Thu, Jul 07, 2022 at 03:37:16PM -0600, Alex Williamson wrote:
>> On Mon, 4 Jul 2022 21:59:03 -0300
>> Jason Gunthorpe <jgg at nvidia.com> wrote:
>> > diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
>> > index b49e2e9db2dc6f..09e0ce7b72324c 100644
>> > --- a/drivers/s390/cio/vfio_ccw_ops.c
>> > +++ b/drivers/s390/cio/vfio_ccw_ops.c
>> > @@ -44,31 +44,19 @@ static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
>> >  	return ret;
>> >  }
>> >
>> > -static int vfio_ccw_mdev_notifier(struct notifier_block *nb,
>> > -				  unsigned long action,
>> > -				  void *data)
>> > +static void vfio_ccw_dma_unmap(struct vfio_device *vdev, u64 iova, u64 length)
>> >  {
>> >  	struct vfio_ccw_private *private =
>> > -		container_of(nb, struct vfio_ccw_private, nb);
>> > -
>> > -	/*
>> > -	 * Vendor drivers MUST unpin pages in response to an
>> > -	 * invalidation.
>> > -	 */
>> > -	if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
>> > -		struct vfio_iommu_type1_dma_unmap *unmap = data;
>> > -
>> > -		if (!cp_iova_pinned(&private->cp, unmap->iova))
>> > -			return NOTIFY_OK;
>> > +		container_of(vdev, struct vfio_ccw_private, vdev);
>> >
>> > -		if (vfio_ccw_mdev_reset(private))
>> > -			return NOTIFY_BAD;
>> > +	/* Drivers MUST unpin pages in response to an invalidation. */
>> > +	if (!cp_iova_pinned(&private->cp, iova))
>> > +		return;
>> >
>> > -		cp_free(&private->cp);
>> > -		return NOTIFY_OK;
>> > -	}
>> > +	if (vfio_ccw_mdev_reset(private))
>> > +		return;
>> >
>> > -	return NOTIFY_DONE;
>> > +	cp_free(&private->cp);
>> >  }
>>
>>
>> The cp_free() call is gone here with [1], so I think this function now
>> just ends with:
>>
>> 	...
>> 	vfio_ccw_mdev_reset(private);
>> }
>>
>> There are also minor contextual differences elsewhere from that series,
>> so a quick respin to record the changes on list would be appreciated.
>>
>> However the above kind of highlights that NOTIFY_BAD that silently gets
>> dropped here. I realize we weren't testing the return value of the
>> notifier call chain and really we didn't intend that notifiers could
>> return a failure here, but does this warrant some logging or suggest
>> future work to allow a device to go offline here? Thanks.
>
> It looks like no.
>
> If the FSM is trapped in a bad state here, such as
> VFIO_CCW_STATE_NOT_OPER, then it means it should have already unpinned
> the pages, and this is considered a success for this purpose.
A rather pathological case would be a subchannel that cannot be
quiesced and does not end up being non-operational; in theory, the
hardware could still try to access the buffers we provided for I/O. I'd
say that is extremely unlikely; we might log it, but we really cannot do
anything else.
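To make the "we might log it" option concrete, here is a minimal
userspace sketch of a void unmap callback that logs a failed reset
instead of silently dropping it. It is not kernel code: struct
mock_private, mock_iova_pinned(), mock_reset() and mock_dma_unmap() are
hypothetical stand-ins for vfio_ccw_private, cp_iova_pinned(),
vfio_ccw_mdev_reset() and vfio_ccw_dma_unmap(), and fprintf() stands in
for whatever kernel logging call would actually be used.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct vfio_ccw_private. */
struct mock_private {
	uint64_t pinned_iova;	/* start of the one pinned range */
	uint64_t pinned_len;	/* length of that range in bytes */
	bool has_pinned;	/* false once the pages are unpinned */
	int reset_result;	/* value the mock reset returns */
	int warnings;		/* number of failures logged so far */
};

/* Mock of cp_iova_pinned(): does iova fall inside the pinned range? */
static bool mock_iova_pinned(const struct mock_private *p, uint64_t iova)
{
	return p->has_pinned && iova >= p->pinned_iova &&
	       iova - p->pinned_iova < p->pinned_len;
}

/* Mock of vfio_ccw_mdev_reset(): a successful reset drops the pins. */
static int mock_reset(struct mock_private *p)
{
	if (p->reset_result == 0)
		p->has_pinned = false;
	return p->reset_result;
}

/* void callback: a reset failure cannot be returned, only logged. */
static void mock_dma_unmap(struct mock_private *p, uint64_t iova,
			   uint64_t length)
{
	int ret;

	(void)length;	/* unused, as in the quoted patch */

	if (!mock_iova_pinned(p, iova))
		return;

	ret = mock_reset(p);
	if (ret) {
		p->warnings++;
		fprintf(stderr, "dma_unmap: reset failed (%d) for iova 0x%llx\n",
			ret, (unsigned long long)iova);
	}
}
```

The callback still returns nothing, so the caller's behaviour is
unchanged; the only difference from silently ignoring the failure is
the message and the counter.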
>
> The return code here exists only to return to userspace so it can
> detect during a VFIO_DEVICE_RESET that the device has crashed
> irrecoverably.
Does it imply only that ("it's dead, Jim"), or can it also imply a
runaway device? Not that userspace can do much in any case.
>
> Thus just continuing to silently ignore it seems like the best thing.
>
> Jason