[Intel-gfx] [PATCH v2 0/9] Move vfio_ccw to the new mdev API

Cornelia Huck cohuck at redhat.com
Fri Sep 17 11:59:16 UTC 2021


On Tue, Sep 14 2021, Jason Gunthorpe <jgg at nvidia.com> wrote:

> On Mon, Sep 13, 2021 at 04:31:54PM -0400, Eric Farman wrote:
>> > I rebased it and fixed it up here:
>> > 
>> > https://github.com/jgunthorpe/linux/tree/vfio_ccw
>> > 
>> > Can you try again?
>> 
>> That does address the crash, but then why is it processing a BROKEN
>> event? Seems problematic. 
>
> The stuff related to the NOT_OPER looked really wonky to me. I'm
> guessing this is the issue - not sure about the pmcw.ena either..

[I have still not been able to digest the whole series, sorry.]

>
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index 5ea392959c0711..0d4d4f425befac 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -380,29 +380,19 @@ static void fsm_open(struct vfio_ccw_private *private,
>  	spin_unlock_irq(sch->lock);
>  }
>  
> -static void fsm_close(struct vfio_ccw_private *private,
> -		      enum vfio_ccw_event event)
> +static int flush_sch(struct vfio_ccw_private *private)
>  {
>  	struct subchannel *sch = private->sch;
>  	DECLARE_COMPLETION_ONSTACK(completion);
>  	int iretry, ret = 0;
>  
> -	spin_lock_irq(sch->lock);
> -	if (!sch->schib.pmcw.ena)
> -		goto err_unlock;
> -	ret = cio_disable_subchannel(sch);
> -	if (ret != -EBUSY)
> -		goto err_unlock;
> -
>  	iretry = 255;
>  	do {
> -
>  		ret = cio_cancel_halt_clear(sch, &iretry);
> -
>  		if (ret == -EIO) {
>  			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
>  			       sch->schid.ssid, sch->schid.sch_no);
> -			break;
> +			return ret;

Looking at this, I wonder why we had special-cased -EIO -- for -ENODEV
we should be done as well, as then the device is dead and we do not need
to disable it.

>  		}
>  
>  		/*
> @@ -413,13 +403,28 @@ static void fsm_close(struct vfio_ccw_private *private,
>  		spin_unlock_irq(sch->lock);
>  
>  		if (ret == -EBUSY)
> -			wait_for_completion_timeout(&completion, 3*HZ);
> +			wait_for_completion_timeout(&completion, 3 * HZ);
>  
>  		private->completion = NULL;
>  		flush_workqueue(vfio_ccw_work_q);
>  		spin_lock_irq(sch->lock);
>  		ret = cio_disable_subchannel(sch);
>  	} while (ret == -EBUSY);
> +	return ret;
> +}
> +
> +static void fsm_close(struct vfio_ccw_private *private,
> +		      enum vfio_ccw_event event)
> +{
> +	struct subchannel *sch = private->sch;
> +	int ret;
> +
> +	spin_lock_irq(sch->lock);
> +	if (!sch->schib.pmcw.ena)
> +		goto err_unlock;
> +	ret = cio_disable_subchannel(sch);

cio_disable_subchannel() should be happy to disable an already disabled
subchannel, so I guess we can just walk through this and end up in
CLOSED state... unless entering with !ena actually indicates that we
messed up somewhere else in the state machine. I still need to find time
to read the patches.

> +	if (ret == -EBUSY)
> +		ret = flush_sch(private);
>  	if (ret)
>  		goto err_unlock;
>  	private->state = VFIO_CCW_STATE_CLOSED;



More information about the Intel-gfx mailing list