[PATCH] drm/i915/gvt: fix a bug of partially write ggtt enties

Mon Jun 11 02:31:37 UTC 2018

On 2018.06.11 09:28:51 +0800, Zhao Yan wrote:
> hi zhenyu
> Please allow me to clarify the bug in detail.
> The scenario goes like this,
> 1. guest launches a qword (8 bytes) write for changing a ggtt entry.
> 2. qemu splits the qword into 2 dwords (2x4 bytes)
> 3. kernel vfio driver/gvt traps 2 x 4 bytes writes. Because gvt translates
> guest ggtt entry to host ggtt entry every 4 bytes write, some wrong
> combination may be generated in host ggtt.
> In this scenario, it is not the guest driver but qemu that splits the
> single qword into 2 dwords. so it ensures the consecutiveness.
> 
> In guest driver, it has to write 8 bytes or 2 consecutive 4 bytes at a
> time, because of the peculiarity of ggtt entry, which stores guest memory
> addresses and it has to be 8 bytes long for 64-bit systems.
> So, it's not the guest driver's freedom not to write 8 bytes consecutively.
> 
> Think about 2 possible conditions:
> a. guest only writes lower or higher 4 bytes for a ggtt entry, and not
> write the other half any more.
> b. guest writes 4 bytes for a ggtt entry, then write lower/higher 4
> bytes for another ggtt entry, then return to write the left 4 bytes for the
> first ggtt entry.
> 
> The present bit solution, as you suggested, does cover condition b. (but I
> don't think any sane driver will write code like this).
> The present bit solution, however, cannot cover condition a, because it would
> never know the second half would never come. So, it would not write the
> first half to host ggtt, but the guest thought it had written successfully.
> Then some odd bugs may happen, and they are hard to be found out due to the
> lack of error message.
>
> But in my current solution, an error message will be printed out for both
> case a and b, so the two erratic conditions can be spotted out as soon as
> possible, also it saves memory space for about 256k.
> 

More bits could be used to track either half of entry has been updated with
present bit to behave sane to hw, which means never issue illegal addr to hw.
Error message could be provided with problem I think.

> 
> Do you think it's alright ?
> 

If we only want to fix 2x4b consecutive write issue, this could
work. If we want to guard against any evil client to have sane
behavior then more change is required.

> 
> Fri, Jun 08, 2018 at 04:19:20PM +0800, Zhenyu Wang wrote:
> > On 2018.06.08 16:09:48 +0800, Zhao Yan wrote:
> > > when guest writes ggtt entries, it could write 8 bytes a time if
> > > gtt_entry_size is 8. But, when vfio traps the writes, it could be split
> > > into 2 consecutive 4-byte writes.
> > > 
> > > If each 4-byte partial write could trigger a host ggtt write, it is very
> > > possible that a wrong combination is written to the host ggtt. E.g.
> > > the higher 4 bytes is the old value, but the lower 4 bytes is the new
> > > value, and this 8-byte combination is wrong but written to the ggtt, thus
> > > causing bugs.
> > > To handle this condition, we just record the first 4-byte write, then wait
> > > until the second 4-byte write comes and write the combined 64-bit data to
> > > host ggtt table.
> > > To save memory space, we don't keep this information for every ggtt index,
> > > we just record the last ggtt write position, and assume the two 4-byte
> > > writes come in consecutively for each vgpu.
> > > This assumption is right based on the fact that if gtt_entry_size is 8,
> > > the guest memory physical address should be 64 bits, so when writing an
> > > 8-byte address into guest ggtt, 2 consecutive 4-byte writes at the same
> > > ggtt index should be trapped.
> > 
> > Instead of assuming consecutive write behavior, could we track entry status
> > by 'present' bit? If one partial write is found, we just clear 'present' bit
> > for that entry, and only when full entry is updated, we will set 'present'.
> > We'll initialize ggtt entry for scratch, so those entries without 'present'
> > set are thought as partial updated.
> > 
> > > 
> > > Signed-off-by: Zhao Yan <yan.y.zhao at intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/gvt/gtt.c | 34 ++++++++++++++++++++++++++++++++++
> > >  drivers/gpu/drm/i915/gvt/gtt.h |  2 ++
> > >  2 files changed, 36 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
> > > index 78e55aa..8be8c38 100644
> > > --- a/drivers/gpu/drm/i915/gvt/gtt.c
> > > +++ b/drivers/gpu/drm/i915/gvt/gtt.c
> > > @@ -1591,6 +1591,7 @@ static struct intel_vgpu_mm *intel_vgpu_create_ggtt_mm(struct intel_vgpu *vgpu)
> > >  		vgpu_free_mm(mm);
> > >  		return ERR_PTR(-ENOMEM);
> > >  	}
> > > +	mm->ggtt_mm.last_partial_off = -1UL;
> > >  
> > >  	return mm;
> > >  }
> > > @@ -1615,6 +1616,7 @@ void _intel_vgpu_mm_release(struct kref *mm_ref)
> > >  		invalidate_ppgtt_mm(mm);
> > >  	} else {
> > >  		vfree(mm->ggtt_mm.virtual_ggtt);
> > > +		mm->ggtt_mm.last_partial_off = -1UL;
> > >  	}
> > >  
> > >  	vgpu_free_mm(mm);
> > > @@ -1867,6 +1869,38 @@ static int emulate_ggtt_mmio_write(struct intel_vgpu *vgpu, unsigned int off,
> > >  	memcpy((void *)&e.val64 + (off & (info->gtt_entry_size - 1)), p_data,
> > >  			bytes);
> > >  
> > > +	/* If ggtt entry size is 8 bytes, and it's split into two 4 bytes
> > > +	 * write, we assume the two 4 bytes writes are consecutive.
> > > +	 * Otherwise, we abort and report error
> > > +	 */
> > > +	if (bytes < info->gtt_entry_size) {
> > > +		if (ggtt_mm->ggtt_mm.last_partial_off == -1UL) {
> > > +			/* the first partial part*/
> > > +			ggtt_mm->ggtt_mm.last_partial_off = off;
> > > +			ggtt_mm->ggtt_mm.last_partial_data = e.val64;
> > > +			return 0;
> > > +		} else if ((g_gtt_index ==
> > > +				(ggtt_mm->ggtt_mm.last_partial_off >>
> > > +				info->gtt_entry_size_shift)) &&
> > > +			(off !=	ggtt_mm->ggtt_mm.last_partial_off)) {
> > > +			/* the second partial part */
> > > +
> > > +			int last_off = ggtt_mm->ggtt_mm.last_partial_off &
> > > +				(info->gtt_entry_size - 1);
> > > +
> > > +			memcpy((void *)&e.val64 + last_off,
> > > +				(void *)&ggtt_mm->ggtt_mm.last_partial_data +
> > > +				last_off, bytes);
> > > +
> > > +			ggtt_mm->ggtt_mm.last_partial_off = -1UL;
> > > +		} else {
> > > +			gvt_vgpu_err("failed to populate guest ggtt entry: abnormal ggtt entry write sequence, last_partial_off=%lx, offset=%x, bytes=%d, ggtt entry size=%d\n",
> > > +					ggtt_mm->ggtt_mm.last_partial_off, off,
> > > +					bytes, info->gtt_entry_size);
> > > +			return 0;
> > > +		}
> > > +	}
> > > +
> > >  	if (ops->test_present(&e)) {
> > >  		gfn = ops->get_pfn(&e);
> > >  		m = e;
> > > diff --git a/drivers/gpu/drm/i915/gvt/gtt.h b/drivers/gpu/drm/i915/gvt/gtt.h
> > > index 3792f2b..97e6264 100644
> > > --- a/drivers/gpu/drm/i915/gvt/gtt.h
> > > +++ b/drivers/gpu/drm/i915/gvt/gtt.h
> > > @@ -150,6 +150,8 @@ struct intel_vgpu_mm {
> > >  		} ppgtt_mm;
> > >  		struct {
> > >  			void *virtual_ggtt;
> > > +			unsigned long last_partial_off;
> > > +			u64 last_partial_data;
> > >  		} ggtt_mm;
> > >  	};
> > >  };
> > > -- 
> > > 1.9.1
> > > 
> > > _______________________________________________
> > > intel-gvt-dev mailing list
> > > intel-gvt-dev at lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
> > 
> > -- 
> > Open Source Technology Center, Intel ltd.
> > 
> > $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
> 
> 

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gvt-dev/attachments/20180611/e9fa1823/attachment.sig>