[Intel-gfx] [PATCH] drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
Jesse Barnes
jbarnes at virtuousgeek.org
Fri Apr 5 18:03:27 CEST 2013
On Thu, 4 Apr 2013 21:31:03 +0100
Chris Wilson <chris at chris-wilson.co.uk> wrote:
> In order to fully serialize access to the fenced region and the update
> to the fence register we need to take extreme measures on SNB+, and
> manually flush writes to memory prior to writing the fence register in
> conjunction with the memory barriers placed around the register write.
>
> Fixes i-g-t/gem_fence_thrash
>
> v2: Bring a bigger gun
> v3: Switch the bigger gun for heavier bullets (Arjan van de Ven)
> v4: Remove changes for working generations.
> v5: Reduce to a per-cpu wbinvd() call prior to updating the fences.
> v6: Rewrite comments to ellide forgotten history.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Jon Bloomfield <jon.bloomfield at intel.com>
> Tested-by: Jon Bloomfield <jon.bloomfield at intel.com> (v2)
> Cc: stable at vger.kernel.org
> ---
> drivers/gpu/drm/i915/i915_gem.c | 28 +++++++++++++++++++++++-----
> 1 file changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fa4ea1a..8f7739e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2689,17 +2689,35 @@ static inline int fence_number(struct drm_i915_private *dev_priv,
> return fence - dev_priv->fence_regs;
> }
>
> +static void i915_gem_write_fence__ipi(void *data)
> +{
> + wbinvd();
> +}
> +
> static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj,
> struct drm_i915_fence_reg *fence,
> bool enable)
> {
> - struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> - int reg = fence_number(dev_priv, fence);
> -
> - i915_gem_write_fence(obj->base.dev, reg, enable ? obj : NULL);
> + struct drm_device *dev = obj->base.dev;
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + int fence_reg = fence_number(dev_priv, fence);
> +
> + /* In order to fully serialize access to the fenced region and
> + * the update to the fence register we need to take extreme
> + * measures on SNB+. In theory, the write to the fence register
> + * flushes all memory transactions before, and coupled with the
> + * mb() placed around the register write we serialise all memory
> + * operations with respect to the changes in the tiler. Yet, on
> + * SNB+ we need to take a step further and emit an explicit wbinvd()
> + * on each processor in order to manually flush all memory
> + * transactions before updating the fence register.
> + */
> + if (HAS_LLC(obj->base.dev))
> + on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
> + i915_gem_write_fence(dev, fence_reg, enable ? obj : NULL);
>
> if (enable) {
> - obj->fence_reg = reg;
> + obj->fence_reg = fence_reg;
> fence->obj = obj;
> list_move_tail(&fence->lru_list, &dev_priv->mm.fence_list);
> } else {
I almost wish x86 just gave up and went fully weakly ordered. At least
then we'd know we need barriers everywhere, rather than just in
mystical places.
Reviewed-by: Jesse Barnes <jbarnes at virtuousgeek.org>
--
Jesse Barnes, Intel Open Source Technology Center
More information about the Intel-gfx
mailing list