[Intel-gfx] [PATCH] drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
Carsten Emde
C.Emde at osadl.org
Sat Jun 8 22:41:48 CEST 2013
On 04/08/2013 11:54 AM, Daniel Vetter wrote:
> On Fri, Apr 05, 2013 at 09:03:27AM -0700, Jesse Barnes wrote:
>> On Thu, 4 Apr 2013 21:31:03 +0100
>> Chris Wilson <chris at chris-wilson.co.uk> wrote:
>>
>>> In order to fully serialize access to the fenced region and the update
>>> to the fence register we need to take extreme measures on SNB+, and
>>> manually flush writes to memory prior to writing the fence register in
>>> conjunction with the memory barriers placed around the register write.
>>>
>>> Fixes i-g-t/gem_fence_thrash
>>>
>>> v2: Bring a bigger gun
>>> v3: Switch the bigger gun for heavier bullets (Arjan van de Ven)
>>> v4: Remove changes for working generations.
>>> v5: Reduce to a per-cpu wbinvd() call prior to updating the fences.
>>> v6: Rewrite comments to elide forgotten history.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Jon Bloomfield <jon.bloomfield at intel.com>
>>> Tested-by: Jon Bloomfield <jon.bloomfield at intel.com> (v2)
>>> Cc: stable at vger.kernel.org
>>> ---
>>> drivers/gpu/drm/i915/i915_gem.c | 28 +++++++++++++++++++++++-----
>>> 1 file changed, 23 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index fa4ea1a..8f7739e 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2689,17 +2689,35 @@ static inline int fence_number(struct drm_i915_private *dev_priv,
>>>  	return fence - dev_priv->fence_regs;
>>>  }
>>>
>>> +static void i915_gem_write_fence__ipi(void *data)
>>> +{
>>> +	wbinvd();
>>> +}
>>> +
>>>  static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj,
>>>  					 struct drm_i915_fence_reg *fence,
>>>  					 bool enable)
>>>  {
>>> -	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>>> -	int reg = fence_number(dev_priv, fence);
>>> -
>>> -	i915_gem_write_fence(obj->base.dev, reg, enable ? obj : NULL);
>>> +	struct drm_device *dev = obj->base.dev;
>>> +	struct drm_i915_private *dev_priv = dev->dev_private;
>>> +	int fence_reg = fence_number(dev_priv, fence);
>>> +
>>> +	/* In order to fully serialize access to the fenced region and
>>> +	 * the update to the fence register we need to take extreme
>>> +	 * measures on SNB+. In theory, the write to the fence register
>>> +	 * flushes all memory transactions before, and coupled with the
>>> +	 * mb() placed around the register write we serialise all memory
>>> +	 * operations with respect to the changes in the tiler. Yet, on
>>> +	 * SNB+ we need to take a step further and emit an explicit wbinvd()
>>> +	 * on each processor in order to manually flush all memory
>>> +	 * transactions before updating the fence register.
>>> +	 */
>>> +	if (HAS_LLC(obj->base.dev))
>>> +		on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>> +	i915_gem_write_fence(dev, fence_reg, enable ? obj : NULL);
>>>
>>>  	if (enable) {
>>> -		obj->fence_reg = reg;
>>> +		obj->fence_reg = fence_reg;
>>>  		fence->obj = obj;
>>>  		list_move_tail(&fence->lru_list, &dev_priv->mm.fence_list);
>>>  	} else {
>>
>> I almost wish x86 just gave up and went fully weakly ordered. At least
>> then we'd know we need barriers everywhere, rather than just in
>> mystical places.
>>
>> Reviewed-by: Jesse Barnes <jbarnes at virtuousgeek.org>
>
> Queued for -next, thanks for the patch. I am a bit unhappy that the
> testcase doesn't work too well and can't reliably reproduce this on all
> affected machines. But I've tried a bit to improve things and it's very
> fickle. I guess we just have to accept this :(
Under real-time conditions, where we expect a deterministic response to
external and internal events within a couple of microseconds at any
time, invalidating and flushing the entire cache of a running system is
unacceptable. This introduces latencies of several milliseconds, as was
clearly shown in RT regression tests on a kernel with this patch
applied (https://www.osadl.org/?id=1543#c7602). We therefore have to
revert it in the RT patch queue - kind of a workaround of a workaround.
A rough sketch of how one could measure that cost follows below.
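To give a feel for where those milliseconds come from, here is a minimal,
hypothetical throwaway module that times the same on_each_cpu() + wbinvd()
pattern the patch uses. The module name, the helper ipi_wbinvd() and the
measurement approach are purely illustrative assumptions of mine; they are
not part of the patch nor of the OSADL test setup, which is based on
regular latency regression runs.

/* wbinvd_cost.c - illustrative only: gauge the cost of the cross-CPU
 * wbinvd IPI on a given x86 machine.
 */
#include <linux/module.h>
#include <linux/smp.h>
#include <linux/ktime.h>
#include <asm/special_insns.h>	/* wbinvd() on x86 */

static void ipi_wbinvd(void *data)
{
	/* write back and invalidate the whole cache on this CPU */
	wbinvd();
}

static int __init wbinvd_cost_init(void)
{
	ktime_t t0 = ktime_get();

	/* same call pattern as the workaround: IPI every online CPU and wait */
	on_each_cpu(ipi_wbinvd, NULL, 1);

	pr_info("wbinvd on all CPUs took %lld us\n",
		ktime_us_delta(ktime_get(), t0));

	/* refuse to stay loaded; we only wanted the number */
	return -EAGAIN;
}
module_init(wbinvd_cost_init);

MODULE_LICENSE("GPL");

On a multi-core machine with large caches, the printed number lands in the
millisecond range, and every real-time task on every CPU is stalled for
that long whenever a fence register is rewritten - which is exactly what
the regression plots show.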
It would simply be wonderful if we could get rid of the hateful wbinvd().
-Carsten.