[Intel-gfx] [PATCH 1/2] drm/i915: ban badly behaving contexts
Chris Wilson
chris at chris-wilson.co.uk
Fri Sep 6 11:18:19 CEST 2013
On Fri, Aug 30, 2013 at 04:19:28PM +0300, Mika Kuoppala wrote:
> Now that we have a mechanism in place to track which context
> was guilty of hanging the gpu, it is possible to punish
> it for bad behaviour.
>
> If a context has recently submitted a faulty batchbuffer guilty of
> a gpu hang and submits another batch which hangs the gpu in quick
> succession, ban it permanently. If a ctx is banned, no more
> batchbuffers will be queued for execution.
>
> There is no need for global wedge machinery anymore and
> it would be unwise to wedge the whole gpu if we have multiple
> hanging batches queued for execution. Instead just ban
> the guilty ones and carry on.
>
> v2: Store guilty ban status bool in gpu_error instead of pointers
> that might become dangling before the hang is declared.
>
> v3: Use return value for banned status instead of stashing state
> into gpu_error (Chris Wilson)
>
> v4: - rebase on top of fixed hang stats api
> - add define for ban period
> - rename commit and improve commit msg
>
> v5: - rely on context banning instead of wedging the gpu
> - beautification and fix for ban calculation (Chris)
>
> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
I like this a lot. It is a big step away from our global policy and
makes the banning easier to comprehend.
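
For readers skimming the thread, the policy in the quoted commit message can
be sketched roughly as below. This is only an illustration, not the code
from the patch: the struct, the function names, the BAN_PERIOD_SECONDS value
and the -EIO return are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical ban window; the patch adds a define, its value is assumed here. */
#define BAN_PERIOD_SECONDS 12

/* Hypothetical per-context hang bookkeeping. */
struct ctx_hang_stats {
	bool banned;      /* once set, never cleared (permanent ban) */
	time_t guilty_ts; /* time of the last hang blamed on this ctx, 0 = never */
};

/*
 * Called when a hang is blamed on this context. Returns the banned
 * status, so the caller does not have to stash state elsewhere.
 */
static bool context_mark_guilty(struct ctx_hang_stats *hs, time_t now)
{
	assert(hs != NULL);

	/* A second guilty hang within the ban period bans the context. */
	if (hs->guilty_ts != 0 && now - hs->guilty_ts <= BAN_PERIOD_SECONDS)
		hs->banned = true;

	hs->guilty_ts = now;
	return hs->banned;
}

/* Submission path: a banned context gets nothing queued for execution. */
static int context_submit(const struct ctx_hang_stats *hs)
{
	assert(hs != NULL);
	return hs->banned ? -EIO : 0;
}
```

Only the guilty context is refused; other contexts keep submitting, which is
the point of dropping the global wedge.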
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre