[PATCH v5 02/16] drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr

Tue Jun 29 08:58:04 UTC 2021

On Tue, 29 Jun 2021 10:50:36 +0200
Daniel Vetter <daniel at ffwll.ch> wrote:

> On Tue, Jun 29, 2021 at 09:34:56AM +0200, Boris Brezillon wrote:
> > Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
> > reset. This leads to extra complexity when we need to synchronize timeout
> > works with the reset work. One solution to address that is to have an
> > ordered workqueue at the driver level that will be used by the different
> > schedulers to queue their timeout work. Thanks to the serialization
> > provided by the ordered workqueue we are guaranteed that timeout
> > handlers are executed sequentially, and can thus easily reset the GPU
> > from the timeout handler without extra synchronization.
> > 
> > v5:
> > * Add a new paragraph to the timedout_job() method
> > 
> > v3:
> > * New patch
> > 
> > v4:
> > * Actually use the timeout_wq to queue the timeout work
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon at collabora.com>
> > Reviewed-by: Steven Price <steven.price at arm.com>
> > Reviewed-by: Lucas Stach <l.stach at pengutronix.de>
> > Cc: Qiang Yu <yuq825 at gmail.com>
> > Cc: Emma Anholt <emma at anholt.net>
> > Cc: Alex Deucher <alexander.deucher at amd.com>
> > Cc: "Christian König" <christian.koenig at amd.com>
> 
> Acked-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> 
> Also since I'm occasionally blinded by my own pride, add suggested-by: me?

Duh, it's an oversight (I thought I had that 'Suggested-by: Daniel
Vetter ...' already).

> I did spend quite a bit of time pondering how to untangle your various
> lockdep splats in the tdr handler :-)

And I'm grateful for your help ;-).
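
For readers skimming the thread, the serialization argument from the quoted
commit message can be sketched in userspace. This is only an analogy, not
the patch itself: it uses pthreads instead of the kernel workqueue API, and
every name in it (ordered_wq, timeout_work, timeout_handler,
simulate_timeouts, ...) is made up for illustration. The point it tries to
show is that when all timeout works go through one ordered queue served by
a single worker, the handlers run one at a time, so the "global reset"
state needs no lock of its own.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NUM_HW_QUEUES 3	/* Midgard/Bifrost: 3 hardware queues, 1 global reset */

/* One pending timeout work item (hypothetical stand-in for a work item). */
struct timeout_work {
	int queue_id;
	struct timeout_work *next;
};

/* Userspace stand-in for an ordered workqueue: one FIFO, one worker. */
struct ordered_wq {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct timeout_work *head, *tail;
	int stopping;
	pthread_t worker;
};

static struct ordered_wq timeout_wq = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.cond = PTHREAD_COND_INITIALIZER,
};

/* Deliberately unlocked: only the single worker touches these while running. */
static int handlers_running;
static int resets_done;

/* Timeout handler doing a "global reset" without extra synchronization. */
static void timeout_handler(struct timeout_work *w)
{
	handlers_running++;
	assert(handlers_running == 1);	/* holds because execution is ordered */
	printf("queue %d timed out, resetting GPU\n", w->queue_id);
	resets_done++;
	handlers_running--;
}

/* Single worker: pops items in FIFO order and runs them one at a time. */
static void *worker_fn(void *arg)
{
	struct ordered_wq *wq = arg;

	for (;;) {
		struct timeout_work *w;

		pthread_mutex_lock(&wq->lock);
		while (!wq->head && !wq->stopping)
			pthread_cond_wait(&wq->cond, &wq->lock);
		w = wq->head;
		if (!w) {	/* asked to stop and nothing left to run */
			pthread_mutex_unlock(&wq->lock);
			return NULL;
		}
		wq->head = w->next;
		if (!wq->head)
			wq->tail = NULL;
		pthread_mutex_unlock(&wq->lock);

		timeout_handler(w);	/* runs outside the queue lock */
	}
}

/* Append a work item to the tail of the queue and wake the worker. */
static void wq_queue(struct ordered_wq *wq, struct timeout_work *w)
{
	pthread_mutex_lock(&wq->lock);
	w->next = NULL;
	if (wq->tail)
		wq->tail->next = w;
	else
		wq->head = w;
	wq->tail = w;
	pthread_cond_signal(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
}

/* Each hardware queue reports its timeout from its own thread. */
static void *raise_timeout(void *arg)
{
	wq_queue(&timeout_wq, arg);
	return NULL;
}

/* Fire one timeout per hardware queue; return how many resets ran. */
int simulate_timeouts(void)
{
	struct timeout_work works[NUM_HW_QUEUES];
	pthread_t submitters[NUM_HW_QUEUES];
	int i;

	resets_done = 0;
	timeout_wq.stopping = 0;
	pthread_create(&timeout_wq.worker, NULL, worker_fn, &timeout_wq);

	for (i = 0; i < NUM_HW_QUEUES; i++) {
		works[i].queue_id = i;
		pthread_create(&submitters[i], NULL, raise_timeout, &works[i]);
	}
	for (i = 0; i < NUM_HW_QUEUES; i++)
		pthread_join(submitters[i], NULL);

	/* Everything is queued: let the worker drain the FIFO and exit. */
	pthread_mutex_lock(&timeout_wq.lock);
	timeout_wq.stopping = 1;
	pthread_cond_signal(&timeout_wq.cond);
	pthread_mutex_unlock(&timeout_wq.lock);
	pthread_join(timeout_wq.worker, NULL);

	return resets_done;
}
```

Calling simulate_timeouts() from a main() built with -pthread should run
one handler per hardware queue. The order in which the three queues get
handled depends on thread scheduling, but the handlers never overlap.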