[PATCH] drm/etnaviv: bring back progress check in job timeout handler

Lucas Stach l.stach at pengutronix.de
Thu Jun 28 08:40:28 UTC 2018


On Wednesday, 27.06.2018, at 10:25 -0700, Eric Anholt wrote:
> > Lucas Stach <l.stach at pengutronix.de> writes:
> 
> > When the hangcheck handler was replaced by the DRM scheduler timeout
> > handling we dropped the forward progress check, as this might allow
> > clients to hog the GPU for a long time with a big job.
> > 
> > It turns out that even reasonably well behaved clients like the
> > Armada Xorg driver occasionally trip over the 500ms timeout. Bring
> > back the forward progress check to get rid of the userspace regression.
> > 
> > We would still like to fix userspace to submit smaller batches
> > if possible, but that is for another day.
> > 
> > Fixes: 6d7a20c07760 (drm/etnaviv: replace hangcheck with scheduler timeout)
> > Signed-off-by: Lucas Stach <l.stach at pengutronix.de>
> 
> I was just wondering if there was a way to do this with the scheduler (I
> had a similar issue with GTF-GLES2.gtf.GL.acos.acos_float_vert_xvary),
> and this looks correct.

What are you thinking about? A forward progress check at sub-fence
granularity is always going to be GPU specific. The only thing that
could be shunted to the scheduler is rearming of the timer. We could do
this by changing the return type of timedout_job to something that
allows us to indicate a false-positive to the scheduler.
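
A rough sketch of what I mean, in plain C outside the kernel. All names
here (sketch_timeout_status, sketch_timedout_job, the "marker" fields) are
made up for illustration and are not existing scheduler or etnaviv API; the
progress marker stands in for whatever GPU-specific state the driver
samples.

```c
#include <stdint.h>

/* Hypothetical return type for a timeout callback. */
enum sketch_timeout_status {
	SKETCH_TIMEOUT_HANDLED,  /* driver treated the job as hung */
	SKETCH_TIMEOUT_SPURIOUS, /* job still makes progress, rearm the timer */
};

struct sketch_timeout_job {
	uint32_t last_marker; /* progress marker seen at the previous timeout */
	uint32_t cur_marker;  /* progress marker now (GPU specific in reality) */
};

/* Driver side: the GPU-specific forward progress check. */
static enum sketch_timeout_status
sketch_timedout_job(struct sketch_timeout_job *job)
{
	if (job->cur_marker != job->last_marker) {
		/* Remember the new position for the next timeout. */
		job->last_marker = job->cur_marker;
		return SKETCH_TIMEOUT_SPURIOUS;
	}
	return SKETCH_TIMEOUT_HANDLED;
}

struct sketch_sched {
	int rearms;     /* how often the timer was rearmed */
	int recoveries; /* how often recovery was triggered */
};

/* Scheduler side: only the timer rearming lives in common code. */
static void sketch_sched_on_timeout(struct sketch_sched *sched,
				    struct sketch_timeout_job *job)
{
	if (sketch_timedout_job(job) == SKETCH_TIMEOUT_SPURIOUS)
		sched->rearms++;
	else
		sched->recoveries++;
}
```

The point of the split is that the scheduler never needs to know how
progress is measured; it only reacts to the status the driver returns.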

> As far as I can see, the fence_completed check shouldn't be necessary,
> since you'll get a cancel_delayed_work_sync() once the job finish
> happens, so you're only really protecting from a timeout not detecting
> progress in between fence signal and job finish, but we expect job
> finish to be quick.

Yes, it's really only guarding against this small window. Still, I would
like to skip the overhead of stopping and restarting the whole
scheduler in case the job managed to finish in this window. That's
probably something that could even be moved into common scheduler code,
but probably as part of a follow-on cleanup, instead of this stable
patch.
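
To illustrate the early-out, here is a minimal standalone sketch. The
struct layouts and function names are assumptions for the example and do
not match the actual driver code; the only idea carried over is "check
whether the fence already completed before touching the scheduler".

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_gpu {
	uint32_t completed_fence; /* last fence seqno the GPU signalled */
	int sched_restarts;       /* counts scheduler stop/restart cycles */
};

struct sketch_job {
	uint32_t fence_seqno; /* seqno of the fence attached to this job */
};

/* Wrap-safe check for "seqno is at or before the completed fence". */
static bool sketch_fence_completed(const struct sketch_gpu *gpu,
				   uint32_t seqno)
{
	return (int32_t)(seqno - gpu->completed_fence) <= 0;
}

/* Returns true if the job was treated as hung, false if it was skipped. */
static bool sketch_handle_timeout(struct sketch_gpu *gpu,
				  const struct sketch_job *job)
{
	/*
	 * The fence signalled between the timer firing and this handler
	 * running: nothing to recover, so leave the scheduler alone.
	 */
	if (sketch_fence_completed(gpu, job->fence_seqno))
		return false;

	/* Stand-in for stopping the scheduler, recovering and restarting. */
	gpu->sched_restarts++;
	return true;
}
```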

> Regardless,
> 
> > Reviewed-by: Eric Anholt <eric at anholt.net>

Thanks,
Lucas

