[PATCH 13/13] drm/radeon: rework recursive gpu reset handling

Daniel Vetter daniel at ffwll.ch
Fri Apr 20 00:50:52 PDT 2012


On Fri, Apr 20, 2012 at 07:57:09AM +0100, Dave Airlie wrote:
> 2012/4/19 Christian König <deathsimple at vodafone.de>:
> > Instead of all this humpy pumpy with recursive
> > mutex (which also fixes only half of the problem)
> > move the actual gpu reset out of the fence code,
> > return -EDEADLK and then reset the gpu in the
> > calling ioctl function.
> 
> I'm trying to figure out if this has any disadvantages over doing what
> I proposed before and just kicking a thread to reset the gpu.
> 
> It seems like this should also avoid the locking problems, I'd like to
> make sure we don't return -EDEADLK to userspace by accident anywhere,
> since I don't think it is prepared for it and it would be an ABI change.

FYI, the trick i915 uses to solve the reset problem is to bail out with
-EAGAIN and rely on drmIoctl restarting the ioctl. This way we use the
same codepaths we use to bail out when getting a signal, and thanks to X
these are rather well-tested. The hangcheck code also fires off a work item to
do all the reset magic. In all the ioctls that might wait for the gpu we
have a fancy piece of code which checks whether a gpu reset is pending,
and if so waits for that to complete. It also checks whether the reset
succeeded and if not bails out with -EIO.
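
A rough userspace simulation of that flow, to make the control flow concrete.
This is only a sketch, not the i915 code: struct fake_gpu, fake_finish_reset(),
fake_wait_ioctl() and restart_ioctl() are hypothetical names invented here, the
reset work item is modelled as completing synchronously, and the wrapper tests
a negative return value directly, whereas libdrm's drmIoctl() checks for -1
plus errno being EINTR or EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-device reset state (not the i915 structs). */
struct fake_gpu {
    bool hung;            /* GPU is hung and no reset has been queued yet */
    bool reset_pending;   /* reset work item queued but not yet finished */
    bool reset_will_fail; /* test knob: the simulated reset does not recover */
    bool wedged;          /* a reset was attempted and failed */
    int  calls;           /* number of times the ioctl was entered */
};

/* Models "wait for the pending reset work item to complete". */
static void fake_finish_reset(struct fake_gpu *gpu)
{
    gpu->reset_pending = false;
    gpu->hung = false;
    gpu->wedged = gpu->reset_will_fail;
}

/* Kernel-side sketch of an ioctl that may have to wait for the GPU. */
static int fake_wait_ioctl(struct fake_gpu *gpu)
{
    assert(gpu != NULL);
    gpu->calls++;

    if (gpu->reset_pending)        /* a reset is in flight: wait for it */
        fake_finish_reset(gpu);
    if (gpu->wedged)               /* the reset did not succeed */
        return -EIO;

    if (gpu->hung) {               /* hang noticed while waiting */
        gpu->reset_pending = true; /* models queueing the reset work item */
        return -EAGAIN;            /* back out on the same path as a signal */
    }
    return 0;                      /* normal completion */
}

/* Userspace-side sketch: restart the call, in the spirit of drmIoctl(). */
static int restart_ioctl(struct fake_gpu *gpu)
{
    int ret;

    do {
        ret = fake_wait_ioctl(gpu);
    } while (ret == -EAGAIN || ret == -EINTR);
    return ret;
}
```

With a hung GPU the first call backs out with -EAGAIN, the restarted call
waits for the reset and then either proceeds or fails with -EIO, so userspace
never sees the -EAGAIN itself.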
-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48

