[PATCH] drm/radeon: disable any GPU activity after unrecovered lockup
Michel Dänzer
michel at daenzer.net
Wed Jun 27 08:42:44 PDT 2012
On Wed, 2012-06-27 at 10:49 -0400, Jerome Glisse wrote:
> On Wed, Jun 27, 2012 at 5:19 AM, Michel Dänzer <michel at daenzer.net> wrote:
> > On Tue, 2012-06-26 at 17:04 -0400, j.glisse at gmail.com wrote:
> >> From: Jerome Glisse <jglisse at redhat.com>
> >>
> >> After an unrecovered GPU lockup, avoid any GPU activity so that
> >> things like kernel segfaults do not happen in any of the paths
> >> that assume the hw is working.
> >
> > Has the patch been tested and confirmed to actually fix such a problem?
>
> Yes, it has been tested; I don't send untested patches to the ML.
I didn't expect (or mean to suggest) otherwise. I think I misread the
related IRC conversation from last night: I thought you basically
whipped up this patch in response to a report of such problems. But on
re-reading now, I guess you wrote this patch a while ago and are just
sending it now in response to the report on IRC.
> >>      r = radeon_asic_reset(rdev);
> >>      if (!r) {
> >>          dev_info(rdev->dev, "GPU reset succeed\n");
> >>          radeon_resume(rdev);
> >> -        radeon_restore_bios_scratch_regs(rdev);
> >> -        drm_helper_resume_force_mode(rdev->ddev);
> >> -        ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched);
> >>      }
> >>
> >> +    /* no matter what restore video mode */
> >> +    radeon_restore_bios_scratch_regs(rdev);
> >> +    drm_helper_resume_force_mode(rdev->ddev);
> >> +    ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched);
> >
> > Maybe this should be in a separate patch.
>
> The idea is to send this patch to stable, thus having one patch that has it all.
That doesn't make sense. Either the changes belong in a single patch
(but then the commit log should describe all of them) or not. They can
be sent to stable[0] either way.
[0] Actually, patches with Cc: stable are picked up automagically once
they hit mainline, there's no point in sending them there directly.
> >> @@ -399,6 +418,14 @@ static int radeon_bo_move(struct ttm_buffer_object *bo,
> >>          radeon_move_null(bo, new_mem);
> >>          return 0;
> >>      }
> >> +    if (!rdev->accel_working) {
> >> +        /* when accel is not working GPU is in broken state just
> >> +         * do nothing for any ttm operation to avoid making the
> >> +         * situation worst than it's
> >
> > 'worse than it is', same in the following two hunks.
Are you gonna fix these typos?
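
For readers following the thread, the guard added in the last hunk quoted above can be sketched in standalone C roughly as follows. This is only an illustration of the early-return idea: `struct fake_device`, `moves_submitted` and `fake_bo_move()` are hypothetical stand-ins invented for this sketch, not the radeon driver's actual structures or functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's device state
 * (NOT the real struct radeon_device). */
struct fake_device {
    bool accel_working;   /* assumed to be cleared after an unrecovered lockup */
    int  moves_submitted; /* counts operations that reached the "hardware" */
};

/* Hypothetical buffer-move entry point. When acceleration is not
 * working it returns success without touching the hardware, so that
 * callers which assume a working GPU are never reached. */
int fake_bo_move(struct fake_device *dev)
{
    if (!dev->accel_working) {
        /* GPU is in a broken state: do nothing rather than
         * make the situation worse than it is. */
        return 0;
    }
    dev->moves_submitted++; /* stand-in for real GPU work */
    return 0;
}
```

Calling `fake_bo_move()` once with `accel_working` set and once with it cleared should leave `moves_submitted` at 1: the second call returns early and submits nothing.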
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer