i915 modeset memory corruption issues? (Fwd: Oops in ext3_block_to_path.isra.40+0x26/0x11b)
hughd at google.com
Sat Mar 17 22:43:18 PDT 2012
Added Rafael to the Cc: Rafael, we're pondering over one or more of these
recurrent threads about corruption after resume, seemingly related to i915.
On Sat, 17 Mar 2012, Keith Packard wrote:
> On Sat, 17 Mar 2012 18:44:18 -0700 (PDT), Hugh Dickins <hughd at google.com> wrote:
> > I keep worrying about the sequence when the machine is powered on again
> > after hibernation: can i915 get up to anything before it is resumed from
> > the hibernation image?
> Well, the frame buffer is presumably still using whatever mapping it had
> before suspend occurred; is there any way it could be writing through
> that before the graphics driver was resumed?
It's hibernation restore here, so I don't think it could be using the
mapping from before hibernation until after resuming from the hibernation
snapshot: it would be using the rebooting kernel's mapping until then.
> What I don't understand is the relationship between the boot kernel and
> the resumed kernel; when does the boot kernel stop writing to the
> console, and how does it hand off control of the frame buffer at that
> point?
I believe the handoff point comes in the late initcall software_resume(),
which loads the image and calls hibernation_restore -> resume_target_kernel
-> swsusp_arch_resume, which emerges into the restored hibernation image.
As a late initcall, I imagine some work has already been done via the
framebuffer, but I have no conception of what kind of mappings that
involves (would shmem objects come into it at all? and is that even
a relevant question, could enough damage be done without them?), nor
whether they're properly torn down before emerging into the hibernation
image.
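To make the ordering concrete, here is a rough user-space sketch of the
call chain named above. It is only a toy model: the four function names
are the ones mentioned, but the bodies, the simplified argument-less
signatures, the trace buffer and the record() helper are made up for
illustration and are not the kernel's code.

```c
#include <stdio.h>
#include <string.h>

/* Toy trace of the order in which the steps run (illustrative only). */
static char trace[256];

/* Hypothetical helper: append one step name to the trace. */
void record(const char *step)
{
    if (trace[0] != '\0')
        strncat(trace, " -> ", sizeof(trace) - strlen(trace) - 1);
    strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
}

/* Stand-in for the arch code that emerges into the restored image. */
int swsusp_arch_resume(void)
{
    record("swsusp_arch_resume");
    return 0;
}

int resume_target_kernel(void)
{
    record("resume_target_kernel");
    return swsusp_arch_resume();
}

int hibernation_restore(void)
{
    record("hibernation_restore");
    return resume_target_kernel();
}

/* A late initcall in the real kernel; a plain function in this model. */
int software_resume(void)
{
    record("software_resume");
    /* The hibernation image would be loaded at this point. */
    return hibernation_restore();
}
```

Calling software_resume() once leaves the four names in trace in the
order listed above. Whatever the boot kernel has done through the
framebuffer happens before the first of them runs.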
> It would be great if we could separate out the boot kernel access to the
> graphics system from the resumed system -- if the boot kernel was run
> without the i915 driver loaded at all, and just used VGA text mode, then
> any damage as a result of resume wouldn't be caused by the boot kernel
> GTT mappings getting used at the wrong time.
But you're giving my worry more credence than it deserves there:
we don't have any evidence that this is where the problem lies,
that's just a suspicion of mine at the moment.