xwayland + radeon = consistent filesystem corruption Re: I'm the only one getting hard drive errors, right?

darxus at chaosreigns.com darxus at chaosreigns.com
Mon Sep 3 12:15:29 PDT 2012


[732715.730069] EXT4-fs error (device sda1): ext4_ext_search_left:1275: inode #21374007: comm flush-8:0: ix (10742) != EXT_FIRST_INDEX (0) (depth 1)!
[732715.730084] Aborting journal on device sda1-8.
[732715.730269] EXT4-fs (sda1): Remounting filesystem read-only
[732715.730278] EXT4-fs error (device sda1) in ext4_da_writepages:3033: IO failure
[732715.730440] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 589 pages, ino 21374007; err -30

This hasn't happened in three months.  The last time I saw it was the
last time I ran xwayland.  While correlation does not imply causation,
and it *could* be a coincidence, I'm really not willing to entertain
that possibility as realistic.

This time I used RAOF's X DDX (and updated the xwayland instructions
for Radeon / ATI to use it).  Last time I was using timon37's DDX.  I don't
know if they share code.  I don't know if they're at fault.

I was using the DRM backend.  I ran "make install" as a non-root user, and
then set weston-launch as owned by root and +s, and ran weston-launch.  I
did not have xserver set suid root.
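
For anyone trying to reproduce this setup, here is a minimal sketch for
double-checking which binaries are actually setuid root.  The function name
is made up for illustration, and the paths you pass in depend entirely on
your own install prefix.

```python
import os
import stat


def is_setuid_root(path):
    """Return True if `path` is owned by uid 0 and has the setuid bit set."""
    info = os.stat(path)
    # st_uid == 0 means owned by root; S_ISUID is the "+s" bit for the owner.
    return info.st_uid == 0 and bool(info.st_mode & stat.S_ISUID)
```

With the setup described above, this would be expected to return True for
the installed weston-launch binary and False for the X server binary.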

I was playing a video on YouTube in chromium via xwayland when X crashed
(taking firefox, the only other X client I was running, out with it).
I was chatting with folks in IRC about X's failure to respawn when I
realized my filesystem had been remounted read-only.  I then dug the above
output out of dmesg.
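
A minimal sketch for spotting the same symptoms in saved dmesg output.  The
patterns are taken only from the log lines quoted above; the function name
and the return shape are made up for illustration, and other kernels may
word these messages differently.

```python
import re

# Patterns derived from the dmesg lines quoted in this message.
EXT4_PATTERNS = {
    "error": re.compile(r"EXT4-fs error \(device (\w+)\)"),
    "journal_abort": re.compile(r"Aborting journal on device ([\w-]+)"),
    "remount_ro": re.compile(
        r"EXT4-fs \((\w+)\): Remounting filesystem read-only"
    ),
}


def find_ext4_trouble(log_text):
    """Return a list of (kind, device, line) for each suspicious log line."""
    hits = []
    for line in log_text.splitlines():
        for kind, pattern in EXT4_PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits.append((kind, match.group(1), line.strip()))
                break  # one classification per line is enough
    return hits
```

Pass it the text of a saved dmesg dump.  An empty list only means none of
these three messages appeared, not that the filesystem is healthy.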

I was working on updating my "state of wayland" page to say that wayland
was looking pretty usable now :/

After rebooting, fsck said lots of scary things, and I had to manually
confirm many of the fixes it wanted to make.  I have photos if anyone is
interested in details.  There were lots of "Free blocks count wrong for
group... Fix(y)?" and "Free inodes count wrong for group... Fix(y)?"
prompts, plus a "Block bitmap differences..."

I don't know for sure if I lost anything, but have not yet
seen evidence that I did.  I have pretty good backups.


12:15 < pq> either xwayland triggers some fs bug, or triggers a gfx driver
bug, which then scribbles over kernel memory - or faulty hw. Can't know.
12:16 < soreau> either way, it's a fairly serious problem

I agree with this assessment.


So far, I think it has only affected the filesystem I was using at the time
(I basically only use one partition per linux install).  So I may be
willing to do more testing on a dedicated testing partition.

This graphics card needed to go on ubuntu's grub gfxpayload blacklist,
because for some reason retaining the graphics mode from grub to X breaks
on some graphics cards, including this one.  Seems unlikely to be directly
related, just trying to provide all possibly relevant info I have.  The bug
for this was:  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/971204

I was running an up-to-date ubuntu oneiric install.

lspci output:

05:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Barts XT [Radeon HD 6800 Series] (prog-if 00 [VGA controller])
        Subsystem: Hightech Information System Ltd. Device 2010
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 46
        Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Region 2: Memory at fbfc0000 (64-bit, non-prefetchable) [size=128K]
        Region 4: I/O ports at e000 [size=256]
        Expansion ROM at fbfa0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: radeon
        Kernel modules: radeon


I had the latest git masters of everything as of 2012-09-03 08:36 -0400.
weston commit 8538b22ff4ad8879b4e3288be053508167562859
wayland commit 2be6e0ed142bac669398a9ad26d336666fa53216
raof's xf86-video-ati xwayland branch commit 8dc07e63eaf8909f7046bf746a119ec749352441

On 05/30, darxus at chaosreigns.com wrote:
> I was just playing with weston master under X, and timon37's radeon ddx,
> weston crashed, when I tried to delete the wayland lock I got an error
> that the filesystem was readonly, dmesg said:
> 
> [  496.347230] EXT4-fs error (device sda1): ext4_ext_search_left:1275: inode #21374007: comm flush-8:0:ix (10742) != EXT_FIRST_INDEX (0) (depth 1)!
> [  496.347236] Aborting journal on device sda1-8.
> [  496.347383] EXT4-fs (sda1): Remounting filesystem read-only
> 
> Which is pretty scary.  
> 
> I feel like this *might* have happened before when I was playing with
> weston, but I definitely don't have enough information to suggest there is
> any real correlation.  Or knowledge of what exactly is going on here to
> know if a correlation is even possible.
> 
> I had xserver set suid root, out of habit from before I fixed the bug
> complaining about not getting master, which seems like it might have made
> something like this possible.  
> 
> I do have good backups.  

-- 
"If you would be a real seeker after truth, it is necessary that at
least once in your life you doubt, as far as possible, all things."
- Rene Descartes
http://www.ChaosReigns.com


More information about the wayland-devel mailing list