xwayland + radeon = consistent filesystem corruption Re: I'm the only one getting hard drive errors, right?
darxus at chaosreigns.com
Mon Sep 3 12:15:29 PDT 2012
[732715.730069] EXT4-fs error (device sda1): ext4_ext_search_left:1275: inode #21374007: comm flush-8:0: ix (10742) != EXT_FIRST_INDEX (0) (depth 1)!
[732715.730084] Aborting journal on device sda1-8.
[732715.730269] EXT4-fs (sda1): Remounting filesystem read-only
[732715.730278] EXT4-fs error (device sda1) in ext4_da_writepages:3033: IO failure
[732715.730440] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 589 pages, ino 21374007; err -30
This hasn't happened in three months. The last time I saw it was the
last time I ran xwayland. While correlation does not imply causation,
and it *could* be a coincidence, I'm really not willing to entertain
that possibility as realistic.
This time I used RAOF's X DDX (and updated the xwayland instructions
for Radeon / ATI to use it). Last time I was using timon37's DDX. I don't
know if they share code. I don't know if they're at fault.
I was using the DRM backend. I ran "make install" as a non-root user, and
then set weston-launch as owned by root and +s, and ran weston-launch. I
did not have xserver set suid root.
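For anyone trying to reproduce this setup, a minimal sketch for checking
which binaries are actually owned by root with the setuid bit set. The
helper names and the example paths in the comments are hypothetical; they
depend on your install prefix and are not taken from my system.

```python
import os
import stat


def suid_info(path):
    """Return (owner_uid, has_setuid_bit) for the file at path."""
    st = os.stat(path)
    return st.st_uid, bool(st.st_mode & stat.S_ISUID)


def is_suid_root(path):
    """True only if the file is owned by uid 0 and has the setuid bit set."""
    uid, suid = suid_info(path)
    return uid == 0 and suid


# Hypothetical usage; adjust the paths to your own install prefix:
#   is_suid_root("/usr/local/bin/weston-launch")  # expected True in the setup above
#   is_suid_root("/usr/local/bin/Xorg")           # expected False in the setup above
```

This only reads file metadata with os.stat, so it is safe to run as a
normal user.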
I was playing a video on YouTube in chromium via xwayland when X crashed
(taking firefox, the only other X client I was running, out with it).
I was chatting with folks on IRC about X's failure to respawn when I
realized my filesystem had been remounted read-only. I then dug the above
output out of dmesg.
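If anyone else wants to check their own logs for the same pattern, here is
a rough sketch that scans saved dmesg text for EXT4-fs errors and read-only
remounts. The function name, the regular expressions, and the sample text
are assumptions based only on the message formats quoted above; other
kernel versions may word these lines differently.

```python
import re

# Patterns assumed from the dmesg lines quoted in this mail.
ERROR_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+EXT4-fs error \(device (\w+)\)")
REMOUNT_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+EXT4-fs \((\w+)\): Remounting filesystem read-only"
)


def scan_dmesg(text):
    """Return (errors, remounted) found in dmesg text.

    errors is a list of (timestamp, device) tuples, one per EXT4-fs error
    line; remounted is the set of devices reported as remounted read-only.
    """
    errors = []
    remounted = set()
    for line in text.splitlines():
        m = ERROR_RE.match(line)
        if m:
            errors.append((float(m.group(1)), m.group(2)))
            continue
        m = REMOUNT_RE.match(line)
        if m:
            remounted.add(m.group(2))
    return errors, remounted


# Sample input copied from the log lines above.
SAMPLE = """\
[732715.730069] EXT4-fs error (device sda1): ext4_ext_search_left:1275: inode #21374007: comm flush-8:0: ix (10742) != EXT_FIRST_INDEX (0) (depth 1)!
[732715.730084] Aborting journal on device sda1-8.
[732715.730269] EXT4-fs (sda1): Remounting filesystem read-only
[732715.730278] EXT4-fs error (device sda1) in ext4_da_writepages:3033: IO failure
"""

errors, remounted = scan_dmesg(SAMPLE)
```

Feeding it the saved output of dmesg (for example from a file) should be
enough to notice a read-only remount sooner than I did.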
I was working on updating my "state of wayland" page to say that wayland
was looking pretty usable now :/
fsck said lots of scary things after rebooting, and I had to manually
confirm many of the fixes it wanted to make. I have photos if anyone is
interested in details. There were lots of "Free blocks count wrong for
group... Fix(y)?" and "Free inodes count wrong for group... Fix(y)?"
prompts, plus a "Block bitmap differences..."
I don't know for sure if I lost anything, but have not yet
seen evidence that I did. I have pretty good backups.
12:15 < pq> either xwayland triggers some fs bug, or triggers a gfx driver
bug, which then scribbles over kernel memory - or faulty hw. Can't know.
12:16 < soreau> either way, it's a fairly serious problem
I agree with this assessment.
So far, I think it has only affected the filesystem I was using at the time
(I basically only use one partition per linux install). So I may be
willing to do more testing on a dedicated testing partition.
This graphics card needed to go on ubuntu's grub gfxpayload blacklist,
because for some reason retaining the graphics mode from grub to X breaks
on some graphics cards, including this one. Seems unlikely to be directly
related, just trying to provide all possibly relevant info I have. The bug
for this was: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/971204
I was running an up-to-date ubuntu oneiric install.
lspci output:
05:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Barts XT [Radeon HD 6800 Series] (prog-if 00 [VGA controller])
Subsystem: Hightech Information System Ltd. Device 2010
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 46
Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
Region 2: Memory at fbfc0000 (64-bit, non-prefetchable) [size=128K]
Region 4: I/O ports at e000 [size=256]
Expansion ROM at fbfa0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: radeon
Kernel modules: radeon
I had the latest git masters of everything as of 2012-09-03 08:36 -0400.
weston commit 8538b22ff4ad8879b4e3288be053508167562859
wayland commit 2be6e0ed142bac669398a9ad26d336666fa53216
raof's xf86-video-ati xwayland branch commit 8dc07e63eaf8909f7046bf746a119ec749352441
On 05/30, darxus at chaosreigns.com wrote:
> I was just playing with weston master under X, and timon37's radeon ddx,
> weston crashed, when I tried to delete the wayland lock I got an error
> that the filesystem was readonly, dmesg said:
>
> [ 496.347230] EXT4-fs error (device sda1): ext4_ext_search_left:1275: inode #21374007: comm flush-8:0:ix (10742) != EXT_FIRST_INDEX (0) (depth 1)!
> [ 496.347236] Aborting journal on device sda1-8.
> [ 496.347383] EXT4-fs (sda1): Remounting filesystem read-only
>
> Which is pretty scary.
>
> I feel like this *might* have happened before when I was playing with
> weston, but I definitely don't have enough information to suggest there is
> any real correlation. Or knowledge of what exactly is going on here to
> know if a correlation is even possible.
>
> I had xserver set suid root, out of habit from before I fixed the bug
> complaining about not getting master, which seems like it might have made
> something like this possible.
>
> I do have good backups.
--
"If you would be a real seeker after truth, it is necessary that at
least once in your life you doubt, as far as possible, all things."
- Rene Descartes
http://www.ChaosReigns.com