multi-card breakage
Pauli Nieminen
suokkos at gmail.com
Tue May 4 22:33:47 PDT 2010
On Tue, May 4, 2010 at 11:21 PM, Pierre-Loup A. Griffais
<pgriffais at nvidia.com> wrote:
> Tiago,
>
> I just reproduced something that sounds like what you're describing with two
> R520 cards (one X screen per card) and the 'radeon' driver. However, it
> seems unrelated to my change; that's what the hang looks like:
>
> 575 VGAGet();
> (gdb) bt
> #0 VGAarbiterCreateGC (pGC=0x83ebab0)
> at ../../../../hw/xfree86/common/xf86VGAarbiter.c:575
> #1 0x080777ba in CreateGC (pDrawable=0x82d8d78, mask=<value optimized out>,
> pval=0xbffff534, pStatus=0xbffff53c, gcid=0, client=0x81ffca8)
> at ../../dix/gc.c:647
> #2 0x0819e612 in miDCMakeGC (pWin=0x82b5530) at ../../mi/midispcur.c:422
> #3 0x0819e7c4 in miDCDeviceInitialize (pDev=0x83ebdf0, pScreen=0x8263688)
> at ../../mi/midispcur.c:790
> #4 0x081c48cf in miSpriteDeviceCursorInitialize (pDev=0x83ebdf0,
> pScreen=0x8263688) at ../../mi/misprite.c:949
> #5 0x08186364 in xf86DeviceCursorInitialize (pDev=0x83ebdf0,
> pScreen=0x8263688) at ../../../../hw/xfree86/ramdac/xf86Cursor.c:453
> #6 0x081672ba in VGAarbiterDeviceCursorInitialize (pDev=0x83ebdf0,
> pScreen=0x8263688) at ../../../../hw/xfree86/common/xf86VGAarbiter.c:1035
> #7 0x080a1e0c in miPointerDeviceInitialize (pDev=0x83ebdf0,
> pScreen=0x8263688)
> at ../../mi/mipointer.c:283
> #8 0x08087ed5 in ActivateDevice (dev=0x83ebdf0, sendevent=1 '\001')
> at ../../dix/devices.c:477
> #9 0x08088f08 in InitCoreDevices () at ../../dix/devices.c:610
> #10 0x08066d18 in main (argc=1, argv=0xbffff8a4, envp=0xbffff8ac)
> at ../../dix/main.c:255
>
> The reason my change exposes this bug is that it creates a GC attached to
> the second screen upfront. If I roll it back, I still get the same hang
> after trying to move a SW cursor to the second screen or connecting an X
> client to the second screen. Looking at the X log, I see:
>
> (II) RADEON(1): PCIE card detected
> (II) Loading sub module "int10"
> (II) LoadModule: "int10"
> (II) Reloading /usr/lib/xorg/modules/libint10.so
> (II) RADEON(1): initializing int10
> (EE) RADEON(1): Cannot read V_BIOS (3) Input/output error
> (WW) RADEON(1): Failed to read PCI ROM!
> (II) RADEON(1): Attempting to read un-POSTed bios
>
The secondary card is not POSTed by the BIOS, so the driver has to handle
its initialization. This is a radeon bug which might be fixed already; I
remember a similar bug report a few months ago. As a workaround, you could
try switching the primary card in the BIOS, provided the driver for the
current primary card knows how to initialize an un-POSTed card.
> and in the kernel log:
>
> [ 1240.582149] pci 0000:05:00.0: Invalid ROM contents
>
> That means the VGA arbiter tried to switch VGA access to an un-posted
> device, which is presumably the cause of the hang. It seems like the X
> screen should fail ScreenInit() and get discarded after initializing int10
> fails. Whatever the reason behind that is, the driver ought to fail more
> gracefully.
>
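To make the "fail ScreenInit() and get discarded" idea concrete, here is a
minimal sketch in C. Everything in it is hypothetical: SketchScreen,
sketch_screen_init() and sketch_init_screens() are illustration-only names,
not real X server or radeon driver symbols, and the "posted" and
"rom_readable" flags simply stand in for whatever the driver learns while
initializing int10. It only shows the control flow of dropping a screen
whose card can be neither found POSTed nor POSTed from its ROM.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-screen record; not an Xorg type. */
typedef struct {
    int  index;        /* screen number */
    bool posted;       /* card was already POSTed by the system BIOS */
    bool rom_readable; /* video BIOS image could be read for int10 POST */
} SketchScreen;

/* Assumed rule: a screen is usable if its card is already POSTed, or if
 * its ROM can be read so that the driver could POST it itself. */
bool sketch_screen_init(const SketchScreen *s)
{
    if (s->posted)
        return true;
    return s->rom_readable;
}

/* Initialize every screen and compact the array in place so that only
 * the screens whose init succeeded remain. Returns the number kept. */
size_t sketch_init_screens(SketchScreen *screens, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (sketch_screen_init(&screens[i]))
            screens[kept++] = screens[i];
    }
    return kept;
}
```

With a list like this, later code (the VGA arbiter included) would only
ever see screens that initialized successfully, so it would never try to
switch VGA access to the un-POSTed device.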
> In any case, I'm guessing you have similar spew in your logs?
>
> Thanks,
> - Pierre-Loup
>
> On 05/04/2010 10:28 AM, Pierre-Loup A. Griffais wrote:
>>
>> Tiago,
>>
>> This commit fixes the SW cursor with several screens, so the "multi-card
>> case" is more or less the point of the change. Can you be more specific
>> about the problems you're having and your testing environment? Do you
>> have the SW cursor forced on?
>>
>> Removing the dependency on devPrivates from this layer is way outside the
>> scope of this patch.
>>
>> Thanks,
>> - Pierre-Loup
>>
>> On 05/04/2010 07:10 AM, Tiago Vignatti wrote:
>>>
>>> Pierre and Peter,
>>>
>>>
>>> commit 518f3b189b6c8aa28b62837d14309fd06163ccbb
>>> Author: Pierre-Loup A. Griffais<pgriffais at nvidia.com>
>>> Date: Wed Apr 21 16:46:17 2010 -0700
>>>
>>> mi: don't thrash resources when displaying the software cursor
>>> across screens
>>>
>>>
>>> This commit breaks my system in a very bizarre way: I can neither kill
>>> the X server nor use it anymore. I can reproduce it only when the server
>>> is set up with two video cards.
>>>
>>> I haven't checked the logic behind the commit to see exactly what's going
>>> wrong. But I'd guess maybe you're forgetting the multi-card case? Also,
>>> we don't want the privates mechanism in these common layers, so you could
>>> cook a commit without it.
>>>
>>>
>>> Cheers,
>>>
>>> Tiago
>>