mieqProcessDeviceEvent makes calls to mieqEnqueue with signals enabled - freezes Xorg server - multi-screen
Donald Kayser
xorg at kayser.net
Wed May 25 14:59:47 PDT 2011
I have moved to a new platform, AMD 64bit with 3 ATI Radeon HD 2400
video devices, debian wheezy that has xorg-xserver 1.10. I just had
the following crash that looks just like the stack trace on PPC and
1.7 Xorg server reported here earlier. Crash occurred while dragging
mouse from one device to the next. Does anyone know at first glance if
these are related?
0: /usr/bin/X (xorg_backtrace+0x28) [0x4a38b8]
1: /usr/bin/X (0x400000+0x646e9) [0x4646e9]
2: /lib/libpthread.so.0 (0x7f0441967000+0xf020) [0x7f0441976020]
3: /usr/lib/xorg/modules/drivers/fglrx_drv.so (0x7f043da89000+0x8ec2bf) [0x7f043e3752bf]
4: /usr/lib/xorg/modules/drivers/fglrx_drv.so (0x7f043da89000+0x8ebe1d) [0x7f043e374e1d]
5: /usr/bin/X (0x400000+0x1417a2) [0x5417a2]
6: /usr/bin/X (0x400000+0x1418f1) [0x5418f1]
7: /usr/bin/X (0x400000+0x140340) [0x540340]
8: /usr/bin/X (miPointerUpdateSprite+0x18c) [0x45cbcc]
9: /usr/bin/X (mieqProcessInputEvents+0x1a9) [0x4a3299]
10: /usr/bin/X (ProcessInputEvents+0x9) [0x46da39]
11: /usr/bin/X (0x400000+0x314a3) [0x4314a3]
12: /usr/bin/X (0x400000+0x257de) [0x4257de]
13: /lib/libc.so.6 (__libc_start_main+0xfd) [0x7f04406a0ead]
14: /usr/bin/X (0x400000+0x25389) [0x425389]
ii  xserver-xorg-core  2:1.10.1-2  Xorg X server - core server
On May 18, 2011, at 5:09 PM, Donald Kayser wrote:
> Here is a diff of the changes in my build that get around the issue.
> Basically, I now throw away events in mieqEnqueue() if the event is
> from the old device while NewCurrentScreen() is being called from
> mieqProcessDeviceEvent(). The logs show that we throw away about two
> to three events whenever the mouse is moving from screen to screen.
> This is less of an issue for my purposes than the freeze. We will be
> testing this here to see if it causes any problems that we can detect.
>
> diff original/xorg-server-1.7.7/mi/mieq.c modified/xorg/server/xorg-server-1.7.7/mi/mieq.c
> 60a61,63
> > static int miInMieqProcessDeviceEvent = 0;
> > static DeviceIntPtr miInMieqDevice = 0;
> >
> 158a162,168
> > // return if event is from old device and in call to NewCurrentScreen() from mieqProcessDeviceEvent()
> > if ( miInMieqProcessDeviceEvent && ( pDev == miInMieqDevice ) )
> > {
> >     ErrorF("[mi] Screen changing, tossing event from old device \n");
> >     return;
> > }
> >
> 390a401,402
> > miInMieqDevice = dev;
> > miInMieqProcessDeviceEvent = 1;
> 394a407
> > miInMieqProcessDeviceEvent = 0;
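
As a rough illustration of what the diff above does, here is a minimal stand-alone sketch of the same guard idea, modelled outside the X server. Everything in it is an assumption made for illustration: the `Event` struct, the fixed-size ring, and the names `enqueue`, `begin_screen_switch`, `end_screen_switch`, and `demo_guard` are hypothetical stand-ins, not the server's types or API. It also runs in ordinary program flow rather than in a SIGIO handler, so it shows only the drop logic, not the race itself.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_SIZE 8  /* arbitrary size chosen for the sketch */

/* Hypothetical stand-in for an input event; not the server's type. */
typedef struct {
    int device_id;
    int x, y;
} Event;

static Event queue[QUEUE_SIZE];
static size_t head = 0;            /* next slot to read */
static size_t tail = 0;            /* next slot to write */
static int dropped = 0;            /* events thrown away so far */

/* Play the roles of miInMieqProcessDeviceEvent / miInMieqDevice. */
static int in_screen_switch = 0;
static int switching_device = -1;

/* Returns 1 if the event was queued, 0 if it was dropped. */
static int enqueue(int device_id, int x, int y)
{
    size_t next;

    /* The guard: toss events from the device whose screen is changing. */
    if (in_screen_switch && device_id == switching_device) {
        dropped++;
        return 0;
    }

    next = (tail + 1) % QUEUE_SIZE;
    if (next == head) {            /* ring is full */
        dropped++;
        return 0;
    }
    queue[tail].device_id = device_id;
    queue[tail].x = x;
    queue[tail].y = y;
    tail = next;
    return 1;
}

/* Number of events currently waiting in the ring. */
static size_t queued(void)
{
    return (tail + QUEUE_SIZE - head) % QUEUE_SIZE;
}

static void begin_screen_switch(int device_id)
{
    switching_device = device_id;
    in_screen_switch = 1;
}

static void end_screen_switch(void)
{
    in_screen_switch = 0;
}

/* Walks through one screen switch; returns how many events were dropped. */
static int demo_guard(void)
{
    assert(enqueue(1, 10, 10) == 1);   /* normal case: queued */
    begin_screen_switch(1);
    assert(enqueue(1, 11, 10) == 0);   /* same device mid-switch: dropped */
    assert(enqueue(2, 5, 5) == 1);     /* other device: still queued */
    end_screen_switch();
    assert(enqueue(1, 12, 10) == 1);   /* guard lifted: queued again */
    return dropped;
}
```

Calling `demo_guard()` from a `main()` drops exactly one event and leaves three queued, which mirrors the trade-off described above: a few lost motion events in exchange for not touching the queue while the screen is changing.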
> On May 18, 2011, at 2:22 PM, Donald Kayser wrote:
>
>> Here is a stack trace of an event that is now being ignored and I
>> suspect is revealing why the signal approach didn't work and maybe
>> will give you insight:
>>
>> #0 mieqEnqueue (pDev=0x10462d80, e=0x101f78e8) at ../../mi/mieq.c:162
>> #1 0x100715dc in xf86PostMotionEventP (device=0x10462d80,
>>    is_absolute=<value optimized out>, first_valuator=<value optimized out>,
>>    num_valuators=<value optimized out>, valuators=<value optimized out>)
>>    at ../../../../hw/xfree86/common/xf86Xinput.c:982
>> #2 0x0f6b50f0 in ?? () from /usr/lib/xorg/modules/input/evdev_drv.so
>> #3 0x0f6b54c4 in ?? () from /usr/lib/xorg/modules/input/evdev_drv.so
>> #4 0x1006db08 in xf86SigioReadInput (fd=<value optimized out>,
>>    closure=<value optimized out>)
>>    at ../../../../hw/xfree86/common/xf86Events.c:313
>> #5 0x10135f28 in xf86SIGIO (sig=<value optimized out>)
>>    at ../../../../../hw/xfree86/os-support/linux/../shared/sigio.c:118
>> #6 <signal handler called>
>> #7 0x0fb02ba0 in sigprocmask () from /lib/libc.so.6
>> #8 0x10135834 in xf86UnblockSIGIO (wasset=<value optimized out>)
>>    at ../../../../../hw/xfree86/os-support/linux/../shared/sigio.c:297
>> #9 0x1006fd78 in xf86WarpCursor (pDev=0x10462d80, pScreen=0x10249b48,
>>    x=<value optimized out>, y=<value optimized out>)
>>    at ../../../../hw/xfree86/common/xf86Cursor.c:476
>> #10 0x1005cec8 in miPointerSetCursorPosition (pDev=0x10462d80,
>>    pScreen=0x10249b48, x=<value optimized out>, y=<value optimized out>,
>>    generateEvent=1) at ../../mi/mipointer.c:239
>> #11 0x100ecdac in AnimCurSetCursorPosition (pDev=0x10462d80,
>>    pScreen=0x10249b48, x=<value optimized out>, y=<value optimized out>,
>>    generateEvent=<value optimized out>) at ../../render/animcur.c:266
>> #12 0x1001f738 in CheckPhysLimits (pDev=0x10462d80,
>>    cursor=<value optimized out>, generateEvents=1, confineToScreen=0,
>>    pScreen=0x10249b48) at ../../dix/events.c:756
>> #13 0x100acf60 in mieqProcessDeviceEvent (dev=0x10462d80,
>>    event=0x105091c0, screen=0x10249b48) at ../../mi/mieq.c:402
>> #14 0x100ad020 in mieqProcessInputEvents () at ../../mi/mieq.c:489
>> #15 0x1006dd54 in ProcessInputEvents ()
>>    at ../../../../hw/xfree86/common/xf86Events.c:165
>> #16 0x10050d1c in Dispatch () at ../../dix/dispatch.c:371
>> #17 0x1001d0d4 in main (argc=5, argv=0xbffffdc4, envp=<value optimized out>)
>>    at ../../dix/main.c:283
>> On May 18, 2011, at 2:13 PM, Donald Kayser wrote:
>>
>>> More notes of interest. The use of OsBlockSignals() and
>>> OsReleaseSignals() does not solve the problem. I also attempted
>>> using sigaction() to set the handler of SIGIO to SIG_IGN and this
>>> does not solve the problem. The only kludge that works is
>>> setting a static variable in mieq.c that I added to indicate that
>>> mieqProcessDeviceEvent is being called, and in mieqEnqueue I check
>>> the value of the static variable and toss events if true. I have
>>> narrowed the "protected" code to the part surrounding the call to
>>> NewCurrentScreen() in mieqProcessDeviceEvent() in the top half of
>>> the function.
>>>
>>> I have not yet tried the newer version of Xorg; this is the more
>>> painful step in terms of time to solve this problem. I would have
>>> to repeat an entire product test regression if this becomes the
>>> solution.
>>>
>>> Any more ideas are very welcome.
>>>
>>> Thanks,
>>> Donald
>>>
>>> On May 18, 2011, at 10:24 AM, Donald Kayser wrote:
>>>
>>>> I'm glad to know I'm not alone on this one. I can reproduce it at
>>>> will with this embedded application and target. FYI, I will give
>>>> you some details of the system. It is an embedded controller with
>>>> PPC processor. We are running Linux 2.6.26 PREEMPT, debian
>>>> squeeze distribution, Xorg 1.7.7. I ported the 2.6.26 kernel to
>>>> load on this target. The video is two embedded C&T69030 graphics
>>>> chips; I re-wrote the xf86-video-chips driver to support 4
>>>> screens. We do not use Xinerama. Our application is QT based and
>>>> we use fluxbox as a window manager.
>>>>
>>>> Reproducing the problem here involves running the embedded
>>>> application. On one screen, driven by one of the embedded chips,
>>>> a window is being dragged in and is consuming large amounts of
>>>> CPU time. A touchscreen overlays another screen on the second
>>>> embedded video chip. When our application reaches the state where
>>>> it is consuming most of the CPU because of dragging on the first
>>>> screen and someone touches the second screen, this bug reappears
>>>> 100% of the time, or nearly so. I have not had the time or a
>>>> platform ready to test on a non-PPC platform, but that is not out
>>>> of reach since we do have target systems running Intel. I have
>>>> downloaded the source for the Xorg server, built
>>>> it, and have been debugging it to get to this point. I will
>>>> provide detailed stack traces and will narrow it down as much as
>>>> I can.
>>>>
>>>> As mentioned before, I have been able to work around it, but
>>>> would like a better solution. I will use the OsxxxxSignals()
>>>> calls to narrow down the exact time, but I suspect it is in the
>>>> call of NewCurrentScreen within the mieqProcessDeviceEvent()
>>>> function.
>>>>
>>>> Regards,
>>>> Donald
>>>>
>>>>
>>>> On May 17, 2011, at 5:46 PM, Peter Hutterer wrote:
>>>>
>>>>> On Tue, May 17, 2011 at 05:13:37PM -0500, Donald Kayser wrote:
>>>>>> Thanks for the quick response Jeremy. I was aware that I would
>>>>>> miss events during this test, but that was better than freezing.
>>>>>> I have not tried 1.10.x, but I will. We are trying to release a
>>>>>> product soon and changing to a new server and distribution is not
>>>>>> straightforward or the best move on our part. I might have to
>>>>>> consider any other solution for the short term. I am glad to hear
>>>>>> that we are not the only ones to have this problem and that it
>>>>>> might already be solved. I will look further at 1.10.x and go
>>>>>> from there.
>>>>>
>>>>> I think this bug may still be there (possibly in a different
>>>>> incarnation) in 1.10. I haven't had any success reproducing it
>>>>> yet though. I know at least one of these got fixed in the last
>>>>> couple of server versions, but I can't seem to find the commit
>>>>> for it now.
>>>>>
>>>>> I suppose the quickest fix is to put OsBlockSignals() and
>>>>> OsReleaseSignals() around the part that must not be interrupted
>>>>> and rewrite it to be as short as possible. If you have a good
>>>>> description of the bug I'd love to hear it so we can look at a
>>>>> proper fix.
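
For readers unfamiliar with the pattern suggested here, the following is a minimal sketch of bracketing a critical section with blocked SIGIO, assuming a Linux-like system where `SIGIO` is defined. It is not the server's `OsBlockSignals()`/`OsReleaseSignals()` implementation; `block_sigio`, `release_sigio`, `on_sigio`, and `demo_critical_section` are hypothetical names. The depth counter is an assumption added so that a nested block/release pair in a callee does not unblock the signal early. Note that other messages in this thread report that bracketing the code this way did not cure the freeze on 1.7.7.

```c
#define _GNU_SOURCE   /* make sigaction() and SIGIO visible under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t sigio_count = 0;  /* handler invocations so far */
static int block_depth = 0;                    /* nesting level of block_sigio() */
static sigset_t saved_mask;                    /* mask before the outermost block */

/* Hypothetical handler: only counts deliveries. */
static void on_sigio(int sig)
{
    (void)sig;
    sigio_count++;
}

static int install_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGIO, &sa, NULL);
}

/* Block SIGIO; only the outermost call changes the signal mask. */
static void block_sigio(void)
{
    if (block_depth++ == 0) {
        sigset_t set;

        sigemptyset(&set);
        sigaddset(&set, SIGIO);
        sigprocmask(SIG_BLOCK, &set, &saved_mask);
    }
}

/* Undo one block_sigio(); only the outermost call restores the mask. */
static void release_sigio(void)
{
    assert(block_depth > 0);
    if (--block_depth == 0)
        sigprocmask(SIG_SETMASK, &saved_mask, NULL);
}

/* Returns how many handler runs were observed inside the critical section. */
static int demo_critical_section(void)
{
    int seen_inside;

    if (install_handler() != 0)
        return -1;

    block_sigio();
    block_sigio();       /* nested block, as a callee might do */
    raise(SIGIO);        /* stays pending while SIGIO is blocked */
    release_sigio();     /* inner release: SIGIO must stay blocked */
    seen_inside = (int)sigio_count;
    release_sigio();     /* outer release: the pending SIGIO is delivered */
    return seen_inside;
}
```

In this sketch the handler does not run until the outermost release, so nothing can be enqueued while the section executes. A callee that restores the signal mask on its own, without sharing the depth counter, would reopen the window.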
>>>>>
>>>>> Cheers,
>>>>> Peter
>>>>>
>>>>>> On May 17, 2011, at 4:49 PM, Jeremy Huddleston wrote:
>>>>>>
>>>>>>> Ignoring SIGIO will just result in dropped events. I seem to
>>>>>>> vaguely recall that this issue was addressed at some point in
>>>>>>> the past year or two since 1.7.x was active. Have you tried
>>>>>>> 1.10.x or master?
>>>>>>>
>>>>>>> On May 17, 2011, at 13:34, Donald Kayser wrote:
>>>>>>>
>>>>>>>> I am developing a system that includes the debian/squeeze
>>>>>>>> distribution of xorg-server, version 1.7.7. I have come
>>>>>>>> across a scenario where mouse movements on one screen and a
>>>>>>>> touch on another screen will cause the Xorg process to freeze
>>>>>>>> in an infinite loop in the function mieqProcessInputEvents().
>>>>>>>> I have traced the problem down to a small window during which
>>>>>>>> a call to mieqProcessDeviceEvent can be interrupted by a
>>>>>>>> signal that messes up miEventQueue.head and tail. It appears
>>>>>>>> that somewhere in this stack a new event is being enqueued
>>>>>>>> while the screen is changing and device messages get swapped
>>>>>>>> to the wrong screen and back and forth.
>>>>>>>>
>>>>>>>> I put a global variable in mieqProcessDeviceEvent to tell
>>>>>>>> mieqEnqueue to ignore data until finished. This has solved the
>>>>>>>> problem as a test. I am now writing the code to ignore the
>>>>>>>> SIGIO signal during mieqProcessDeviceEvent and will test this
>>>>>>>> approach also.
>>>>>>>>
>>>>>>>> Does anyone have a similar problem or advice?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Donald Kayser
>>>>>>>> xorg at kayser dot net
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> xorg at lists.freedesktop.org: X.Org support
>>>>>>>> Archives: http://lists.freedesktop.org/archives/xorg
>>>>>>>> Info: http://lists.freedesktop.org/mailman/listinfo/xorg
>>>>>>>> Your subscription address: jeremyhu at freedesktop.org