linux-4.4 bisected: kwin5 stuck on kde5 loading screen with radeon
Mario Kleiner
mario.kleiner.de at gmail.com
Wed Jan 20 12:32:30 PST 2016
On 01/18/2016 11:49 AM, Vlastimil Babka wrote:
> On 01/16/2016 05:24 AM, Mario Kleiner wrote:
>>
>>
>> On 01/15/2016 01:26 PM, Ville Syrjälä wrote:
>>> On Fri, Jan 15, 2016 at 11:34:08AM +0100, Vlastimil Babka wrote:
>>
>> I'm currently running...
>>
>> while xinit /usr/bin/ksplashqml --test -- :1 ; do echo yay; done
>>
>> ... in an endless loop on Linux 4.4 SMP PREEMPT on HD-5770 and so far i
>> can't trigger a hang after hundreds of runs.
>>
>> Does this also hang for you?
>
> No, test mode seems to be fine.
>
>> I think a drm.debug=0x21 setting and grep'ping the syslog for "vblank"
>> should probably give useful info around the time of the hang.
>
> Attached. Captured by having kdm running, switching to console, running
> "dmesg -C ; dmesg -w > /tmp/dmesg", switch to kdm, enter password, see
> frozen splashscreen, switch back, terminate dmesg. So somewhere around
> the middle there should be where ksplashscreen starts...
>
>> Maybe also check XOrg.0.log for (WW) warnings related to flip.
>
> No such warnings there.
>
>> thanks,
>> -mario
>>
>>
>>>> Thanks,
>>>> Vlastimil
>>>
>
Thanks. So the problem is that AMD's hardware frame counters reset to
zero during a modeset. The old DRM code dealt with drivers doing that by
keeping vblank irqs enabled during modesets and incrementing the vblank
count by one on each vblank irq. I think that's what
drm_vblank_pre_modeset() and drm_vblank_post_modeset() were meant for.
The new code in drm_update_vblank_count() breaks this. The reset of the
counter to zero is treated as a counter wraparound, so our software vblank
counter jumps forward by up to 2^24 counts in response (in the case of
AMD's 24-bit hw counters). The vblank event handling code in
drm_handle_vblank_events() and other places then detects the counter being
more than 2^23 counts ahead of the queued vblank events and, as part of its
own wraparound handling for the 32-bit software counter, doesn't deliver
these queued events for a long time -> no vblank swap trigger event ->
no swap -> client hangs waiting for swap completion.
I think I remember seeing the ksplash progress screen occasionally
blanking halfway through login. I guess that's when kwin triggers a
modeset in parallel with ksplash doing its OpenGL animations. So depending
on the hw vblank count at the time of login, ksplash would or wouldn't
hang. Apparently I got "lucky" with my counts at login.
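To make the arithmetic concrete, here is a rough Python model of the two
checks described above. It is a simplified sketch, not the actual kernel
code: the names update_sw_count, event_is_delivered and hangs_after_reset
are made up for illustration, and it assumes that the software and hardware
counts are in sync before the modeset and that a single swap event is queued
for the next vblank.

```python
HW_COUNTER_BITS = 24                       # AMD hw frame counter width, per the text
HW_COUNTER_MASK = (1 << HW_COUNTER_BITS) - 1
U32_MASK = 0xFFFFFFFF                      # the software vblank counter is 32-bit


def update_sw_count(sw_count, last_hw, cur_hw):
    """Advance the software count by the masked hardware delta.

    A hardware count that went backwards (e.g. a reset to zero) is
    indistinguishable here from a wraparound of the 24-bit counter.
    """
    diff = (cur_hw - last_hw) & HW_COUNTER_MASK
    return (sw_count + diff) & U32_MASK


def event_is_delivered(seq, event_seq):
    """Deliver only if seq is at or past event_seq by at most 2^23 counts."""
    return ((seq - event_seq) & U32_MASK) <= (1 << 23)


def hangs_after_reset(hw_at_modeset):
    """Model one queued swap event followed by a hw counter reset to zero."""
    sw = hw_at_modeset                     # assumption: sw and hw start in sync
    event_seq = (sw + 1) & U32_MASK        # swap queued for the next vblank
    sw = update_sw_count(sw, last_hw=hw_at_modeset, cur_hw=0)
    return not event_is_delivered(sw, event_seq)


if __name__ == "__main__":
    for hw in (1000, 0xF00000):
        print(hex(hw), "hangs" if hangs_after_reset(hw) else "delivered")
```

In this model a low hw count at the modeset (1000) makes the software
counter jump by 2^24 - 1000, which is more than 2^23 ahead of the queued
event, so the event is held back. A high hw count (0xF00000) only jumps by
2^20, which stays under the 2^23 limit, so the event still goes out. That
would match the "lucky" logins.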
-mario