Long radeon stalls on recent kernels

Michel Dänzer michel at daenzer.net
Tue Nov 25 22:42:44 PST 2014


On 20.11.2014 09:58, Andy Lutomirski wrote:
> On Wed, Nov 19, 2014 at 4:07 PM, Andy Lutomirski <luto at amacapital.net> wrote:
>> On Tue, Nov 18, 2014 at 11:19 PM, Michel Dänzer <michel at daenzer.net> wrote:
>>> On 19.11.2014 09:21, Andy Lutomirski wrote:
>>>>
>>>> On Mon, Nov 17, 2014 at 1:51 AM, Michel Dänzer <michel at daenzer.net> wrote:
>>>>>
>>>>> On 15.11.2014 07:21, Andy Lutomirski wrote:
>>>>>>
>>>>>>
>>>>>> On recent kernels (3.16 through 3.18-rc4, perhaps), doing anything
>>>>>> graphics intensive seems to cause my system to become unusable for
>>>>>> tens of seconds.  Pointing Firefox at Google Maps is a big offender --
>>>>>> it can take several minutes for me to move my mouse far enough to
>>>>>> close the tab and get my computer back.
>>>>>>
>>>>>> On bootup, I get this warning:
>>>>>> [drm:btc_dpm_set_power_state] *ERROR*
>>>>>> rv770_restrict_performance_levels_before_switch failed
>>>>>>
>>>>>> Setting radeon.dpm=0 seems to work around this problem at the cost of
>>>>>> giving my rather slow graphics.
>>>>>>
>>>>>> Are there known issues here?
>>>>>
>>>>>
>>>>>
>>>>> Can you bisect the kernel, or at least isolate which kernel version first
>>>>> introduced the problem?
>>>>
>>>>
>>>> With whatever userspace I'm running, I'm seeing it 3.13, 3.14, 3.15,
>>>> 3.16, and 3.18-rc4+.  I haven't tried other versions.
>>>>
>>>> With radeon.dpm=0, I can still trigger short stalls (around one
>>>> second), but I seem unable to trigger long stalls easily.  (I say
>>>> easily because, just as I was typing this email, my system stalled for
>>>> about a minute.)
>>>
>>>
>>> I can only think of two things offhand that could cause such extremely long
>>> stalls: Swap thrashing or IRQ storms.
>>>
>>> With a setup where you can easily trigger long stalls, can you try getting a
>>> CPU profile for a stall with sysprof or perf?
>>>
>>>
>>
>> Got one with perf:
>>
>>    16.82%             Xorg  libc-2.18.so                        [.]
>> __memcpy_sse2_unaligned
>>     9.20%          swapper  [kernel.kallsyms]                   [k] intel_idle
>>     1.00%             Xorg  [kernel.kallsyms]                   [k]
>> evergreen_irq_set
>>     0.83%          firefox  libxul.so                           [.]
>> 0x0000000001d93281
>>     0.69%          firefox  libxul.so                           [.]
>> 0x0000000001d932ad
>>     0.62%          firefox  [kernel.kallsyms]                   [k]
>> copy_user_generic_string
>>     0.55%          swapper  [kernel.kallsyms]                   [k]
>> evergreen_irq_ack
>>     0.54%          firefox  libpthread-2.18.so                  [.]
>> pthread_mutex_lock
>>     0.52%          firefox  libpthread-2.18.so                  [.]
>> pthread_mutex_unlock
>>     0.45%             Xorg  [kernel.kallsyms]                   [k]
>> drm_mm_insert_node_in_range_generic
>>     0.41%             Xorg  [kernel.kallsyms]                   [k] lock_release
>>     0.40%             Xorg  [kernel.kallsyms]                   [k] lock_acquire
>>     0.35%          firefox  firefox                             [.]
>> 0x000000000001245d
>>     0.33%             Xorg  [kernel.kallsyms]                   [k]
>> __module_address
>>     0.31%          firefox  [kernel.kallsyms]                   [k] clear_page_c
>>     0.29%             Xorg  [kernel.kallsyms]                   [k]
>> copy_user_generic_string
>>     0.28%          firefox  firefox                             [.]
>> 0x0000000000013159
>>
>> and:
>>
>> Samples: 11K of event 'irq:irq_handler_entry', Event count (approx.): 11802
>>    87.43%          swapper  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     7.52%          firefox  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     1.84%      irq/36-ahci  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     1.14%             Xorg  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     0.75%      kworker/5:0  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     0.32%      gnome-shell  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     0.25%     kworker/5:1H  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     0.25%  Media D~ode #10  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     0.19%  ImageDe~er #330  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>     0.07%       pulseaudio  [kernel.kallsyms]  [k] handle_irq_event_percpu
>>
>> The cycles were with -e cycles:pp, so I think that iret would have
>> shown up if there were enough IRQs to cause the problem.
>>
>> I'll build a kernel with latencytop.
>>
>
> I just caught call_rwsem_down_write_failed for 5379 ms in khugepaged
> (holy crap) and radeon_fence_default_wait for 489.2ms in Xorg.
>
> Turning off THP gets rid of the khugepaged thing.  The 489.2ms is
> radeon_fence_default_wait is amazingly reproducible -- I've seen that
> exact number three times now.

Sounds like the long stalls were THP, but the shorter ones might be radeon?

Can you get some call graphs for the profile or from latencytop? Make 
sure at least the kernel is built with frame pointers 
(CONFIG_FRAME_POINTER=y), preferably also userspace 
(-fno-omit-frame-pointer).


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer


More information about the dri-devel mailing list