[bug][vaapi][h264] The commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 on certain video files leads to problems with VAAPI hardware decoding.

Mikhail Gavrilov mikhail.v.gavrilov at gmail.com
Fri Feb 17 06:09:59 UTC 2023


On Fri, Dec 9, 2022 at 7:37 PM Leo Liu <leo.liu at amd.com> wrote:
>
> Please try the latest AMDGPU driver:
>
> https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next/
>

Sorry Leo, I missed your message.
This issue is still present in 6.2-rc8.

I made a mistake in my first message, where I wrote:

> Before kernel 5.16 this only led to an artifact in the form of
> a green bar at the top of the screen, then starting from 5.17
> the GPU began to freeze.

The real behaviour before 5.18:
- vlc could play the video, with small artifacts in the form of a green
bar at the top of the video
- after playback the vlc process exited correctly

On 5.18 this behaviour changed:
- vlc shows a black screen instead of playing the video
- after playback the process does not exit
- if I try to kill the vlc process with 'kill -9', vlc becomes a zombie
process and many other processes start to hang (the following lines
appear in the kernel log after 2 minutes)

INFO: task vlc:sh8:5248 blocked for more than 122 seconds.
      Tainted: G        W    L   --------  ---  5.18.0-60.fc37.x86_64+debug #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:vlc:sh8         state:D stack:13616 pid: 5248 ppid:  1934 flags:0x00004006
Call Trace:
 <TASK>
 __schedule+0x492/0x1650
 ? _raw_spin_unlock_irqrestore+0x40/0x60
 ? debug_check_no_obj_freed+0x12d/0x250
 schedule+0x4e/0xb0
 schedule_timeout+0xe1/0x120
 ? lock_release+0x215/0x460
 ? trace_hardirqs_on+0x1a/0xf0
 ? _raw_spin_unlock_irqrestore+0x40/0x60
 dma_fence_default_wait+0x197/0x240
 ? __bpf_trace_dma_fence+0x10/0x10
 dma_fence_wait_timeout+0x229/0x260
 drm_sched_entity_fini+0x101/0x270 [gpu_sched]
 amdgpu_vm_fini+0x2b5/0x460 [amdgpu]
 ? idr_destroy+0x70/0xb0
 ? mutex_destroy+0x1e/0x50
 amdgpu_driver_postclose_kms+0x1ec/0x2c0 [amdgpu]
 drm_file_free.part.0+0x20d/0x260
 drm_release+0x6a/0x120
 __fput+0xab/0x270
 task_work_run+0x5c/0xa0
 do_exit+0x394/0xc40
 ? rcu_read_lock_sched_held+0x10/0x70
 do_group_exit+0x33/0xb0
 get_signal+0xbbc/0xbc0
 arch_do_signal_or_restart+0x30/0x770
 ? do_futex+0xfd/0x190
 ? __x64_sys_futex+0x63/0x190
 exit_to_user_mode_prepare+0x172/0x270
 syscall_exit_to_user_mode+0x16/0x50
 do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? rcu_read_lock_sched_held+0x10/0x70
 ? trace_hardirqs_on_prepare+0x5e/0x110
 ? do_syscall_64+0x67/0x80
 ? rcu_read_lock_sched_held+0x10/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f82c2364529
RSP: 002b:00007f8210ff8c00 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f82c2364529
RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007f823022542c
RBP: 00007f8210ff8c30 R08: 0000000000000000 R09: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000001 R15: 00007f823022542c
 </TASK>
INFO: lockdep is turned off.

I bisected this issue, and the problematic commit is:

❯ git bisect bad
5f3854f1f4e211f494018160b348a1c16e58013f is the first bad commit
commit 5f3854f1f4e211f494018160b348a1c16e58013f
Author: Alex Deucher <alexander.deucher at amd.com>
Date:   Thu Mar 24 18:04:00 2022 -0400

    drm/amdgpu: add more cases to noretry=1

    Port current list from amd-staging-drm-next.

    Signed-off-by: Alex Deucher <alexander.deucher at amd.com>

 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 +++
 1 file changed, 3 insertions(+)

Unfortunately I couldn't simply revert this commit on 6.2-rc8 to check,
because the revert leads to conflicts.
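
A possible way to check the bisect result without a revert, assuming the
regression really comes from the noretry default that this commit touches:
amdgpu has a "noretry" module parameter, so the requested value can be read
from sysfs and overridden at boot. The helper name read_noretry below is
hypothetical, and whether noretry=0 avoids the hang on this hardware is
untested.

```shell
# Hypothetical helper: print the amdgpu "noretry" module parameter.
# -1 means the driver picks a per-ASIC default, 0 enables retry,
# 1 disables retry.
read_noretry() {
    param_file="${1:-/sys/module/amdgpu/parameters/noretry}"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        # Module not loaded, or the parameter is not exposed on this kernel.
        echo "unavailable"
    fi
}

echo "amdgpu.noretry: $(read_noretry)"
```

To test, boot with amdgpu.noretry=0 on the kernel command line and repeat
the vlc playback. Note that sysfs shows the requested value, so -1 does not
reveal which default the driver chose.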

Alex, as the author of this commit, could you help me with it?


-- 
Best Regards,
Mike Gavrilov.