✗ CI.checkpatch: warning for series starting with [1/3] drm/xe/trace: improve xe_sched_msg trace

Patchwork patchwork at emeril.freedesktop.org
Fri Nov 22 16:28:01 UTC 2024


== Series Details ==

Series: series starting with [1/3] drm/xe/trace: improve xe_sched_msg trace
URL   : https://patchwork.freedesktop.org/series/141705/
State : warning

== Summary ==

+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
30ab6715fc09baee6cc14cb3c89ad8858688d474
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit 553114143726c4e3493b19c1d26e1bcdb53b2411
Author: Matthew Auld <matthew.auld at intel.com>
Date:   Fri Nov 22 16:19:17 2024 +0000

    drm/xe/guc_submit: fix race around suspend_pending
    
    Currently in some testcases we can trigger:
    
    xe 0000:03:00.0: [drm] Assertion `exec_queue_destroyed(q)` failed!
    ....
    WARNING: CPU: 18 PID: 2640 at drivers/gpu/drm/xe/xe_guc_submit.c:1826 xe_guc_sched_done_handler+0xa54/0xef0 [xe]
    xe 0000:03:00.0: [drm] *ERROR* GT1: DEREGISTER_DONE: Unexpected engine state 0x00a1, guc_id=57
    
    Looking at a snippet of corresponding ftrace for this GuC id we can see:
    
    162.673311: xe_sched_msg_add:     dev=0000:03:00.0, gt=1 guc_id=57, opcode=3
    162.673317: xe_sched_msg_recv:    dev=0000:03:00.0, gt=1 guc_id=57, opcode=3
    162.673319: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0
    162.674089: xe_exec_queue_kill:   dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0
    162.674108: xe_exec_queue_close:  dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0
    162.674488: xe_exec_queue_scheduling_done: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0
    162.678452: xe_exec_queue_deregister: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa1, flags=0x0
    
    It looks like we try to suspend the queue (opcode=3), setting
    suspend_pending and triggering a disable_scheduling. The user then
    closes the queue. However closing the queue seems to forcefully signal
    the fence after killing the queue, however when the G2H response for
    disable_scheduling comes back we have now cleared suspend_pending when
    signalling the suspend fence, so the disable_scheduling now incorrectly
    tries to also deregister the queue, leading to warnings since the queue
    has yet to even be marked for destruction. We also seem to trigger
    errors later with trying to double unregister the same queue.
    
    To fix this tweak the ordering when handling the response to ensure we
    don't race with a disable_scheduling that doesn't actually intend to
    actually unregister.  The destruction path should now also correctly
    wait for any pending_disable before marking as destroyed.
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3371
    Signed-off-by: Matthew Auld <matthew.auld at intel.com>
    Cc: Matthew Brost <matthew.brost at intel.com>
    Cc: <stable at vger.kernel.org> # v6.8+
+ /mt/dim checkpatch a381faddbfc974e7bd57efe953a738415afccd6a drm-intel
1cbe4629b071 drm/xe/trace: improve xe_sched_msg trace
-:33: WARNING:LONG_LINE: line length of 116 exceeds 100 columns
#33: FILE: drivers/gpu/drm/xe/xe_trace.h:299:
+		    TP_printk("dev=%s, gt=%u guc_id=%d, opcode=%u", __get_str(dev), __entry->gt_id, __entry->guc_id,

total: 0 errors, 1 warnings, 0 checks, 19 lines checked
81e3dd175070 drm/xe/guc_submit: fix race around pending_disable
-:8: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#8: 
[drm] *ERROR* GT0: SCHED_DONE: Unexpected engine state 0x02b1, guc_id=8, runnable_state=0

-:83: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#83: FILE: drivers/gpu/drm/xe/xe_guc_submit.c:1337:
+		wait_event(guc->ct.wq, (q->guc->resume_time != RESUME_PENDING ||
+			   xe_guc_read_stopped(guc)) && !exec_queue_pending_disable(q));

total: 0 errors, 1 warnings, 1 checks, 37 lines checked
553114143726 drm/xe/guc_submit: fix race around suspend_pending
-:15: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#15: 
162.673311: xe_sched_msg_add:     dev=0000:03:00.0, gt=1 guc_id=57, opcode=3

total: 0 errors, 1 warnings, 0 checks, 32 lines checked
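
For reference, the two length limits quoted in the warnings above (75 columns for commit log lines, 100 columns for code lines) can be approximated locally with a minimal sketch like the one below. This is not how checkpatch.pl or dim implements the checks. The helper name long_lines and the constants are hypothetical, and the 8-column tab width is an assumption based on kernel coding style.

```python
# Hypothetical helper for illustration only; not part of checkpatch.pl or dim.
COMMIT_LOG_MAX = 75   # limit quoted in the COMMIT_LOG_LONG_LINE warnings
CODE_MAX = 100        # limit quoted in the LONG_LINE warning
TAB_WIDTH = 8         # assumption: a tab counts as 8 columns (kernel style)


def long_lines(lines, limit, tab_width=TAB_WIDTH):
    """Return (line_number, width) for every line wider than `limit` columns.

    Line numbers are 1-based. Tabs are expanded before measuring so that
    indented code is counted in display columns rather than characters.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        width = len(line.rstrip("\n").expandtabs(tab_width))
        if width > limit:
            hits.append((number, width))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python long_lines.py <limit> < file
    limit = int(sys.argv[1]) if len(sys.argv) > 1 else CODE_MAX
    for number, width in long_lines(sys.stdin, limit):
        print(f"line {number}: {width} columns exceeds {limit}")
```

Feeding a commit message through it with a limit of 75, or a source file with a limit of 100, lists the lines that would need rewrapping. Pasted trace output such as the ftrace lines in these commit messages is often left unwrapped on purpose, so the commit log warnings may be acceptable as they stand.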