[PATCH] drm/v3d: Avoid NULL pointer dereference in `v3d_job_update_stats()`
Chema Casanova
jmcasanova at igalia.com
Thu Jun 5 09:27:47 UTC 2025
This patch is:
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova at igalia.com>
But I have a question. Could we see this scenario of the process finishing
before other access to job->file->driver_priv in other parts of the v3d
driver?
Regards,
Chema
El 2/6/25 a las 17:14, Maíra Canal escribió:
> The following kernel Oops was recently reported by Mesa CI:
>
> [ 800.139824] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000588
> [ 800.148619] Mem abort info:
> [ 800.151402] ESR = 0x0000000096000005
> [ 800.155141] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 800.160444] SET = 0, FnV = 0
> [ 800.163488] EA = 0, S1PTW = 0
> [ 800.166619] FSC = 0x05: level 1 translation fault
> [ 800.171487] Data abort info:
> [ 800.174357] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> [ 800.179832] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 800.184873] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 800.190176] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001014c2000
> [ 800.196607] [0000000000000588] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
> [ 800.205305] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> [ 800.211564] Modules linked in: vc4 snd_soc_hdmi_codec drm_display_helper v3d cec gpu_sched drm_dma_helper drm_shmem_helper drm_kms_helper drm drm_panel_orientation_quirks snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm i2c_brcmstb snd_timer snd backlight
> [ 800.234448] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.25+rpt-rpi-v8 #1 Debian 1:6.12.25-1+rpt1
> [ 800.244182] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
> [ 800.250005] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 800.256959] pc : v3d_job_update_stats+0x60/0x130 [v3d]
> [ 800.262112] lr : v3d_job_update_stats+0x48/0x130 [v3d]
> [ 800.267251] sp : ffffffc080003e60
> [ 800.270555] x29: ffffffc080003e60 x28: ffffffd842784980 x27: 0224012000000000
> [ 800.277687] x26: ffffffd84277f630 x25: ffffff81012fd800 x24: 0000000000000020
> [ 800.284818] x23: ffffff8040238b08 x22: 0000000000000570 x21: 0000000000000158
> [ 800.291948] x20: 0000000000000000 x19: ffffff8040238000 x18: 0000000000000000
> [ 800.299078] x17: ffffffa8c1bd2000 x16: ffffffc080000000 x15: 0000000000000000
> [ 800.306208] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> [ 800.313338] x11: 0000000000000040 x10: 0000000000001a40 x9 : ffffffd83b39757c
> [ 800.320468] x8 : ffffffd842786420 x7 : 7fffffffffffffff x6 : 0000000000ef32b0
> [ 800.327598] x5 : 00ffffffffffffff x4 : 0000000000000015 x3 : ffffffd842784980
> [ 800.334728] x2 : 0000000000000004 x1 : 0000000000010002 x0 : 000000ba4c0ca382
> [ 800.341859] Call trace:
> [ 800.344294] v3d_job_update_stats+0x60/0x130 [v3d]
> [ 800.349086] v3d_irq+0x124/0x2e0 [v3d]
> [ 800.352835] __handle_irq_event_percpu+0x58/0x218
> [ 800.357539] handle_irq_event+0x54/0xb8
> [ 800.361369] handle_fasteoi_irq+0xac/0x240
> [ 800.365458] handle_irq_desc+0x48/0x68
> [ 800.369200] generic_handle_domain_irq+0x24/0x38
> [ 800.373810] gic_handle_irq+0x48/0xd8
> [ 800.377464] call_on_irq_stack+0x24/0x58
> [ 800.381379] do_interrupt_handler+0x88/0x98
> [ 800.385554] el1_interrupt+0x34/0x68
> [ 800.389123] el1h_64_irq_handler+0x18/0x28
> [ 800.393211] el1h_64_irq+0x64/0x68
> [ 800.396603] default_idle_call+0x3c/0x168
> [ 800.400606] do_idle+0x1fc/0x230
> [ 800.403827] cpu_startup_entry+0x40/0x50
> [ 800.407742] rest_init+0xe4/0xf0
> [ 800.410962] start_kernel+0x5e8/0x790
> [ 800.414616] __primary_switched+0x80/0x90
> [ 800.418622] Code: 8b170277 8b160296 11000421 b9000861 (b9401ac1)
> [ 800.424707] ---[ end trace 0000000000000000 ]---
> [ 800.457313] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
>
> This issue happens when the file descriptor is closed before the jobs
> submitted by it are completed. When the job completes, we update the
> global GPU stats and the per-fd GPU stats, which are exposed through
> fdinfo. If the file descriptor was closed, then the struct `v3d_file_priv`
> and its stats were already freed and we can't update the per-fd stats.
>
> Therefore, if the file descriptor was already closed, don't update the
> per-fd GPU stats, only update the global ones.
>
> Cc: stable at vger.kernel.org # v6.12+
> Signed-off-by: Maíra Canal <mcanal at igalia.com>
> ---
> drivers/gpu/drm/v3d/v3d_sched.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 466d28ceee28..5ed676304964 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -199,7 +199,6 @@ v3d_job_update_stats(struct v3d_job *job, enum v3d_queue queue)
> struct v3d_dev *v3d = job->v3d;
> struct v3d_file_priv *file = job->file->driver_priv;
> struct v3d_stats *global_stats = &v3d->queue[queue].stats;
> - struct v3d_stats *local_stats = &file->stats[queue];
> u64 now = local_clock();
> unsigned long flags;
>
> @@ -209,7 +208,12 @@ v3d_job_update_stats(struct v3d_job *job, enum v3d_queue queue)
> else
> preempt_disable();
>
> - v3d_stats_update(local_stats, now);
> + /* Don't update the local stats if the file context has already closed */
> + if (file)
> + v3d_stats_update(&file->stats[queue], now);
> + else
> + drm_dbg(&v3d->drm, "The file descriptor was closed before job completion\n");
> +
> v3d_stats_update(global_stats, now);
>
> if (IS_ENABLED(CONFIG_LOCKDEP))
More information about the dri-devel
mailing list