[bug report] accel/habanalabs: enforce release order of compute device and dma-buf
Tomer Tayar
ttayar at habana.ai
Fri Jul 26 14:56:04 UTC 2024
On 26/07/2024 1:33, Dan Carpenter wrote:
On Thu, Jul 25, 2024 at 08:21:51AM +0000, Tomer Tayar wrote:
On 24/07/2024 19:08, Dan Carpenter wrote:
Hello Tomer Tayar,
Commit 09524eb8824e ("accel/habanalabs: enforce release order of
compute device and dma-buf") from Jan 22, 2023 (linux-next), leads to
the following Smatch static checker warning:
drivers/accel/habanalabs/common/memory.c:1844 hl_release_dmabuf()
error: dereferencing freed memory 'ctx' (line 1841)
drivers/accel/habanalabs/common/memory.c
    1827 static void hl_release_dmabuf(struct dma_buf *dmabuf)
    1828 {
    1829         struct hl_dmabuf_priv *hl_dmabuf = dmabuf->priv;
    1830         struct hl_ctx *ctx;
    1831
    1832         if (!hl_dmabuf)
    1833                 return;
    1834
    1835         ctx = hl_dmabuf->ctx;
    1836
    1837         if (hl_dmabuf->memhash_hnode)
    1838                 memhash_node_export_put(ctx, hl_dmabuf->memhash_hnode);
    1839
    1840         atomic_dec(&ctx->hdev->dmabuf_export_cnt);
    1841         hl_ctx_put(ctx);
                 ^^^
                 This will free ctx on the last reference
    1842
    1843         /* Paired with get_file() in export_dmabuf() */
--> 1844         fput(ctx->hpriv->file_priv->filp);
                      ^^^
                      Potential use after free
Thanks for notifying us about this warning.
Actually, because of this commit, the call to hl_ctx_put() here cannot
be last.
The release of the device file has another reference decrement [
hl_device_release() -> hl_ctx_mgr_fini() ], and this change prevents
that release as long as a dma-buf object is alive.
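
[Editor's illustration, not part of the original message.] The ordering argument above can be modelled with a small user-space sketch. Every toy_* name below is hypothetical and only stands in for the kernel objects named in the thread. The sketch assumes the context keeps one reference that is dropped only when the device file is released, and that an exported dma-buf pins both the context and the file.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; these are NOT the real habanalabs or VFS types. */
struct toy_ctx  { int refs; bool freed; };
struct toy_file { int refs; bool released; struct toy_ctx *ctx; };

/* Models hl_ctx_put(): the context dies when its last reference goes. */
static void toy_ctx_put(struct toy_ctx *ctx)
{
	if (--ctx->refs == 0)
		ctx->freed = true;	/* real code would free the object here */
}

/*
 * Models fput(): only the last put runs the release path, which stands in
 * for hl_device_release() -> hl_ctx_mgr_fini() dropping the context ref.
 */
static void toy_fput(struct toy_file *f)
{
	if (--f->refs == 0) {
		f->released = true;
		toy_ctx_put(f->ctx);
	}
}

/* Models the export path: the dma-buf pins the context and the file. */
static void toy_export(struct toy_file *f)
{
	f->ctx->refs++;
	f->refs++;		/* stands in for get_file() */
}

/* Models hl_release_dmabuf(): context put first, then the paired fput. */
static bool toy_release_dmabuf(struct toy_file *f)
{
	struct toy_ctx *ctx = f->ctx;
	bool ctx_alive_at_fput;

	toy_ctx_put(ctx);
	ctx_alive_at_fput = !ctx->freed;	/* the point Smatch flags */
	toy_fput(f);
	return ctx_alive_at_fput;
}

/* The user closes the device fd while a dma-buf is still alive. */
static bool toy_scenario(void)
{
	struct toy_ctx ctx = { .refs = 1, .freed = false };
	struct toy_file file = { .refs = 1, .released = false, .ctx = &ctx };
	bool alive;

	toy_export(&file);			/* ctx.refs = 2, file.refs = 2 */
	toy_fput(&file);			/* user close: release is deferred */
	alive = toy_release_dmabuf(&file);	/* ctx still referenced at fput */
	return alive && file.released && ctx.freed;
}
```

In this model the context put inside toy_release_dmabuf() can never be the last one, because the file's own context reference is dropped only by the final toy_fput() that follows it.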
Thanks for looking at this. To be honest, I'm just going to take this
on trust. ;) I looked at the code but refcounting is always a bit
tricky.
However, I will revise the function to get a pointer to
'ctx->hpriv->file_priv->filp' before calling hl_ctx_put(), so we won't
have the warning.
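
[Editor's illustration, not part of the original message.] The revision proposed above, reading the file pointer before dropping the context reference, might look like the following user-space sketch. The toy_* names are hypothetical stand-ins, not the driver's code, and the "free" is simulated by poisoning the pointer field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the real habanalabs or VFS types. */
struct toy_file { int refs; };
struct toy_ctx  { int refs; struct toy_file *filp; };

/* Models hl_ctx_put(); clearing filp simulates the object being freed. */
static void toy_ctx_put(struct toy_ctx *ctx)
{
	if (--ctx->refs == 0)
		ctx->filp = NULL;
}

/* Read the file pointer first, so nothing touches ctx after the put. */
static void toy_release(struct toy_ctx *ctx)
{
	struct toy_file *filp = ctx->filp;

	toy_ctx_put(ctx);
	filp->refs--;		/* stands in for fput(filp) */
}

/* Works even in the hypothetical case where the put is the last one. */
static bool toy_cached_pointer_demo(void)
{
	struct toy_file file = { .refs = 1 };
	struct toy_ctx ctx = { .refs = 1, .filp = &file };

	toy_release(&ctx);
	return file.refs == 0 && ctx.filp == NULL;
}
```

This ordering is safe whether or not the put is the last reference, which is why it would silence the checker. It does not change behavior when, as argued in this thread, the put can never be the last one.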
Please, don't do things just to make the static checker happy. These
refcounted use after free warnings are prone to false positives. I try
to do some sanity checking but I'm not a domain expert in this subsystem.
So, you know, just look at the warning and ignore it if it's wrong.
These warnings are a one time email. Everyone who works on the kernel
is really good about fixing bugs so we assume all old warnings are false
positives. Plus if they have questions they can search lore for this
email thread.
regards,
dan carpenter
Okay, no problem, I won't change the code.
Unless you are against it, I think I can still add a short comment before
calling fput(), explaining why it is okay to access 'ctx' at that point.
Thanks,
Tomer