[bug report] accel/habanalabs: enforce release order of compute device and dma-buf

Tomer Tayar ttayar at habana.ai
Fri Jul 26 14:56:04 UTC 2024


On 26/07/2024 1:33, Dan Carpenter wrote:

On Thu, Jul 25, 2024 at 08:21:51AM +0000, Tomer Tayar wrote:


On 24/07/2024 19:08, Dan Carpenter wrote:


Hello Tomer Tayar,

Commit 09524eb8824e ("accel/habanalabs: enforce release order of
compute device and dma-buf") from Jan 22, 2023 (linux-next), leads to
the following Smatch static checker warning:

        drivers/accel/habanalabs/common/memory.c:1844 hl_release_dmabuf()
        error: dereferencing freed memory 'ctx' (line 1841)

drivers/accel/habanalabs/common/memory.c
     1827 static void hl_release_dmabuf(struct dma_buf *dmabuf)
     1828 {
     1829         struct hl_dmabuf_priv *hl_dmabuf = dmabuf->priv;
     1830         struct hl_ctx *ctx;
     1831
     1832         if (!hl_dmabuf)
     1833                 return;
     1834
     1835         ctx = hl_dmabuf->ctx;
     1836
     1837         if (hl_dmabuf->memhash_hnode)
     1838                 memhash_node_export_put(ctx, hl_dmabuf->memhash_hnode);
     1839
     1840         atomic_dec(&ctx->hdev->dmabuf_export_cnt);
     1841         hl_ctx_put(ctx);
                             ^^^
This will free ctx on the last reference

     1842
     1843         /* Paired with get_file() in export_dmabuf() */
--> 1844         fput(ctx->hpriv->file_priv->filp);
                       ^^^
Potential use after free



Thanks for notifying us about this warning.

Actually, because of this commit, the call to hl_ctx_put() here cannot
be last.
The release of the device file has another reference decrement [
hl_device_release() -> hl_ctx_mgr_fini() ], and this change prevents
that release as long as a dma-buf object is alive.



Thanks for looking at this.  To be honest, I'm just going to take this
on trust.  ;)  I looked at the code but refcounting is always a bit
tricky.




However, I will revise the function to get a pointer to
'ctx->hpriv->file_priv->filp' before calling hl_ctx_put(), so we won't
have the warning.



Please, don't do things just to make the static checker happy.  These
refcounted use after free warnings are prone to false positives.  I try
to do some sanity checking but I'm not a domain expert in this subsystem.
So, you know, just look at the warning and ignore it if it's wrong.

These warnings are a one time email.  Everyone who works on the kernel
is really good about fixing bugs so we assume all old warnings are false
positives.  Plus if they have questions they can search lore for this
email thread.

regards,
dan carpenter

Okay, no problem, I wont change the code.

Unless you are against it, I think I can still add a short comment before calling fput(), explaining why it is okay to access 'ctx' at that point. Thanks, Tomer
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20240726/35034324/attachment.htm>


More information about the dri-devel mailing list