[PATCH] drm/i915/display: Change ret value type from int to long

Dan Carpenter dan.carpenter at linaro.org
Mon Jul 14 17:06:49 UTC 2025


On Fri, Jul 04, 2025 at 03:00:55PM +0300, Jani Nikula wrote:
> On Fri, 04 Jul 2025, Aakash Deep Sarkar <aakash.deep.sarkar at intel.com> wrote:
> > dma_fence_wait_timeout returns a long type but the driver is
> > only using the lower 32 bits of the retval and discarding the
> > upper 32 bits.
> >
> > This is particularly problematic if there are already signalled
> > or stub fences on some of the hw planes. In this case the
> > dma_fence_wait_timeout function will immediately return with
> > timeout value MAX_SCHEDULE_TIMEOUT (0x7fffffffffffffff) since
> > the fence is already signalled. If the driver only uses the lower
> > 32 bits of this return value then it'll interpret it as an error
> > code (0xFFFFFFFF or (-1)) and skip the wait on the remaining fences.
> >
> > This issue was first observed with the Android compositor where
> > the GPU composited layer was not properly waited on when there
> > were stub fences in other overlay planes resulting in significant
> > visual artifacts.
> 
> Thanks for the patch, good catch!
> 
> > Test: No graphical artifacts with shadertoy
> 
> We've never used this commit trailer before, please let's not start now.
> 
> The subject should plainly state the "what", and the commit message
> should answer the "why". You do have that here, but I think the subject
> is still a bit too much nuts and bolts.
> 
> For example,
> 
> 	drm/i915/display: Fix dma_fence_wait_timeout() return value handling
> 
> would state the "what" in *much* more helpful terms for anyone looking
> at git logs.
> 
> Presumably this has been an error for some time. You should do a little
> bit of git blame on the offending lines. It'll find commit d59cf7bb73f3
> ("drm/i915/display: Use dma_fence interfaces instead of i915_sw_fence").
> 
> Based on that, we should add:
> 
> Fixes: d59cf7bb73f3 ("drm/i915/display: Use dma_fence interfaces instead of i915_sw_fence")
> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> Cc: Jouni Högander <jouni.hogander at intel.com>
> Cc: <stable at vger.kernel.org> # v6.8+
> 
> Then it occurs to me this looks like a common mistake to make. A little
> bit of git grep on dma_fence_wait_timeout() quickly finds multiple
> similar mistakes in drm, at least amdgpu, etnaviv, msm, and tegra. Cc
> some maintainers FYI. This class of bugs could cause issues elsewhere.
> 
> Let's also Cc Julia and Dan in case they have ideas to improve static
> analysis to catch this class of bugs. Or maybe one exists already, but
> we're just not running them!

It's easy enough to warn about when we have:

	ret = dma_fence_wait_timeout();

and ret is an int.

In Smatch I actually had hardcoded dma_fence_wait_timeout() as only
returning up to INT_MAX because there were enough places which saved it
as an int and it triggered false positives in callers where we knew the
timeout was reasonable.

regards,
dan carpenter



More information about the Intel-gfx mailing list