[Mesa-dev] [PATCH] i965: Fix comment about DRM_IOCTL_I915_GEM_WAIT.
Kristian Høgsberg
krh at bitplanet.net
Wed Jul 15 13:27:24 PDT 2015
On Wed, Jul 15, 2015 at 12:41 PM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> On Wed, Jul 15, 2015 at 12:20:15PM -0700, Kristian Høgsberg wrote:
>> On Wed, Jul 15, 2015 at 10:22 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
>> > From: Chris Wilson <chris at chris-wilson.co.uk>
>> >
>> > The kernel actually waits forever when supplied a timeout value < 0,
>> > rather than returning immediately. See i915_gem_wait_ioctl() in
>> > i915_gem.c's call to __i915_wait_request().
>> >
>> > (split by Ken from a large patch authored by Chris Wilson)
>> >
>> > Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
>> > ---
>> > src/mesa/drivers/dri/i965/intel_syncobj.c | 6 +++---
>> > 1 file changed, 3 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/intel_syncobj.c b/src/mesa/drivers/dri/i965/intel_syncobj.c
>> > index c44c4be..c2f4fa9 100644
>> > --- a/src/mesa/drivers/dri/i965/intel_syncobj.c
>> > +++ b/src/mesa/drivers/dri/i965/intel_syncobj.c
>> > @@ -105,9 +105,9 @@ brw_fence_client_wait(struct brw_context *brw, struct brw_fence *fence,
>> > assert(fence->batch_bo);
>> >
>> > /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
>> > - * immediately for timeouts <= 0. The best we can do is to clamp the
>> > - * timeout to INT64_MAX. This limits the maximum timeout from 584 years to
>> > - * 292 years - likely not a big deal.
>> > + * immediately for timeout == 0, and indefinitely if timeout is negative.
>> > + * The best we can do is to clamp the timeout to INT64_MAX. This limits
>> > + * the maximum timeout from 584 years to 292 years - likely not a big deal.
>>
>> No, there are kernel versions in the wild that has this bug. The
>> comment after the patch says:
>>
>> /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
>> * immediately for timeout == 0, and indefinitely if timeout is negative.
>> * The best we can do is to clamp the timeout to INT64_MAX. This limits
>> * the maximum timeout from 584 years to 292 years - likely not a big deal.
>>
>> which doesn't make sense. If we feel like we need to point out that
>> we've fixed the bug, that's fine, but keep the part about how some
>> kernels are broken so it's clear why the workaround is needed.
>
> This is not a workaround for the kernel bug but an impedence mismatch
> for the uint64_t GL interface and int64_t kernel interface.
We used to pass the GL uint64_t to the kernel, and before 3.17 that
looked like it did thee right thing. Of course, any value over
INT64_MAX would wait forever, but it's hard to tell the difference
between that and 292 years. When the kernel bug happened, the common
"wait for UINT64_MAX ns" case broke. Not sure if the old behavior was
intentional, but it did handle the impedance mismatch (perhaps
inadvenrtently) as well as the current code. When the kernel broke, I
changed the way we handle the mismatch - not because the "wait 292
years" behaviour is better, but because it works around the kernel
bug. So I'd certainly call the current behavior a workaround as much
as it's handling impedance mismatch.
> The text certainly didn't describe the kernel interface at the time it
> was introduced nor at the present time. If you want to rewrite it to
> mention the bug, then do so, but don't do that claiming to be a
> description of the kernel interface.
>
> If you really want to mention the inconsequential bug, then include:
>
> * except once upon a time, shortly from 3.17, a bug was introduced that
> * made the kernel return instantly for negative timeouts.
>
> It doesn't change the code in the slightest since you still have to make
> a judgment call about what to do with the timeout values of greater than
> 292 years. Looping at 292 years is closer than never.
It's not about changing the code, it's about documenting why we're
doing it this way instead of the "round up to infinity" for the
timeout > INT64_MAX cases. It's not an inconsequential bug, it breaks
the main use case of GL_ARB_sync, so I think it's worth having a
comment there to avoid accidentally regressing that.
Kristian
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
More information about the mesa-dev
mailing list