[Intel-gfx] [PATCH] drm/i915: Fix an incorrect free rather than derefence issue.

Daniel Vetter daniel at ffwll.ch
Fri Feb 13 04:04:39 PST 2015


On Fri, Feb 13, 2015 at 10:58 AM, Nick Hoath <nicholas.hoath at intel.com> wrote:
> On 13/02/2015 09:32, Daniel Vetter wrote:
>>
>> On Thu, Feb 12, 2015 at 12:29:21PM +0000, Nick Hoath wrote:
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652
>>>
>>> Signed-off-by: Nick Hoath <nicholas.hoath at intel.com>
>>
>>
>> Commit message is missing the absolutely crucial detail about which patch
>> introduced this regression:
>>
>> commit 6d3d8274bc45de4babb62d64562d92af984dd238
>> Author:     Nick Hoath <nicholas.hoath at intel.com>
>> AuthorDate: Thu Jan 15 13:10:39 2015 +0000
>>
>>      drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request
>>
>> Another thing I've noticed is that we explicitly drop the context
>> reference for the request before dropping the request reference. Without
>> clearing the req->ctx pointer. That has a very high chance to leading to
>> tears, imo the context unreferenceing should be pushed into
>> i915_gem_request_free.
>>
>> Except that it's there already, which means we have a double unref now?
>
>
> Looking at the code, it looks like that's the case.
>
>>
>> Also this patch is for the legacy ringbuffer code, but the referenced bug
>> is for gen8+ execlists. We're definitely not running this code here I
>> think.
>
>
> i915_gem_reset_ring_cleanup is used in execlists in the hang recovery case.

Oops I've missed that, somehow I've thought intel_lrc.c has it's own
reset recovery code. But I've just mixed up function names a bit.

>> Imo step one is to drop all the explicit ctx refcounting for req->ctx and
>> always rely on the implicit reference. Then see what happens.
>
>
> I agree that the refcounting needs re-evaluating after the merge of execlist
> queue entries & requests, however I think the cleanup of the double
> unref/removing the refcounting should be done in another patchset. This
> patch is purely to fix the issue raised in 88652. Depends on the relative
> priorities.

Well this patch here won't work because there's now a double unref.
And I spotted that one because intel_execlists_retire_requests seems
to have the exact same double unref already. Hence why I think we
should fix up that first and then reasses what's left.

The bug here (before your patch) is just a use-after-free, if there's
some other reference to the request. And it will also be fixed with
the redone req->ctx refcounting.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


More information about the Intel-gfx mailing list