[Mesa-dev] [PATCH] nouveau: fix double free when screen_create fails

samuel.pitoiset samuel.pitoiset at gmail.com
Mon Dec 28 12:48:07 PST 2015



On 12/11/2015 21:21, Samuel Pitoiset wrote:
>
>
> On 11/12/2015 08:51 PM, Samuel Pitoiset wrote:
>> Hi Emil,
>>
>> On 11/10/2015 04:35 PM, Emil Velikov wrote:
>>> Hi Samuel,
>>>
>>> Sorry about this I thought I already replied :-\
>>>
>>> On 29 October 2015 at 22:22, Samuel Pitoiset
>>> <samuel.pitoiset at gmail.com> wrote:
>>>> On 10/27/2015 02:01 PM, samuel.pitoiset wrote:
>>>>> On 27/10/2015 12:52, Emil Velikov wrote:
>>>>>>
>>>>>> On 27 October 2015 at 10:50, samuel.pitoiset
>>>>>> <samuel.pitoiset at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> On 27/10/2015 11:37, Emil Velikov wrote:
>>>>>>>>
>>>>>>>> On 22 October 2015 at 00:16, Julien Isorce
>>>>>>>> <julien.isorce at gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> The real fix is in nouveau_drm_winsys.c by setting dev to 0.
>>>>>>>>> Which means dev's ownership has been passed to previous call.
>>>>>>>>> Other changes are there to be consistent with what the
>>>>>>>>> screen_create functions already do on errors.
>>>>>>>>>
>>>>>>>>> Encountered this crash because nvc0_screen_create sometimes fails
>>>>>>>>> with:
>>>>>>>>> nvc0_screen_create:717 - Error allocating PGRAPH context for
>>>>>>>>> M2MF: -16
>>>>>>>>> Also see: https://bugs.freedesktop.org/show_bug.cgi?id=70354
>>>>>>>>>
>>>>>>>>> Signed-off-by: Julien Isorce <j.isorce at samsung.com>
>>>>>>>>> ---
>>>>>>>>>     src/gallium/drivers/nouveau/nv30/nv30_screen.c      | 5 ++++-
>>>>>>>>>     src/gallium/drivers/nouveau/nv50/nv50_screen.c      | 4 +++-
>>>>>>>>>     src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c | 2 ++
>>>>>>>>>     3 files changed, 9 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>>>>>>>>> b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>>>>>>>>> index 0330164..9b8ddac 100644
>>>>>>>>> --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>>>>>>>>> +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>>>>>>>>> @@ -425,8 +425,10 @@ nv30_screen_create(struct nouveau_device
>>>>>>>>> *dev)
>>>>>>>>>        unsigned oclass = 0;
>>>>>>>>>        int ret, i;
>>>>>>>>>
>>>>>>>>> -   if (!screen)
>>>>>>>>> +   if (!screen) {
>>>>>>>>> +      nouveau_device_del(&dev);
>>>>>>>>>           return NULL;
>>>>>>>>> +   }
>>>>>>>>>
>>>>>>>> Imho having these in screen_create() seems like the wrong 'layer'.
>>>>>>>> Shouldn't one call nouveau_device_dev() from within
>>>>>>>> nouveau_drm_screen_unref
>>>>>>>>     and explicitly call the latter if the calloc() (here and in
>>>>>>>> nv50/nvc0)
>>>>>>>> fails ?
>>>>>>>
>>>>>>>
>>>>>>> We can't do that because nouveau_drm_screen_unref() needs a valid
>>>>>>> nouveau_screen
>>>>>>> object and in this case it is NULL.
>>>>>>>
>>>>>> Ouch I was under the impression that we've brought back the
>>>>>> concept of
>>>>>> winsys in nouveau with the hash_table patches. Seems like we haven't
>>>>>> :(
>>>>>>
>>>>>> If we are to do so (split things just like the radeon/amdgpu winsys)
>>>>>> then we can kill two birds with one stone. The missing
>>>>>> device_del() on
>>>>>> calloc failure as well as other error paths in nvxx_screen_create().
>>>>>
>>>>>
>>>>> Okay, I'll have a look at how radeon/amdgpu split those things.
>>>>
>>>>
>>>> Well, this doesn't seem to be "trivial" to do it properly actually.
>>>> This is on my todolist (but not with a top priority) so, if someone
>>>> else want to send a patch for this stuff, feel free to do it. :)
>>>>
>>> On the contrary - it's pretty trivial 99% of the work is either code
>>> movement or sed job.
>>> On the other hand, it's might not turn out to be stable material
>>> (rather large diff). So if please a comment or two (something
>>> resembling my suggestion) and get feel free to push it.
>>>
>>> Roughly how many things do you have in your mesa todo list prior to
>>> nouveau_winsys ?
>>
>> Lot of things, mostly related to performance counters! ;)
>> Fixing a segfault when something else has failed doesn't sound like to
>> be a top priority for me. But... I agree this should be fixed, I'll have
>> a look this month.
>
> I meant, this segfault only happens when nvXX_screen_create() fails,
> it's pretty rare, and it's not a critical issue. :)

This issue has probably been fixed along with 
323d4da372298900ce02293a682ba563ac29f4cb (nouveau: fix screen creation 
failure paths).

If someone get a chance to test it, please report if it works.
Thanks.

>
>>
>>>
>>> Cheers,
>>> Emil
>>>


More information about the mesa-dev mailing list