[Mesa-dev] [Mesa-stable] [PATCH] clover: Check the return value of pipe_loader_probe() when probing for devices

Francisco Jerez currojerez at riseup.net
Sat Nov 28 07:10:32 PST 2015


Emil Velikov <emil.l.velikov at gmail.com> writes:

> Hi Francisco,
>
> In all due respect, I am surprised with the volume of justification
> you put into some changes.
>
And apparently it wasn't enough? :P

> On 28 November 2015 at 11:52, Francisco Jerez <currojerez at riseup.net> wrote:
>> Francisco Jerez <currojerez at riseup.net> writes:
>>
>>> Tom Stellard <thomas.stellard at amd.com> writes:
>>>
>>>> When probing for devices, clover will call pipe_loader_probe() twice.
>>>> The first time to retrieve the number of devices, and then second time
>>>> to retrieve the device structures.
>>>>
>>>> We currently assume that the return value of both calls will be the
>>>> same, but this will not be the case if a device happens to disappear
>>>> between the two calls.
>>>>
>>>
>>> Assuming that the two sequential calls will give the same result is bad,
>>> but the likelihood of a device disappearing between the two calls is so
>>> small it's unlikely to happen more than once in a million years.  If
> Err... have to love this estimate :-P With some recent chances that
> probability increases ever so slightly, as we push the "can we find
> the module, does it have the entry point" from .create_screen and
> .configure further up to probe screen.
>

I was referring to the probability of a device actually disappearing
From the system between the two calls.

>>> this problem happens deterministically to you there's almost certainly a
>>> more serious bug in pipe-loader that will cause it to return an
>>> incorrect device count during the first call.
>>>
> Care to give an example. I'm not sure where/how this is possible.
>
The immediately following paragraph explained how and where it's
possible: pipe_loader_sw_probe() returns one regardless of whether the
software device is actually available.

>>> Indeed, pipe_loader_sw_probe() seems to return 1 unconditionally if ndev
>>> is zero, so it will give a different device count depending on ndev if
>>> pipe_loader_sw_probe_null() happens to fail.  Seems like a bug we don't
>>> want to simply paper over.
>>>
> Can you elaborate what exactly the problem here ? If we ask "how many
> we have" sure we can provide that [fixed] number. On the other hand,
> if we want more information about it (them), there is no guarantee
> that we'll succeed.

The return value of pipe_loader_probe() is supposed to be the number of
devices actually available on the system, it doesn't depend on the devs
pointer or ndev parameter -- See the doxygen comment in pipe_loader.h
for the expected behaviour and pipe_loader_drm_probe() for a correct
implementation.

>
>>> Anyway I'm not saying that clover's assumption shouldn't be fixed too --
>>> See comment below.
>>>
>>>> This patch removes this assumption and checks the return value of the
>>>> second pipe_loader_probe() call to ensure it does not try to initialize
>>>> devices that no longer exits.
>>>>
>>>> CC: <mesa-stable at lists.freedesktop.org>
>>>> ---
>>>>  src/gallium/state_trackers/clover/core/platform.cpp | 5 +++--
>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/src/gallium/state_trackers/clover/core/platform.cpp b/src/gallium/state_trackers/clover/core/platform.cpp
>>>> index 328b71c..689d692 100644
>>>> --- a/src/gallium/state_trackers/clover/core/platform.cpp
>>>> +++ b/src/gallium/state_trackers/clover/core/platform.cpp
>>>> @@ -28,9 +28,10 @@ platform::platform() : adaptor_range(evals(), devs) {
>>>>     int n = pipe_loader_probe(NULL, 0);
>>>>     std::vector<pipe_loader_device *> ldevs(n);
>>>>
>>>> -   pipe_loader_probe(&ldevs.front(), n);
>>>> +   n = pipe_loader_probe(&ldevs.front(), n);
>>>>
>>>> -   for (pipe_loader_device *ldev : ldevs) {
>>>> +   for (int i = 0; i < n; ++i) {
>>>> +      pipe_loader_device *ldev = ldevs[i];
>>>
>>> There are many reasons to prefer range loops to induction variables
>>> (including iterators), and there's no need to change it here: Because
>>> ldevs is zero-initialized you can just wrap the push_back(create()) call
>>> around an 'if (ldev)' to make this a one-liner.
>>>
>> Let me be more explicit about what could go wrong with this change in
>> its current form: The return value of pipe_loader_probe() is the number
>> of devices available on the system, so basically the same race condition
>> is still possible if 'n' increases from the first call to the second,
>> and in that case you'll end up reading uninitialized memory past the end
>> of the array -- Worse than a null dereference.  Range loops make this
>> kind of situation impossible.
>>
> If I understood you correctly, you're saying that on the second call
> can return count greater than the initial one.
>
> Errm... this looks like a bug in the pipe-loader. I don't think it's
> normal to claim that X devices are available only to return
> information for fewer devices. OK, calling it a bug might not be
> politically correct, although it is a misleading design decision, at
> the very least.
>
What I was saying is that if the rationale of this patch is correct and
clover may not make any assumptions about the number of available
devices remaining the same between calls to pipe_loader_probe(), this
patch doesn't fix the problem at all, and in fact introduces a worse
kind of bug (read from uninitialized memory).  The alternative I
suggested would be simpler and more robust, so it's worth changing
regardless of whether or not it's actually likely that the device count
will increase or decrease between calls.

> -Emil
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151128/934ab349/attachment.sig>


More information about the mesa-dev mailing list