[Mesa-stable] [PATCH] clover: Check the return value of pipe_loader_probe() when probing for devices

Emil Velikov emil.l.velikov at gmail.com
Sat Nov 28 06:42:04 PST 2015


Hi Francisco,

In all due respect, I am surprised with the volume of justification
you put into some changes.

On 28 November 2015 at 11:52, Francisco Jerez <currojerez at riseup.net> wrote:
> Francisco Jerez <currojerez at riseup.net> writes:
>
>> Tom Stellard <thomas.stellard at amd.com> writes:
>>
>>> When probing for devices, clover will call pipe_loader_probe() twice.
>>> The first time to retrieve the number of devices, and then second time
>>> to retrieve the device structures.
>>>
>>> We currently assume that the return value of both calls will be the
>>> same, but this will not be the case if a device happens to disappear
>>> between the two calls.
>>>
>>
>> Assuming that the two sequential calls will give the same result is bad,
>> but the likelihood of a device disappearing between the two calls is so
>> small it's unlikely to happen more than once in a million years.  If
Err... have to love this estimate :-P With some recent chances that
probability increases ever so slightly, as we push the "can we find
the module, does it have the entry point" from .create_screen and
.configure further up to probe screen.

>> this problem happens deterministically to you there's almost certainly a
>> more serious bug in pipe-loader that will cause it to return an
>> incorrect device count during the first call.
>>
Care to give an example. I'm not sure where/how this is possible.

>> Indeed, pipe_loader_sw_probe() seems to return 1 unconditionally if ndev
>> is zero, so it will give a different device count depending on ndev if
>> pipe_loader_sw_probe_null() happens to fail.  Seems like a bug we don't
>> want to simply paper over.
>>
Can you elaborate what exactly the problem here ? If we ask "how many
we have" sure we can provide that [fixed] number. On the other hand,
if we want more information about it (them), there is no guarantee
that we'll succeed.

>> Anyway I'm not saying that clover's assumption shouldn't be fixed too --
>> See comment below.
>>
>>> This patch removes this assumption and checks the return value of the
>>> second pipe_loader_probe() call to ensure it does not try to initialize
>>> devices that no longer exits.
>>>
>>> CC: <mesa-stable at lists.freedesktop.org>
>>> ---
>>>  src/gallium/state_trackers/clover/core/platform.cpp | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/state_trackers/clover/core/platform.cpp b/src/gallium/state_trackers/clover/core/platform.cpp
>>> index 328b71c..689d692 100644
>>> --- a/src/gallium/state_trackers/clover/core/platform.cpp
>>> +++ b/src/gallium/state_trackers/clover/core/platform.cpp
>>> @@ -28,9 +28,10 @@ platform::platform() : adaptor_range(evals(), devs) {
>>>     int n = pipe_loader_probe(NULL, 0);
>>>     std::vector<pipe_loader_device *> ldevs(n);
>>>
>>> -   pipe_loader_probe(&ldevs.front(), n);
>>> +   n = pipe_loader_probe(&ldevs.front(), n);
>>>
>>> -   for (pipe_loader_device *ldev : ldevs) {
>>> +   for (int i = 0; i < n; ++i) {
>>> +      pipe_loader_device *ldev = ldevs[i];
>>
>> There are many reasons to prefer range loops to induction variables
>> (including iterators), and there's no need to change it here: Because
>> ldevs is zero-initialized you can just wrap the push_back(create()) call
>> around an 'if (ldev)' to make this a one-liner.
>>
> Let me be more explicit about what could go wrong with this change in
> its current form: The return value of pipe_loader_probe() is the number
> of devices available on the system, so basically the same race condition
> is still possible if 'n' increases from the first call to the second,
> and in that case you'll end up reading uninitialized memory past the end
> of the array -- Worse than a null dereference.  Range loops make this
> kind of situation impossible.
>
If I understood you correctly, you're saying that on the second call
can return count greater than the initial one.

Errm... this looks like a bug in the pipe-loader. I don't think it's
normal to claim that X devices are available only to return
information for fewer devices. OK, calling it a bug might not be
politically correct, although it is a misleading design decision, at
the very least.

-Emil


More information about the mesa-stable mailing list