[Mesa-dev] [PATCH 07/10] util/u_queue: add an option to resize the queue when it's full

Tue Jul 11 15:25:53 UTC 2017

On Tue, Jul 11, 2017 at 12:05 PM, Grazvydas Ignotas <notasas at gmail.com> wrote:
> On Tue, Jul 11, 2017 at 12:21 AM, Marek Olšák <maraeo at gmail.com> wrote:
>> From: Marek Olšák <marek.olsak at amd.com>
>>
>> Consider the following situation:
>>   mtx_lock(mutex);
>>   do_something();
>>   util_queue_add_job(...);
>>   mtx_unlock(mutex);
>>
>> If the queue is full, util_queue_add_job will wait for a free slot.
>> If the job which is currently being executed tries to lock the mutex,
>> it will be stuck forever, because util_queue_add_job is stuck.
>>
>> The deadlock can be trivially resolved by increasing the queue size
>> (reallocating the queue) in util_queue_add_job if the queue is full.
>> Then util_queue_add_job becomes wait-free.
>>
>> radeonsi will use it.
>
> Can't this cause the queue to grow uncontrollably, like on GPU hangs,
> making already difficult to debug situations worse? Perhaps
> util_queue_add_job() could have a non-blocking-fail option and the
> caller could then retry after releasing the mutex for a bit.

The thing with GPU hangs is that the driver is unable to continue its
operation and will be stuck one way or another.

The caller can't release the mutex, because it has done an operation
(do_something() above) that must be done together with
util_queue_add_job and can't be separated. The atomicity of command
submission starts with the first mtx_lock call. Things are
irreversible after do_something(). The only two possible outcomes are
that util_queue_add_job either succeeds right away or waits and then
succeeds. There is no other option.
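
To make the resize idea from the patch description concrete, here is a
rough sketch of an add-job path that enlarges a full ring buffer instead
of waiting for a free slot. This is not the u_queue code: every name
(sketch_queue, sketch_queue_add_job, and so on) is made up for
illustration, and locking is left out on the assumption that a real
queue would run these functions under its own internal lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the u_queue code. */
struct sketch_job {
   void *data;
};

struct sketch_queue {
   struct sketch_job *jobs; /* ring buffer storage */
   unsigned max_jobs;       /* current capacity */
   unsigned num_queued;     /* occupied slots */
   unsigned read_idx;       /* oldest queued job */
   unsigned write_idx;      /* next free slot */
};

static bool
sketch_queue_init(struct sketch_queue *q, unsigned max_jobs)
{
   assert(max_jobs > 0);
   q->jobs = calloc(max_jobs, sizeof(*q->jobs));
   q->max_jobs = max_jobs;
   q->num_queued = q->read_idx = q->write_idx = 0;
   return q->jobs != NULL;
}

/* Double the capacity, copying queued jobs out in FIFO order so that
 * the ring is unwrapped and starts at index 0 in the new storage. */
static bool
sketch_queue_grow(struct sketch_queue *q)
{
   unsigned new_max = q->max_jobs * 2;
   struct sketch_job *jobs = calloc(new_max, sizeof(*jobs));
   if (!jobs)
      return false;

   for (unsigned i = 0; i < q->num_queued; i++)
      jobs[i] = q->jobs[(q->read_idx + i) % q->max_jobs];

   free(q->jobs);
   q->jobs = jobs;
   q->read_idx = 0;
   q->write_idx = q->num_queued;
   q->max_jobs = new_max;
   return true;
}

/* Never waits for a free slot: a full queue is enlarged instead.
 * Fails only if the allocation fails. */
static bool
sketch_queue_add_job(struct sketch_queue *q, void *data)
{
   if (q->num_queued == q->max_jobs && !sketch_queue_grow(q))
      return false;

   q->jobs[q->write_idx].data = data;
   q->write_idx = (q->write_idx + 1) % q->max_jobs;
   q->num_queued++;
   return true;
}

/* Remove the oldest job; returns NULL when the queue is empty. */
static void *
sketch_queue_pop_job(struct sketch_queue *q)
{
   if (q->num_queued == 0)
      return NULL;

   void *data = q->jobs[q->read_idx].data;
   q->read_idx = (q->read_idx + 1) % q->max_jobs;
   q->num_queued--;
   return data;
}
```

Since the add path never sleeps, a caller that holds its own mutex
across do_something() and the add cannot be blocked by a job that is
waiting for that same mutex. The cost is the one raised in the review:
nothing in this sketch bounds how large the queue can grow.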

Marek