[Intel-gfx] [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.

Tomas Elf tomas.elf at intel.com
Tue Jun 16 10:07:03 PDT 2015


On 16/06/2015 17:50, Chris Wilson wrote:
> On Tue, Jun 16, 2015 at 04:46:05PM +0100, Tomas Elf wrote:
>> On 16/06/2015 14:44, Daniel Vetter wrote:
>>> On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
>>>> When submitting semaphores in execlist mode the hang checker crashes in this
>>>> function because it is only runnable in ring submission mode. The reason this
>>>> is of particular interest to the TDR patch series is because we use semaphores
>>>> as a means to induce hangs during testing (which is the recommended way to
>>>> induce hangs for gen8+). It's not clear how this is supposed to work in
>>>> execlist mode since:
>>>>
>>>> 1. This function requires a ring buffer.
>>>>
>>>> 2. Retrieving a ring buffer in execlist mode requires us to retrieve the
>>>> corresponding context, which we get from a request.
>>>>
>>>> 3. Retrieving a request from the hang checker is not straightforward since that
>>>> requires us to grab the struct_mutex in order to synchronize against the
>>>> request retirement thread.
>>>>
>>>> 4. Grabbing the struct_mutex from the hang checker is nothing that we will do
>>>> since that puts us at risk of deadlock since a hung thread might be holding the
>>>> struct_mutex already.
>>>>
>>>> Therefore it's not obvious how we're supposed to deal with this. For now, we're
>>>> doing an early exit from this function, which avoids any kernel panic situation
>>>> when running our own internal TDR ULT.
>>>>
>>>> Signed-off-by: Tomas Elf <tomas.elf at intel.com>
>>>
>>> We should have a Testcase: line here which mentions the igt testcase that
>>> provokes this bug. Or we need to fill this gap asap.
>>> -Daniel
>>
>> You know this better than I do: Is there an IGT test that submits a
>> semaphore in execlist mode? Because that's all you need to do to
>> reproduce this. We could certainly add one if there is none like
>> that already.
>
> No, we don't have anything submitting a hanging semaphore from
> userspace or igt specifically.
> -Chris
>

At first I thought that it would be ok to just submit any semaphore but 
I guess it would have to be a hanging semaphore specifically. Or at 
least a semaphore that does not progress ACTHD from one hang check 
period to the following (seeing as we check for ACTHD progression in 
ring_stuck() and then call semaphore_passed() that calls 
semaphore_waits_for() if ACTHD hasn't moved).
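
To make that sequence concrete, here is a minimal standalone sketch of the check order described above. It is a simplified model, not the actual i915 code: the sim_* names, the struct fields and the return values are all hypothetical stand-ins, and the early exit in execlist mode is modelled as "no semaphore wait could be inspected".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of one hang check period (not the driver's enum). */
enum sim_hang_action { SIM_ACTIVE, SIM_KICK, SIM_WAIT, SIM_HUNG };

/* Hypothetical, heavily simplified engine state. */
struct sim_engine {
    bool execlists;            /* true in execlist submission mode */
    bool waiting_on_semaphore; /* a semaphore wait is pending */
    bool semaphore_signalled;  /* the awaited value has been signalled */
    uint64_t acthd;            /* ACTHD sampled this period */
    uint64_t last_acthd;       /* ACTHD sampled the previous period */
};

/*
 * Stand-in for semaphore_waits_for(): reports whether a semaphore wait
 * could be located. In execlist mode there is no legacy ring buffer to
 * scan, so the model exits early instead of touching one.
 */
bool sim_semaphore_waits_for(const struct sim_engine *e)
{
    assert(e != NULL);
    if (e->execlists)
        return false; /* early exit */
    return e->waiting_on_semaphore;
}

/* Stand-in for semaphore_passed(): 1 passed, 0 still waiting, -1 unknown. */
int sim_semaphore_passed(const struct sim_engine *e)
{
    if (!sim_semaphore_waits_for(e))
        return -1;
    return e->semaphore_signalled ? 1 : 0;
}

/* Stand-in for ring_stuck(): ACTHD progression is checked first. */
enum sim_hang_action sim_ring_stuck(struct sim_engine *e)
{
    if (e->acthd != e->last_acthd) {
        e->last_acthd = e->acthd; /* progress since the last period */
        return SIM_ACTIVE;
    }
    switch (sim_semaphore_passed(e)) {
    case 1:
        return SIM_KICK;
    case 0:
        return SIM_WAIT;
    default:
        return SIM_HUNG;
    }
}
```

In this model the semaphore path is only reached once ACTHD has stayed put across two consecutive checks, which is why a test would need a semaphore that actually hangs rather than just any semaphore.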

Fine, we'll have to add that then.

Thanks,
Tomas

