[PULL REQUEST] ttm fence conversion

Tue Sep 2 02:52:46 PDT 2014

Am 02.09.2014 um 11:12 schrieb Maarten Lankhorst:
> Op 02-09-14 om 10:51 schreef Christian König:
>> Am 01.09.2014 um 20:43 schrieb Maarten Lankhorst:
>>> Hey,
>>>
>>> On 01-09-14 18:21, Christian König wrote:
>>>> Am 01.09.2014 um 15:33 schrieb Maarten Lankhorst:
>>>>> Hey,
>>>>>
>>>>> Op 01-09-14 om 14:31 schreef Christian König:
>>>>>> Please wait a second with that.
>>>>>>
>>>>>> I didn't had a chance to test this yet and nobody has yet given it's rb on at least the radeon changes in this branch.
>>>>> Ok, my fault. I thought it was implicitly acked. I haven't made any functional changes to these patches,
>>>>> just some small fixups and a fix to make it apply after the upstream removal of  RADEON_FENCE_SIGNALED_SEQ.
>>>> Yeah, but the resulting patch looks to complex for my taste and should be simplified a bit more. Here is a more detailed review:
>>>>
>>>>> +    wait_queue_t fence_wake;
>>>> Only a nitpick, but please fix the indention and maybe add a comment.
>>>>
>>>>> +    struct work_struct delayed_irq_work;
>>>> Just drop that, the new fall back work item should take care of this when the unfortunate case happens that somebody tries to enable_signaling in the middle of a GPU reset.
>>> I can only drop it if radeon_gpu_reset will always call radeon_irq_set after downgrading to read mode, even if no work needs to be done. :-)
>>>
>>> Then again, should be possible.
>> The fall back handler should take care of the rare condition that we can't activate the IRQ because the driver is in a lockup handler.
>>
>> The issue is that the delayed_irq_work handler needs to take the exclusive lock once more and so would block an innocent process for the duration of the GPU lockup.
>>
>> Either reschedule as delayed work item if we can't take the lock immediately or just live with the delay of the fall back handler. Since IRQs usually don't work correctly immediately after an GPU reset I'm pretty sure that the fallback handler will be needed anyway.
> Ok, rescheduling would be fine. Or could I go with the alternative, remove the delayed_irq_work and always set irqs after downgrading the write lock?

Always setting the IRQ's after downgrading the write lock would work for 
me as well.

>
>>>>>    /*
>>>>> - * Cast helper
>>>>> - */
>>>>> -#define to_radeon_fence(p) ((struct radeon_fence *)(p))
>>>>> -
>>>>> -/*
>>>> Please define the new cast helper in radeon.h as well.
>>> The ops are only defined in radeon_fence.c, and nothing outside of radeon_fence.c should care about the internals.
>> Then define this as a function instead, I need a checked cast from a fence to a radeon_fence outside of the fence code as well.
> Ok.
>
>>>>>        if (!rdev->needs_reset) {
>>>>> -        up_write(&rdev->exclusive_lock);
>>>>> +        downgrade_write(&rdev->exclusive_lock);
>>>>> +        wake_up_all(&rdev->fence_queue);
>>>>> +        up_read(&rdev->exclusive_lock);
>>>>>            return 0;
>>>>>        }
>>>> Just drop that as well, no need to wake up anybody here.
>>> Maybe not, but if I have to remove delayed_irq_work I do need to add a radeon_irq_set here.
>>>
>>>>>    downgrade_write(&rdev->exclusive_lock);
>>>>> +    wake_up_all(&rdev->fence_queue);
>>>> Same here, the IB test will wake up all fences for recheck anyway.
>>> Same as previous comment. :-)
>>>
>>>>> + * radeon_fence_read_seq - Returns the current fence value without updating
>>>>> + *
>>>>> + * @rdev: radeon_device pointer
>>>>> + * @ring: ring index to return the seqno of
>>>>> + */
>>>>> +static uint64_t radeon_fence_read_seq(struct radeon_device *rdev, int ring)
>>>>> +{
>>>>> +    uint64_t last_seq = atomic64_read(&rdev->fence_drv[ring].last_seq);
>>>>> +    uint64_t last_emitted = rdev->fence_drv[ring].sync_seq[ring];
>>>>> +    uint64_t seq = radeon_fence_read(rdev, ring);
>>>>> +
>>>>> +    seq = radeon_fence_read(rdev, ring);
>>>>> +    seq |= last_seq & 0xffffffff00000000LL;
>>>>> +    if (seq < last_seq) {
>>>>> +        seq &= 0xffffffff;
>>>>> +        seq |= last_emitted & 0xffffffff00000000LL;
>>>>> +    }
>>>>> +    return seq;
>>>>> +}
>>>> Completely drop that and just check the last_seq signaled as set by radeon_fence_activity.
>>> Do you mean call radeon_fence_activity in radeon_fence_signaled? Or should I just use the cached value in radeon_fence_check_signaled.
>> Just check the cached value, it should be updated by radeon_fence_activity immediately before calling this anyway.
> Ok. I think I wrote this as a workaround for unreliable interrupts. :-)
>
>>> I can't call fence_activity in check_signaled, because it would cause re-entrancy in fence_queue.
>>>
>>>>> +        if (!ret)
>>>>> +            FENCE_TRACE(&fence->base, "signaled from irq context\n");
>>>>> +        else
>>>>> +            FENCE_TRACE(&fence->base, "was already signaled\n");
>>>> Is all that text tracing necessary? Probably better define a trace point here.
>>> It gets optimized out normally. There's already a tracepoint called in fence_signal.
>>>   
>>>>> +    if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >= fence->seq ||
>>>>> +        !rdev->ddev->irq_enabled)
>>>>> +        return false;
>>>> Checking irq_enabled here might not be such a good idea if the fence code don't has a fall back on it's own. What exactly happens if enable_signaling returns false?
>>> I thought irq_enabled couldn't happen under normal circumstances?
>> Not 100% sure, but I think it is temporary turned off during reset.
>>
>>> Anyway the fence gets treated as signaled if it returns false, and fence_signal will get called.
>> Thought so, well that's rather bad if we failed to install the IRQ handle that we just treat all fences as signaled isn't it?
> I wrote this code before the delayed work was added, I guess the check for !irq_enabled can be removed now. :-)
>
>>>>> +static signed long radeon_fence_default_wait(struct fence *f, bool intr,
>>>>> +                         signed long timeout)
>>>>> +{
>>>>> +    struct radeon_fence *fence = to_radeon_fence(f);
>>>>> +    struct radeon_device *rdev = fence->rdev;
>>>>> +    bool signaled;
>>>>> +
>>>>> +    fence_enable_sw_signaling(&fence->base);
>>>>> +
>>>>> +    /*
>>>>> +     * This function has to return -EDEADLK, but cannot hold
>>>>> +     * exclusive_lock during the wait because some callers
>>>>> +     * may already hold it. This means checking needs_reset without
>>>>> +     * lock, and not fiddling with any gpu internals.
>>>>> +     *
>>>>> +     * The callback installed with fence_enable_sw_signaling will
>>>>> +     * run before our wait_event_*timeout call, so we will see
>>>>> +     * both the signaled fence and the changes to needs_reset.
>>>>> +     */
>>>>> +
>>>>> +    if (intr)
>>>>> +        timeout = wait_event_interruptible_timeout(rdev->fence_queue,
>>>>> +                               ((signaled = (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->base.flags))) || rdev->needs_reset),
>>>>> +                               timeout);
>>>>> +    else
>>>>> +        timeout = wait_event_timeout(rdev->fence_queue,
>>>>> +                         ((signaled = (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->base.flags))) || rdev->needs_reset),
>>>>> +                         timeout);
>>>>> +
>>>>> +    if (timeout > 0 && !signaled)
>>>>> +        return -EDEADLK;
>>>>> +    return timeout;
>>>>> +}
>>>> This at least needs to be properly formated, but I think since we now don't need extra handling any more we don't need an extra wait function as well.
>>> I thought of removing the extra handling, but the -EDEADLK stuff is needed because a deadlock could happen in ttm_bo_lock_delayed_workqueue otherwise if the gpu's really hung there would never be any progress forward.
>> Hui what? ttm_bo_lock_delayed_workqueue shouldn't call any blocking wait.
> Oops, you're right. ttm_bo_delayed_delete is called with remove_all false, not true.
>
> Unfortunately ttm_bo_vm_fault does hold the exclusive_lock in read mode, and other places that use eviction will use it too.
> Without returning -EDEADLK this could mean that ttm_bo_move_accel_cleanup would block forever,
> so this function has to stay.

Ok, fine with me. I'm not deep enough into the TTM code to really judge 
this, but my understanding was that TTM still calls it's own wait callback.

Regards,
Christian.

>
> ~Maarten
>