[PULL REQUEST] ttm fence conversion

Maarten Lankhorst maarten.lankhorst at canonical.com
Mon Sep 1 11:43:16 PDT 2014


Hey,

On 01-09-14 18:21, Christian König wrote:
> On 01.09.2014 at 15:33, Maarten Lankhorst wrote:
>> Hey,
>>
>> On 01-09-14 at 14:31, Christian König wrote:
>>> Please hold off on that for a moment.
>>>
>>> I haven't had a chance to test this yet, and nobody has given their rb on at least the radeon changes in this branch.
>> Ok, my fault. I thought it was implicitly acked. I haven't made any functional changes to these patches,
>> just some small fixups and a fix to make it apply after the upstream removal of RADEON_FENCE_SIGNALED_SEQ.
> 
> Yeah, but the resulting patch looks too complex for my taste and should be simplified a bit more. Here is a more detailed review:
> 
>> +    wait_queue_t fence_wake;
> Only a nitpick, but please fix the indentation and maybe add a comment.
> 
>> +    struct work_struct delayed_irq_work;
> Just drop that; the new fallback work item should take care of this in the unfortunate case that somebody tries to call enable_signaling in the middle of a GPU reset.

I can only drop it if radeon_gpu_reset will always call radeon_irq_set after downgrading to read mode, even if no work needs to be done. :-)

Then again, that should be possible.
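
Roughly what I mean, as an untested sketch of the tail end of radeon_gpu_reset():

	downgrade_write(&rdev->exclusive_lock);

	/*
	 * Always reprogram the irqs once we only hold the lock in read
	 * mode, so a fence that had enable_signaling called during the
	 * reset still gets its interrupt, even if no other work is pending.
	 */
	radeon_irq_set(rdev);

	up_read(&rdev->exclusive_lock);
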
>>  /*
>> - * Cast helper
>> - */
>> -#define to_radeon_fence(p) ((struct radeon_fence *)(p))
>> -
>> -/*
> Please define the new cast helper in radeon.h as well.
The ops are only defined in radeon_fence.c, and nothing outside of radeon_fence.c should care about the internals.
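
For illustration, a cast helper of that kind would look roughly like this (sketch; the ops check is what ties it to radeon_fence.c):

	static const struct fence_ops radeon_fence_ops;

	static struct radeon_fence *to_radeon_fence(struct fence *f)
	{
		struct radeon_fence *fence = container_of(f, struct radeon_fence, base);

		/* reject fences that were not created by this driver */
		if (f->ops != &radeon_fence_ops)
			return NULL;

		return fence;
	}

Since it needs radeon_fence_ops to check where the fence came from, it can't easily move to radeon.h anyway.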

>>      if (!rdev->needs_reset) {
>> -        up_write(&rdev->exclusive_lock);
>> +        downgrade_write(&rdev->exclusive_lock);
>> +        wake_up_all(&rdev->fence_queue);
>> +        up_read(&rdev->exclusive_lock);
>>          return 0;
>>      }
> Just drop that as well, no need to wake up anybody here.

Maybe not, but if I have to remove delayed_irq_work I do need to add a radeon_irq_set call here.

>>  downgrade_write(&rdev->exclusive_lock);
>> +    wake_up_all(&rdev->fence_queue);
> Same here, the IB test will wake up all fences for recheck anyway.
Same as previous comment. :-)

>> + * radeon_fence_read_seq - Returns the current fence value without updating
>> + *
>> + * @rdev: radeon_device pointer
>> + * @ring: ring index to return the seqno of
>> + */
>> +static uint64_t radeon_fence_read_seq(struct radeon_device *rdev, int ring)
>> +{
>> +    uint64_t last_seq = atomic64_read(&rdev->fence_drv[ring].last_seq);
>> +    uint64_t last_emitted = rdev->fence_drv[ring].sync_seq[ring];
>> +    uint64_t seq = radeon_fence_read(rdev, ring);
>> +
>> +    seq |= last_seq & 0xffffffff00000000LL;
>> +    if (seq < last_seq) {
>> +        seq &= 0xffffffff;
>> +        seq |= last_emitted & 0xffffffff00000000LL;
>> +    }
>> +    return seq;
>> +}
> Completely drop that and just check the last_seq signaled as set by radeon_fence_activity.

Do you mean calling radeon_fence_activity in radeon_fence_signaled? Or should I just use the cached value in radeon_fence_check_signaled?

I can't call radeon_fence_activity from check_signaled, because check_signaled already runs as a callback on fence_queue, so that would cause re-entrancy on fence_queue.
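
If you mean the cached value, the callback would end up looking roughly like this (sketch; cleanup of the sw irq refcount and the waitqueue entry omitted):

	static int radeon_fence_check_signaled(wait_queue_t *wait, unsigned mode,
					       int flags, void *key)
	{
		struct radeon_fence *fence;
		u64 seq;

		fence = container_of(wait, struct radeon_fence, fence_wake);

		/*
		 * Can't run the full activity/process path here, since this is
		 * called from the fence_queue wakeup itself; only look at the
		 * value radeon_fence_activity already cached in last_seq.
		 */
		seq = atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq);
		if (seq >= fence->seq)
			fence_signal_locked(&fence->base);

		return 0;
	}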

>> +        if (!ret)
>> +            FENCE_TRACE(&fence->base, "signaled from irq context\n");
>> +        else
>> +            FENCE_TRACE(&fence->base, "was already signaled\n");
> Is all that text tracing necessary? Probably better to define a trace point here.
It gets optimized out normally. There's already a tracepoint called in fence_signal.
 
>> +    if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >= fence->seq ||
>> +        !rdev->ddev->irq_enabled)
>> +        return false;
> Checking irq_enabled here might not be such a good idea if the fence code doesn't have a fallback of its own. What exactly happens if enable_signaling returns false?

I thought irq_enabled couldn't be false under normal circumstances?
Anyway, if enable_signaling returns false the fence gets treated as signaled, and fence_signal will get called.
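
To spell that out, the relevant bit in the core is fence_add_callback(): when enable_signaling returns false the fence is signaled on the spot and no callback is installed. Roughly (sketch of that branch):

	if (!fence->ops->enable_signaling(fence)) {
		/*
		 * The driver cannot (or does not need to) deliver an
		 * interrupt for this fence, so treat it as signaled now.
		 */
		fence_signal_locked(fence);
		ret = -ENOENT;
	}

So a false return isn't a silent failure; the waiter just sees the fence as signaled immediately.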

>> +static signed long radeon_fence_default_wait(struct fence *f, bool intr,
>> +                         signed long timeout)
>> +{
>> +    struct radeon_fence *fence = to_radeon_fence(f);
>> +    struct radeon_device *rdev = fence->rdev;
>> +    bool signaled;
>> +
>> +    fence_enable_sw_signaling(&fence->base);
>> +
>> +    /*
>> +     * This function has to return -EDEADLK, but cannot hold
>> +     * exclusive_lock during the wait because some callers
>> +     * may already hold it. This means checking needs_reset without
>> +     * lock, and not fiddling with any gpu internals.
>> +     *
>> +     * The callback installed with fence_enable_sw_signaling will
>> +     * run before our wait_event_*timeout call, so we will see
>> +     * both the signaled fence and the changes to needs_reset.
>> +     */
>> +
>> +    if (intr)
>> +        timeout = wait_event_interruptible_timeout(rdev->fence_queue,
>> +                               ((signaled = (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->base.flags))) || rdev->needs_reset),
>> +                               timeout);
>> +    else
>> +        timeout = wait_event_timeout(rdev->fence_queue,
>> +                         ((signaled = (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->base.flags))) || rdev->needs_reset),
>> +                         timeout);
>> +
>> +    if (timeout > 0 && !signaled)
>> +        return -EDEADLK;
>> +    return timeout;
>> +}
> This at least needs to be properly formatted, but since we no longer need the extra handling, I don't think we need an extra wait function either.

I thought of removing the extra handling, but the -EDEADLK return is needed because otherwise a deadlock could happen in ttm_bo_lock_delayed_workqueue: if the GPU is really hung, there would never be any forward progress.
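
The point of returning -EDEADLK instead of blocking is that a waiter which can end up under the reset path backs off, for example (hypothetical caller, just to illustrate):

	int r;

	r = radeon_fence_wait(fence, false);
	if (r == -EDEADLK) {
		/*
		 * The GPU is hung and a reset is pending; bail out so
		 * radeon_gpu_reset() can take exclusive_lock and make
		 * forward progress instead of us waiting on a dead fence.
		 */
		return r;
	}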

~Maarten

