[PATCH 1/3] drm/radeon: move ring locking out of reset path

Mon Jul 2 09:15:05 PDT 2012

On Mon, Jul 2, 2012 at 11:58 AM, Christian König
<deathsimple at vodafone.de> wrote:
> On 02.07.2012 17:41, Jerome Glisse wrote:
>>
>> On Fri, Jun 29, 2012 at 12:15 PM, Michel Dänzer <michel at daenzer.net>
>> wrote:
>>>
>>> On Fri, 2012-06-29 at 17:18 +0200, Christian König wrote:
>>>>
>>>> On 29.06.2012 17:09, Michel Dänzer wrote:
>>>>>
>>>>> On Fri, 2012-06-29 at 16:45 +0200, Christian König wrote:
>>>>>>
>>>>>> Hold the ring lock the whole time the reset is in progress,
>>>>>> otherwise another process can submit new jobs.
>>>>>
>>>>> Sounds good, but doesn't this create other paths (e.g. initialization,
>>>>> resume) where the ring is being accessed without holding the lock?
>>>>> Isn't that a problem?
>>>>
>>>> Thought about that also.
>>>>
>>>> For init I'm pretty sure that no application can submit commands before
>>>> we are done, otherwise we are doomed anyway.
>>>>
>>>> For resume I'm not really sure, but I think that applications are
>>>> resumed after the hardware driver had a chance of doing so.
>>>
>>> I hope you're right... but if it's not too much trouble, it might be
>>> better to be safe than sorry and take the lock for those paths as well.
>>>
>>>
>> NAK, this is the wrong way to solve the issue. We need a global lock on
>> all paths that can trigger GPU activity. Previously that was the cs
>> mutex, but I didn't think about it too much when it got removed. So to
>> fix the situation I am sending a patch with an rw semaphore.
>
> So what am I missing? What else can trigger GPU activity if not the rings?
>
> I'm currently working on ring-partial resets and also resets where you only
> skip over a single faulty IB instead of flushing the whole ring. And my
> current idea for that to work is that we hold the ring lock while we do
> suspend, ring_save, asic_reset, resume and ring_restore.
>
> Christian.
>

I should add that gpu_reset should be a heavy reset. If you want to
reset only one ring and see whether the GPU can continue without a heavy
reset, then you should do it as a special ring reset that doesn't reset
the MC and some other blocks but only the affected ring (and I am assuming
that the hw behaves properly here). If that lightweight ring reset doesn't
work, then let the heavy reset kick in. So yes, your lightweight per-ring
reset would only need to take the ring lock, but we would then need to
change the ring lock usage we have now.

Cheers,
Jerome