[PATCH 0/4] Refine GPU recovery sequence to enhance its stability

Christian König christian.koenig at amd.com
Mon Apr 12 17:44:46 UTC 2021


On 12.04.21 at 19:27, Andrey Grodzovsky wrote:
> On 2021-04-10 1:34 p.m., Christian König wrote:
>> Hi Andrey,
>>
>> On 09.04.21 at 20:18, Andrey Grodzovsky wrote:
>>> [SNIP]
>>>>>
>>>>> If we use a list and a flag called 'emit_allowed' under a lock, then
>>>>> in amdgpu_fence_emit we lock the list, check the flag and, if it is
>>>>> true, add the new HW fence to the list and proceed with HW emission
>>>>> as normal, otherwise return -ENODEV. In amdgpu_pci_remove we take
>>>>> the lock, set the flag to false, and then iterate the list and
>>>>> force-signal everything on it. Wouldn't this prevent any new HW
>>>>> fence creation from that point on, from any place trying to do so?
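For reference, sketched in code that proposal would look roughly like this (illustrative only; 'emit_lock', 'emit_allowed' and the 'pending' list are hypothetical names taken from the proposal, not existing driver fields):

    /* Emit side of the hypothetical scheme, inside amdgpu_fence_emit();
     * 'af' is the struct amdgpu_fence being emitted.
     */
    spin_lock(&ring->fence_drv.emit_lock);
    if (!ring->fence_drv.emit_allowed) {
            /* Device is being removed, refuse to create new HW fences. */
            spin_unlock(&ring->fence_drv.emit_lock);
            return -ENODEV;
    }
    list_add_tail(&af->pending_node, &ring->fence_drv.pending);
    spin_unlock(&ring->fence_drv.emit_lock);
    /* ... proceed with the normal HW emission ... */

    /* Remove side, under the same lock: */
    struct amdgpu_fence *af;

    spin_lock(&ring->fence_drv.emit_lock);
    ring->fence_drv.emit_allowed = false;
    list_for_each_entry(af, &ring->fence_drv.pending, pending_node)
            dma_fence_signal(&af->base);
    spin_unlock(&ring->fence_drv.emit_lock);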
>>>>
>>>> Way too much overhead. The fence processing is intentionally lock
>>>> free to avoid cache line bouncing, because the IRQ can move from CPU
>>>> to CPU.
>>>>
>>>> We need something which, at the very least, doesn't affect the
>>>> processing of fences in the interrupt handler at all.
>>>
>>>
>>> As far as I see in the code, amdgpu_fence_emit is only called from 
>>> task context. Also, we can skip this list I proposed and just use 
>>> amdgpu_fence_driver_force_completion for each ring to signal all 
>>> created HW fences.
>>
>> Ah, wait a second, this gave me another idea.
>>
>> See amdgpu_fence_driver_force_completion():
>>
>> amdgpu_fence_write(ring, ring->fence_drv.sync_seq);
>>
>> If we change that to something like:
>>
>> amdgpu_fence_write(ring, ring->fence_drv.sync_seq + 0x3FFFFFFF);
>>
>> Not only the currently submitted, but also the next 0x3FFFFFFF fences 
>> will be considered signaled.
>>
>> This basically solves our problem of making sure that new fences are
>> also signaled, without any additional overhead whatsoever.
>
>
> The problem with this is that setting the sync_seq to some MAX value
> alone is not enough; you actually have to call amdgpu_fence_process to
> iterate over and signal the fences currently stored in the
> ring->fence_drv.fences array, and to guarantee that once you are done
> signalling, no more HW fences will be added to that array. I was
> thinking of doing something like below:
>

Well we could implement the is_signaled callback once more, but I'm not 
sure if that is a good idea.
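Something along these lines is what I mean; just a sketch, and whether checking for unplug in there is a good idea is exactly what I'm unsure about (the wraparound handling of the real sequence comparison is omitted here):

    static bool amdgpu_fence_is_signaled(struct dma_fence *f)
    {
            struct amdgpu_ring *ring = to_amdgpu_fence(f)->ring;

            /* Report everything as signaled once the device is gone. */
            if (drm_dev_is_unplugged(adev_to_drm(ring->adev)))
                    return true;

            /* Simplified check against the last seqno the HW has written. */
            return amdgpu_fence_read(ring) >= lower_32_bits(f->seqno);
    }

which would then be hooked up as the .signaled member of amdgpu_fence_ops.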

> amdgpu_fence_emit()
> {
>     dma_fence_init(fence);
>
>     srcu_read_lock(amdgpu_unplug_srcu);
>
>     if (!adev->unplug) {
>         seq = ++ring->fence_drv.sync_seq;
>         emit_fence(fence);
>
>         /* We can't wait forever as the HW might be gone at any point */
>         dma_fence_wait_timeout(old_fence, 5S);
>

You can pretty much ignore this wait here. It is only there as a last 
resort so that we never overwrite the ring buffers.

But it should not have a timeout as far as I can see.

>         ring->fence_drv.fences[seq & ring->fence_drv.num_fences_mask] = fence;
>     } else {
>         dma_fence_set_error(fence, -ENODEV);
>         dma_fence_signal(fence);
>     }
>
>     srcu_read_unlock(amdgpu_unplug_srcu);
>     return fence;
> }
>
> amdgpu_pci_remove()
> {
>     adev->unplug = true;
>     synchronize_srcu(amdgpu_unplug_srcu);

Well that is just duplicating what drm_dev_unplug() should be doing on a 
different level.
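For reference, this is roughly all that drm_dev_unplug() and drm_dev_enter() do today (simplified from drm_drv.c, for illustration only):

    DEFINE_STATIC_SRCU(drm_unplug_srcu);

    void drm_dev_unplug(struct drm_device *dev)
    {
            /* Any reader that still saw unplugged == false is guaranteed
             * to have left its critical section once this returns.
             */
            dev->unplugged = true;
            synchronize_srcu(&drm_unplug_srcu);

            drm_dev_unregister(dev);
    }

    bool drm_dev_enter(struct drm_device *dev, int *idx)
    {
            *idx = srcu_read_lock(&drm_unplug_srcu);

            if (dev->unplugged) {
                    srcu_read_unlock(&drm_unplug_srcu, *idx);
                    return false;
            }

            return true;
    }

So a driver-private unplug flag plus its own SRCU would just reimplement this.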

Christian.

>     /* Past this point no more fences are submitted to the HW rings, so
>      * we can safely force-signal all that are currently there. Any
>      * subsequently created HW fences will be returned already signaled,
>      * with an error code.
>      */
>     for_each_ring(adev)
>         amdgpu_fence_process(ring);
>
>     drm_dev_unplug(dev);
>     stop schedulers;
>     cancel_sync(all timers and queued works);
>     hw_fini();
>     unmap_mmio();
> }
>
>
> Andrey
>
>
>>
>>
>>>
>>>>>>
>>>>>> Alternatively grabbing the reset write side and stopping and then 
>>>>>> restarting the scheduler could work as well.
>>>>>>
>>>>>> Christian.
>>>>>
>>>>>
>>>>> I didn't get the above, and I don't see why I need to reuse the GPU
>>>>> reset rw_lock. I rely on the SRCU unplug flag for unplug. Also, it's
>>>>> not clear to me why we are focusing on the scheduler threads: any
>>>>> code path that generates HW fences should be covered, so any code
>>>>> leading to amdgpu_fence_emit needs to be taken into account, such as
>>>>> direct IB submissions, VM flushes, etc.
>>>>
>>>> You need to work together with the reset lock anyway, because a
>>>> hotplug could run at the same time as a reset.
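(Roughly, the hotplug teardown would then need to be bracketed by the reset write side, e.g. assuming the adev->reset_sem rw_semaphore the driver uses to serialize GPU recovery; illustrative only:

    down_write(&adev->reset_sem);
    /* ... stop schedulers, force-complete HW fences, detach from HW ... */
    up_write(&adev->reset_sem);

so that a reset cannot run in the middle of the unplug sequence.)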
>>>
>>>
>>> If I go my way, I indeed see now that I have to take the reset
>>> write-side lock while signalling the HW fences, in order to protect
>>> against scheduler/HW fence detachment and reattachment during the
>>> schedulers' stop/restart. But if we go with your approach, then
>>> calling drm_dev_unplug and scoping amdgpu_job_timeout with
>>> drm_dev_enter/exit should be enough to prevent any concurrent GPU
>>> resets during unplug. In fact I already do that anyway -
>>> https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=drm-misc-next&id=ef0ea4dd29ef44d2649c5eda16c8f4869acc36b1
>>
>> Yes, good point as well.
>>
>> Christian.
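For reference, that scoping of the timeout handler looks roughly like this (sketch only; the exact status returned to the scheduler on the early exit is still part of the open scheduler work):

    static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
    {
            struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
            int idx;

            /* drm_dev_enter() fails once drm_dev_unplug() has been called,
             * so no timeout handling or GPU reset can race with the unplug.
             */
            if (!drm_dev_enter(adev_to_drm(ring->adev), &idx))
                    return DRM_GPU_SCHED_STAT_NOMINAL;

            /* ... the usual timeout handling and GPU recovery ... */

            drm_dev_exit(idx);
            return DRM_GPU_SCHED_STAT_NOMINAL;
    }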
