[PATCH v4 00/14] RFC Support hot device unplug in amdgpu
andrey.grodzovsky at amd.com
Wed Feb 24 16:30:50 UTC 2021
On 2021-02-19 5:24 a.m., Daniel Vetter wrote:
> On Thu, Feb 18, 2021 at 9:03 PM Andrey Grodzovsky
> <Andrey.Grodzovsky at amd.com> wrote:
>> Looked a bit into it, I want to export sync_object to FD and import from that FD
>> such that I will wait on the imported sync object handle from one thread while
>> signaling the exported sync object handle from another (post device unplug) ?
>> My problem is how to create a sync object with a non signaled 'fake' fence ?
>> I only see API that creates it with already signaled fence (or none) -
>> P.S I expect the kernel to crash since unlike with dma_bufs we don't hold
>> drm device reference here on export.
> Well maybe there's no crash. I think if you go through all your
> dma_fence that you have and force-complete them, then I think external
> callers won't go into the driver anymore. But there's still pointers
> potentially pointing at your device struct and all that, but should
> work. Still needs some audit ofc.
> Wrt how you get such a free-standing fence, that's amdgpu specific. Roughly
> - submit cs
> - get the fence for that (either sync_file, but I don't think amdgpu
> supports that, or maybe through drm_syncobj)
> - hotunplug
> - wait on that fence somehow (drm_syncobj has direct uapi for this,
> same for sync_file I think)
> Cheers, Daniel
Indeed, this worked fine; I tested it with 2 devices. Since the syncobj is
refcounted, even after I destroyed the original syncobj and unplugged the
device, the exported syncobj and the fence inside it didn't go anywhere.
See my 3 tests in my branch on GitLab and let me know if I should go ahead
and do a merge request (into which target project/branch?) or if you have
more comments.
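
To illustrate why the exported syncobj outlives the original handle, here is
a minimal user-space toy model of the refcounting, not kernel or libdrm code.
All names (toy_fence, toy_syncobj, toy_syncobj_get/put, toy_live_objects,
toy_export_survives_destroy) are made up for this sketch, and "export" is
modelled simply as taking one extra reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fence: only tracks whether it has signaled. */
struct toy_fence {
    bool signaled;
};

/* Hypothetical stand-in for a syncobj: a refcount plus the fence it wraps. */
struct toy_syncobj {
    int refcount;
    struct toy_fence fence;
};

/* Number of toy syncobjs currently allocated, for demonstration only. */
static int toy_live_objects;

static struct toy_syncobj *toy_syncobj_create(void)
{
    struct toy_syncobj *obj = calloc(1, sizeof(*obj));

    if (!obj)
        return NULL;
    obj->refcount = 1;          /* reference held by the original handle */
    toy_live_objects++;
    return obj;
}

/* Take an extra reference; this models exporting the object to an FD. */
static struct toy_syncobj *toy_syncobj_get(struct toy_syncobj *obj)
{
    obj->refcount++;
    return obj;
}

/* Drop a reference; the object is freed only when the last one goes away. */
static void toy_syncobj_put(struct toy_syncobj *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0) {
        toy_live_objects--;
        free(obj);
    }
}

/* Walk through the sequence described above, in toy form. */
static bool toy_export_survives_destroy(void)
{
    struct toy_syncobj *original = toy_syncobj_create();

    if (!original)
        return false;

    /* "Export": the exported side holds its own reference. */
    struct toy_syncobj *exported = toy_syncobj_get(original);

    /* Destroy the original handle (and imagine the device being unplugged). */
    toy_syncobj_put(original);

    /* The object and its unsignaled fence are still there. */
    bool alive = (toy_live_objects == 1) && !exported->fence.signaled;

    /* Force-complete the fence, then drop the last reference. */
    exported->fence.signaled = true;
    bool signaled = exported->fence.signaled;
    toy_syncobj_put(exported);

    return alive && signaled && toy_live_objects == 0;
}
```

In this model the object is freed only when the last reference is dropped,
so destroying the original handle leaves the exported reference and its
fence usable. The real lifetime rules in the kernel are more involved.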
>> On 2/9/21 4:50 AM, Daniel Vetter wrote:
>>> Yeah in the end we'd need 2 hw devices for testing full fence
>>> functionality. A useful intermediate step would be to just export the
>>> fence (either as sync_file, which I think amdgpu doesn't support because
>>> no android egl support in mesa) or drm_syncobj (which you can do as
>>> standalone fd too iirc), and then just using the fence a bit from
>>> userspace (like wait on it or get its status) after the device is