[RFC PATCH v3 0/4] drm: Standardize device reset notification
Christian König
christian.koenig at amd.com
Wed Jun 21 15:09:14 UTC 2023
On 21.06.23 at 17:06, André Almeida wrote:
> On 21/06/2023 04:42, Christian König wrote:
>> On 21.06.23 at 02:57, André Almeida wrote:
>>> Hi,
>>>
>>> This is a new version of the documentation for DRM device resets. As I
>>> dived deeper into the subject, I started to believe that part of the
>>> problem was the lack of a DRM API to get reset information from the
>>> driver. With an API, we can better standardize reset queries, increase
>>> common code in both DRM and Mesa, and make it easier to write
>>> end-to-end tests.
>>>
>>> So this patchset, along with the documentation, comes with a new IOCTL
>>> and two implementations of it, for amdgpu and i915 (although only the
>>> former was really tested). This IOCTL uses the "context id" to query
>>> reset information, but this might not be generic enough to be included
>>> in a DRM API.
>>
>> Well the basic problem with that is that we don't have a standard DRM
>> context defined.
>>
>> If you want to do this you should probably start there first.
>
> Any idea on how to start this? I tried to find previous work on
> that, but I didn't find any.
I'm not aware of any work in this area; maybe ping the Mesa list as well.
It could be that someone looked into that but never sent anything out.
>
>>
>> Apart from that, this looks like a really, really good idea to me,
>> especially documenting the reset expectations.
>
> I think I'll submit just the doc for the next version then, given that
> the IOCTL will need a lot of rework.
Yeah, agree completely.
Thanks,
Christian.
>
>>
>> Regards,
>> Christian.
>>
>>> At least for amdgpu, this information is encapsulated by libdrm, so one
>>> can't just call the ioctl directly from the UMD as I was planning to,
>>> but a small refactor can be done to expose the id. Anyway, I'm sharing
>>> it as it is to gather feedback on whether this seems to work.
>>>
>>> The amdgpu and i915 implementations are provided as a means of testing
>>> and as examples, not as reference code yet, as the goal is more about
>>> the interface itself than the driver parts.
>>>
>>> For the documentation itself, after spending some time reading the
>>> reset path in the kernel and in Mesa, I decided to rewrite it to better
>>> reflect how it works, from bottom to top.
>>>
>>> You can check the userspace side of the IOCTL here:
>>> Mesa:
>>> https://gitlab.freedesktop.org/andrealmeid/mesa/-/commit/cd687b22fb32c21b23596c607003e2a495f465
>>> libdrm:
>>> https://gitlab.freedesktop.org/andrealmeid/libdrm/-/commit/b31e5404893ee9a85d1aa67e81c2f58c1dac3c46
>>>
>>> For testing, I use this Vulkan app that has an infinite loop in the
>>> shader:
>>> https://github.com/andrealmeid/vulkan-triangle-v1
>>>
>>> Feedback is welcome!
>>>
>>> Thanks,
>>> André
>>>
>>> v2:
>>> https://lore.kernel.org/all/20230227204000.56787-1-andrealmeid@igalia.com/
>>> v1:
>>> https://lore.kernel.org/all/20230123202646.356592-1-andrealmeid@igalia.com/
>>>
>>> André Almeida (4):
>>> drm/doc: Document DRM device reset expectations
>>> drm: Create DRM_IOCTL_GET_RESET
>>> drm/amdgpu: Implement DRM_IOCTL_GET_RESET
>>> drm/i915: Implement DRM_IOCTL_GET_RESET
>>>
>>> Documentation/gpu/drm-uapi.rst | 51 ++++++++++++++++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 +-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 35 +++++++++++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 5 ++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 12 +++-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 2 +
>>> drivers/gpu/drm/drm_debugfs.c | 2 +
>>> drivers/gpu/drm/drm_ioctl.c | 58 +++++++++++++++++++
>>> drivers/gpu/drm/i915/gem/i915_gem_context.c | 18 ++++++
>>> drivers/gpu/drm/i915/gem/i915_gem_context.h | 2 +
>>> .../gpu/drm/i915/gem/i915_gem_context_types.h | 2 +
>>> drivers/gpu/drm/i915/i915_driver.c | 2 +
>>> include/drm/drm_device.h | 3 +
>>> include/drm/drm_drv.h | 3 +
>>> include/uapi/drm/drm.h | 21 +++++++
>>> include/uapi/drm/drm_mode.h | 15 +++++
>>> 17 files changed, 233 insertions(+), 3 deletions(-)
>>>
>>