[PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

Thu May 24 09:41:30 UTC 2018

On 24.05.2018 at 11:24, Qiang Yu wrote:
> On Thu, May 24, 2018 at 2:46 PM, Christian König
> <christian.koenig at amd.com> wrote:
> [SNIP]
>> Because of this we have a separate tracking in amdgpu so that we know not
>> only who is using which BO, but also who is using which VM.
> amdgpu's VM implementation seems too complicated for this simple mali GPU,
> but I may investigate it more to see if I can make it better.

Yeah, completely agree.

The VM handling in amdgpu is really complicated because we had to tune
it for multiple use cases, e.g. partially resident textures, delayed
updates etc.

But you should at least be able to take the lessons we learned from
that VM code and not make the same mistakes again.

>> We intentionally removed the preclose callback to prevent certain use cases,
>> bringing it back to allow your use case looks rather fishy to me.
> It seems other drivers either defer or wait to adapt to the drop of
> preclose. I can do the same as you suggested, but I just don't understand why
> we make our life harder. Can I know what's the case you want to prevent?

I think what matters most for your case is that drivers should handle
closing a BO because userspace said so in the same way they handle
closing a BO because of process termination, but see below.

>> BTW: What exactly is the issue with using the postclose callback?
> The issue is that when an application is terminated with Ctrl+C and there is
> no wait or deferred unmap, the buffer gets unmapped before the task is done,
> so the kernel driver gets an MMU fault and a HW reset to recover the GPU.

Yeah, that sounds like exactly one of the reasons we had the callback in
the first place and worked on removing it.

See, the intention is to have reliable handling, e.g. use the same code
path for closing a BO because of an IOCTL and closing a BO because of 
process termination.
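
As a rough illustration of that single-code-path idea, here is a minimal
user-space sketch in C. All names (model_bo, model_bo_release,
model_close_ioctl, model_process_exit, model_job_done) are hypothetical and
are not DRM or amdgpu API. It assumes the "defer" variant mentioned in this
thread: the unmap is postponed until the GPU job that uses the buffer has
retired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a buffer object; not a real DRM type. */
struct model_bo {
    bool mapped;          /* still mapped in the GPU address space */
    bool gpu_busy;        /* a submitted job still references the buffer */
    bool unmap_deferred;  /* release was requested while the GPU was busy */
};

/*
 * The one teardown path. If the GPU still uses the buffer, only record
 * that an unmap is pending; otherwise unmap immediately.
 */
static void model_bo_release(struct model_bo *bo)
{
    assert(bo != NULL);
    if (bo->gpu_busy) {
        bo->unmap_deferred = true;
        return;
    }
    bo->mapped = false;
    bo->unmap_deferred = false;
}

/* Job completion: clear the busy flag and finish any pending release. */
static void model_job_done(struct model_bo *bo)
{
    assert(bo != NULL);
    bo->gpu_busy = false;
    if (bo->unmap_deferred)
        model_bo_release(bo);
}

/* Userspace explicitly closes one buffer (stands in for a close IOCTL). */
static void model_close_ioctl(struct model_bo *bo)
{
    model_bo_release(bo);
}

/* Process termination: every buffer goes through the same release path. */
static void model_process_exit(struct model_bo *bos, size_t count)
{
    for (size_t i = 0; i < count; i++)
        model_bo_release(&bos[i]);
}
```

Because both entry points funnel into model_bo_release(), a buffer that is
busy on the GPU stays mapped until model_job_done() runs, no matter whether
the close came from an IOCTL or from Ctrl+C.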

In other words what happens when userspace closes a BO while the GPU is 
still using it? Would you then run into a GPU reset as well?

I mean it's your driver stack, so I'm not against it as long as you can 
live with it. But it's exactly the thing we wanted to avoid here.

Regards,
Christian.