[RFC] drm/radeon: userfence IOCTL

Mon Apr 13 09:55:19 PDT 2015

On 13.04.2015 18:08, Jerome Glisse wrote:
> On Mon, Apr 13, 2015 at 05:45:21PM +0200, Christian König wrote:
>> On 13.04.2015 17:31, Jerome Glisse wrote:
>>> On Mon, Apr 13, 2015 at 04:52:14PM +0200, Christian König wrote:
>>>> Hello everyone,
>>>>
>>>> we have a requirement for a slightly different kind of fence handling.
>>>> Currently we handle fences completely inside the kernel, but in
>>>> the future we would like to emit multiple fences inside the same
>>>> IB as well.
>>>>
>>>> This works by adding multiple fence commands into an IB which
>>>> just write their value to a specific location inside a BO and
>>>> trigger the appropriate hardware interrupt.
>>>>
>>>> The user part of the driver stack should then be able to call an
>>>> IOCTL to wait for the interrupt and block for the value (or
>>>> something larger) to be written to the specific location.
>>>>
>>>> This has the advantage that you can have multiple synchronization
>>>> points in the same IB and don't need to split up your draw commands
>>>> over several IBs so that the kernel can insert kernel fences in
>>>> between.
>>>>
>>>> The following set of patches tries to implement exactly this IOCTL.
>>>> The big problem with that IOCTL is that TTM needs the BO to be
>>>> kept in the same place while it is mapped inside the kernel page
>>>> table. So this requires that we pin down the BO for the duration
>>>> of the wait IOCTL.
>>>>
>>>> This practically gives userspace a way of pinning down BOs for as
>>>> long as it wants, without any way for the kernel to intervene.
>>>>
>>>> Any ideas how to avoid those problems? Or better ideas how to handle
>>>> the new requirements?
>>> So I think the simplest solution is to only allow such "fence" BOs to
>>> be in system memory (no VRAM for them). My assumption here is that
>>> such a BO will barely see more than a couple of dword writes, so it is
>>> not a bandwidth-intensive BO. Or do you have a requirement for such BOs
>>> to be in VRAM?
>> Not that I know of.
>>
>>> Now to do that I would just add a property to the buffer object that
>>> effectively forbids such a BO from being placed anywhere other than GTT.
>>> Doing that would make the ioctl code simpler: just check that the BO has
>>> the GTT-only property set and, if not, return -EINVAL. Then it's a simple
>>> matter of kmapping the proper page.
>> I've also considered adding an internal flag so that while a buffer is
>> kmapped we avoid moving it to VRAM / swapping it out, but see below.
>>
>>> Note that the only thing left to forbid would be the swapping of the
>>> buffer due to memory pressure (from the various ttm/core shrinkers).
>> Yeah, how the heck would I do that? That's internals of TTM that I never got
>> into.
> Actually I think it is easier than I first thought: in the wait ioctl,
> check if the buffer has a pending fence, i.e. the GPU is still using it.
> If not, return -EAGAIN, because it means that it is pointless to wait for
> the next GPU interrupt.
>
> For as long as the BO has an active fence it will not be swapped out
> (see ttm_bo_swapout()). So in the wait event, test both the value and
> the pending fence. If the pending fence signals but the value has not
> been written, return -EAGAIN. In all cases just keep a kmap of the page
> (do not kmap using the existing kmap helper; we would need something new
> so as not to interfere with the shrinker). Note that after testing the
> value you would need to check that the BO was not moved and thus that the
> page you were looking at is still the one the BO is using.
>
> That way userspace cannot abuse this ioctl to block the shrinker from
> making progress.

So what we do at the start of the IOCTL is check the BO's fences to see
if it is actually still in use, and note its current placement.

Then map it so the kernel can access it, and in the waiting loop check
whether it still has a fence and is still in the same place.

If there aren't any fences left or the placement has changed, we simply
assume that the fence is signaled.

Yeah, that actually should work.
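
Roughly, the loop described above could be sketched like this. It is
only a userspace simulation of the control flow: every name in it
(fake_bo, userfence_wait, the sim_* helpers, the result codes) is
hypothetical and not part of radeon or TTM, the "interrupt" is just a
callback, and locking as well as the real mapping code are left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real radeon/TTM structures. */
enum placement { PLACE_GTT, PLACE_VRAM, PLACE_SYSTEM };

struct fake_bo {
    enum placement placement;  /* where the BO currently lives */
    bool has_pending_fence;    /* true while the GPU still uses the BO */
    uint64_t *fence_slot;      /* mapped location the IB writes values to */
};

enum wait_result { WAIT_SIGNALED, WAIT_ASSUMED_SIGNALED, WAIT_TIMEOUT };

/* Models "sleep until the next interrupt"; here it just mutates the BO. */
typedef void (*irq_step_fn)(struct fake_bo *bo);

enum wait_result userfence_wait(struct fake_bo *bo, uint64_t target,
                                unsigned max_irqs, irq_step_fn wait_irq)
{
    assert(bo && bo->fence_slot && wait_irq);

    /* Start of the IOCTL: an idle BO means there is nothing to wait for. */
    if (!bo->has_pending_fence)
        return WAIT_ASSUMED_SIGNALED;

    /* Note the placement so a later move can be detected. */
    enum placement start = bo->placement;

    for (unsigned i = 0; ; i++) {
        /* No fence left, or the BO moved: assume the fence is signaled. */
        if (!bo->has_pending_fence || bo->placement != start)
            return WAIT_ASSUMED_SIGNALED;

        /* The value, or something larger, has been written. */
        if (*bo->fence_slot >= target)
            return WAIT_SIGNALED;

        if (i == max_irqs)
            return WAIT_TIMEOUT;

        wait_irq(bo);
    }
}

/* Simulated events that can happen while waiting. */
void sim_gpu_writes_next(struct fake_bo *bo) { (*bo->fence_slot)++; }
void sim_bo_moved(struct fake_bo *bo) { bo->placement = PLACE_VRAM; }
void sim_gpu_idle(struct fake_bo *bo) { bo->has_pending_fence = false; }
```

Since the BO is only "protected" while it has a pending fence, nothing
in this loop keeps it pinned; once the fence is gone or the BO has
moved, the wait just ends.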

Thanks for the tip,
Christian.

>
> I need to look at the TTM kmap code to see if it is actually usable without
> disrupting the shrinker. Will do that after lunch.
>
> Cheers,
> Jérôme
>
>> Thanks for the ideas,
>> Christian.
>>
>>> Cheers,
>>> Jérôme