[RFC] drm/radeon: userfence IOCTL

Mon Apr 13 08:35:04 PDT 2015

On 13.04.2015 17:25, Serguei Sagalovitch wrote:
> > the BO to be kept in the same place while it is mapped inside the
> > kernel page table
> ...
> > So this requires that we pin down the BO for the duration of the
> > wait IOCTL.
>
> But my understanding is that it should not be for the duration of the
> "wait" IOCTL, but for the duration of the command buffer execution.
>
> BTW: I would assume that this is not a new scenario.
>
> This is the scenario:
>     - User allocates a BO
>     - User gets a CPU address for the BO
>     - User submits a command buffer that writes to the BO
>     - User can poll, read, or write the BO data with the CPU
>
> So when TTM needs to move the BO to another location, it should also
> update the CPU mapping correctly so that the user always reads / writes
> the correct data.
>
> Did I miss anything?

The problem is that kernel mappings are not updated when TTM moves the 
buffer around. In the case of a swapped-out buffer that wouldn't even be 
possible, because kernel mappings aren't pageable.

You just can't map the BO into kernel space unless you have it pinned 
down, so you can't check the current value written in the BO in your IOCTL.

One alternative is to send all interrupts in question unfiltered to user 
space and let userspace check whether the right value was written. But I 
assume that this would be rather bad for performance.
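
To illustrate, a rough sketch of what that userspace-side check could look 
like. Nothing here is from the patch set: all names are made up, the 32-bit 
sequence value and the wraparound handling are my assumptions, and the 
"observed" array just stands in for the value read from the BO after each 
interrupt.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical: the fence command in the IB writes a 32-bit sequence
 * number into the BO. The wait is satisfied once that value is equal to
 * or larger than the wanted one. The signed difference keeps the
 * comparison correct across 32-bit wraparound (an assumption, the RFC
 * does not say how wraparound is handled).
 */
static bool userfence_signaled(uint32_t current, uint32_t wanted)
{
	return (int32_t)(current - wanted) >= 0;
}

/*
 * Userspace side: called once per interrupt notification, reads the
 * value through the process's own CPU mapping of the BO.
 */
static bool userfence_check(const volatile uint32_t *fence_cpu_addr,
			    uint32_t wanted)
{
	return userfence_signaled(*fence_cpu_addr, wanted);
}

/*
 * Replays the values seen after each interrupt and returns how many
 * wakeups it took until 'wanted' was reached, or -1 if it never was.
 */
static int userfence_wakeups_until(const uint32_t *observed, int n,
				   uint32_t wanted)
{
	for (int i = 0; i < n; i++) {
		if (userfence_check(&observed[i], wanted))
			return i + 1;
	}
	return -1;
}
```

Every entry in "observed" is one wakeup of the process, even when the 
value it is waiting for hasn't been written yet, which is where the 
performance concern comes from.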

Another alternative would be to use the userspace mapping to check the 
BO value, but this approach isn't compatible with a GPU scheduler, since 
you can't really access another process's address space from a device 
driver.

Regards,
Christian.

>
>
> Sincerely yours,
> Serguei Sagalovitch
>
> On 15-04-13 10:52 AM, Christian König wrote:
>> Hello everyone,
>>
>> we have a requirement for a slightly different kind of fence handling. Currently we handle fences completely inside the kernel, but in the future we would like to emit multiple fences inside the same IB as well.
>>
>> This works by adding multiple fence commands into an IB which just write their value to a specific location inside a BO and trigger the appropriate hardware interrupt.
>>
>> The user part of the driver stack should then be able to call an IOCTL to wait for the interrupt and block for the value (or something larger) to be written to the specific location.
>>
>> This has the advantage that you can have multiple synchronization points in the same IB and don't need to split up your draw commands over several IBs so that the kernel can insert kernel fences in between.
>>
>> The following set of patches tries to implement exactly this IOCTL. The big problem with that IOCTL is that TTM needs the BO to be kept in the same place while it is mapped inside the kernel page table. So this requires that we pin down the BO for the duration of the wait IOCTL.
>>
>> This practically gives userspace a way of pinning down BOs for as long as it wants, without any way for the kernel to intervene.
>>
>> Any ideas how to avoid those problems? Or better ideas how to handle the new requirements?
>>
>> Please note that the patches are only hacked together quick&dirty to demonstrate the problem and not very well tested.
>>
>> Best regards,
>> Christian.
>