<html> <head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"> </head> <body bgcolor="#FFFFFF" text="#000000"> <div class="moz-cite-prefix">On 13.04.2015 17:25, Serguei Sagalovitch wrote: </div>
<blockquote cite="mid:552BDFEA.6070806@amd.com" type="cite"> <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
&gt; the BO to be kept in the same place while it is mapped inside the kernel page table ...<br>
&gt; So this requires that we pin down the BO for the duration of the wait IOCTL.<br>
<br>
But my understanding is that it should not be the duration of the "wait" IOCTL but the duration of the command buffer execution.<br>
<br>
BTW: I would assume that this is not a new scenario. The scenario is:<br>
- User allocates a BO<br>
- User gets a CPU address for the BO<br>
- User submits a command buffer that writes to the BO<br>
- User can poll, read or write the BO data from the CPU<br>
<br>
So when TTM needs to move the BO to another location, it should also update the CPU mapping correctly so that the user always reads and writes the correct data. Did I miss anything?
</blockquote>
The problem is that kernel mappings are not updated when TTM moves the buffer around. In the case of a swapped-out buffer that wouldn't even be possible, because kernel mappings aren't pageable.<br>
<br>
You just can't map the BO into kernel space unless you have it pinned down, so you can't check the current value written in the BO in your IOCTL.<br>
<br>
One alternative is to send all interrupts in question unfiltered to userspace and let userspace check whether the right value was written. But I assume that this would be rather bad for performance.<br>
<br>
Another alternative would be to use the userspace mapping to check the BO value, but this approach isn't compatible with a GPU scheduler. For example, you can't really access another process's address space from a device driver.<br>
<br>
Regards,<br>
Christian.
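<div>For illustration only, here is a minimal sketch of the first alternative above, where the interrupt is forwarded to userspace and userspace itself checks whether the fence value (or something larger) has been written. All names in it (fence_value_reached, wait_for_fence, wait_irq_fn, fake_wait_irq) are hypothetical and not taken from the patches. The callback merely stands in for whatever call would block until the next interrupt, and the wraparound-safe comparison is an assumption about how the values would be compared.</div>
<pre wrap="">
```c
/* Sketch only: every name below is hypothetical, none of it is from the patches. */

/* Returns nonzero once `current` has reached `target`. Reinterpreting the
 * unsigned difference as signed keeps the check correct when the counter
 * wraps around (assumes a 32-bit unsigned int and two's complement). */
int fence_value_reached(unsigned int current, unsigned int target)
{
    return (int)(current - target) >= 0;
}

/* Stand-in for "block until the next fence interrupt arrives". */
typedef int (*wait_irq_fn)(void *ctx);

/* Re-check the fence slot through the userspace CPU mapping after every
 * interrupt until the wanted value (or something larger) shows up. */
int wait_for_fence(const volatile unsigned int *slot, unsigned int target,
                   wait_irq_fn wait_irq, void *ctx)
{
    while (!fence_value_reached(*slot, target)) {
        int err = wait_irq(ctx);
        if (err != 0)
            return err; /* e.g. interrupted or timed out */
    }
    return 0;
}

/* Fake "GPU" for trying the loop out: each interrupt bumps the slot by one. */
int fake_wait_irq(void *ctx)
{
    volatile unsigned int *slot = ctx;
    *slot += 1;
    return 0;
}
```
</pre>
<div>In this sketch every interrupt wakes the waiting process, even when the value it is waiting for has not been written yet. That is the performance concern with this alternative.</div>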
<blockquote cite="mid:552BDFEA.6070806@amd.com" type="cite"> Sincerely yours,<br> Serguei Sagalovitch <div class="moz-cite-prefix">On 15-04-13 10:52 AM, Christian König wrote: </div> <blockquote cite="mid:1428936737-19103-1-git-send-email-deathsimple@vodafone.de" type="cite"> <pre wrap="">Hello everyone,

we have a requirement for a slightly different kind of fence handling. Currently we handle fences completely inside the kernel, but in the future we would like to emit multiple fences inside the same IB as well.

This works by adding multiple fence commands into an IB which just write their value to a specific location inside a BO and trigger the appropriate hardware interrupt.

The user part of the driver stack should then be able to call an IOCTL to wait for the interrupt and block until the value (or something larger) has been written to the specific location.

This has the advantage that you can have multiple synchronization points in the same IB and don't need to split up your draw commands over several IBs so that the kernel can insert kernel fences in between.

The following set of patches tries to implement exactly this IOCTL. The big problem with that IOCTL is that TTM needs the BO to be kept in the same place while it is mapped inside the kernel page table. So this requires that we pin down the BO for the duration of the wait IOCTL.

This practically gives userspace a way of pinning down BOs for as long as it wants, without the kernel being able to intervene.

Any ideas how to avoid those problems? Or better ideas how to handle the new requirements?

Please note that the patches are only hacked together quick&amp;dirty to demonstrate the problem and are not very well tested.

Best regards,
Christian.
</pre> </blockquote> </blockquote> </body> </html>