Question about page table updates at BO destroy

Christian König deathsimple at
Wed Mar 22 15:47:43 UTC 2017

Hi Nicolai,

yeah, that is a known issue.

You don't necessarily need to add all fences from the PD to the released 
BO, but immediately starting to clear the PTEs would be a good idea.

amdgpu_gem_object_close() should call amdgpu_vm_clear_freed() if the 
PD/PT are swapped in at that moment.

This leaves only a very small window where the application could access 
freed memory while the PTEs are being cleared.

If we want to close even that window, we could let amdgpu_vm_clear_freed() 
return the fence of the clear operation and add that fence to the BO in question.
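
To make the suggestion above concrete, here is a rough user-space sketch of the 
idea. It is only a toy model, not kernel code: every toy_* name, the structure 
layouts and the seqno-based fence are assumptions made up for illustration and 
do not match the real amdgpu structures or function signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these types are the real amdgpu structures. */

#define TOY_NUM_PTES   8
#define TOY_MAX_FENCES 4

struct toy_fence {
	uint64_t seqno;                 /* hypothetical stand-in for a real fence */
};

struct toy_bo {
	struct toy_fence fences[TOY_MAX_FENCES]; /* stand-in for the BO's fence list */
	size_t num_fences;
};

struct toy_vm {
	bool page_tables_resident;      /* models "PD/PT are swapped in" */
	uint64_t pte[TOY_NUM_PTES];     /* 0 means the entry is unmapped */
	bool freed[TOY_NUM_PTES];       /* mappings queued for clearing */
	uint64_t next_seqno;
};

/*
 * Clear every queued PTE. Returns true and fills *fence when a clear
 * operation was actually issued, so the caller can wait on it.
 */
bool toy_vm_clear_freed(struct toy_vm *vm, struct toy_fence *fence)
{
	bool cleared = false;

	for (size_t i = 0; i < TOY_NUM_PTES; i++) {
		if (vm->freed[i]) {
			vm->pte[i] = 0;
			vm->freed[i] = false;
			cleared = true;
		}
	}
	if (cleared)
		fence->seqno = ++vm->next_seqno;
	return cleared;
}

/*
 * Close path: queue the BO's mappings for clearing, start the clear right
 * away if the page tables are resident, and attach the clear fence to the
 * BO so its memory is not reused before the PTEs are gone.
 */
void toy_gem_object_close(struct toy_vm *vm, struct toy_bo *bo,
			  size_t first, size_t count)
{
	assert(vm && bo);

	for (size_t i = first; i < first + count && i < TOY_NUM_PTES; i++) {
		if (vm->pte[i])
			vm->freed[i] = true;
	}

	if (!vm->page_tables_resident)
		return;                 /* clearing is deferred in this model */

	struct toy_fence fence;
	if (toy_vm_clear_freed(vm, &fence) && bo->num_fences < TOY_MAX_FENCES)
		bo->fences[bo->num_fences++] = fence;
}
```

In this model the memory behind the BO would only be handed to somebody else 
once the fence stored on the BO has signaled, which is what closes the 
remaining window. How the deferred case (page tables not resident) should be 
handled is left open here.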


On 22.03.2017 at 16:06, Nicolai Hähnle wrote:
> Hi all,
> there's a bit of a puzzle where I'm wondering whether there's a subtle 
> bug in the amdgpu kernel module.
> Basically, the concern is that a buggy user space driver might trigger 
> a sequence like this:
> 1. Submit a CS that accesses some BO _without_ adding that BO to the 
> buffer list.
> 2. Free that BO.
> 3. Some other task re-uses the memory underlying the BO.
> 4. The CS is submitted to the hardware and accesses memory that is now 
> already in use by somebody else, since there has been no update to the 
> page tables to reflect the freed BO.
> Obviously there's a user space bug in step 1, but the kernel must 
> still prevent the conflicting memory accesses, and I don't see where 
> it does.
> amdgpu_gem_object_close takes a reservation of the BO and the page 
> directory, but then simply backs off that reservation rather than 
> adding a fence, which I suspect is necessary.
> I believe that whenever we remove a BO from a VM, we must 
> unconditionally add the most recent page directory fence(?) to the BO. 
> Does that sound right?
> Cheers,
> Nicolai
