Question about page table updates at BO destroy

Nicolai Hähnle nhaehnle at gmail.com
Wed Mar 22 15:06:47 UTC 2017


Hi all,

I have run into a bit of a puzzle, and I'm wondering whether there is a 
subtle bug in the amdgpu kernel module.

Basically, the concern is that a buggy user space driver might trigger a 
sequence like this:

1. Submit a CS that accesses some BO _without_ adding that BO to the 
buffer list.
2. Free that BO.
3. Some other task re-uses the memory underlying the BO.
4. The CS finally reaches the hardware and accesses memory that is 
already in use by somebody else, because the page tables were never 
updated to reflect the freed BO.
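
To make the sequence concrete, here is a toy Python model of the four 
steps. It is only a sketch: ToyGpu, its methods and replay_sequence are 
hypothetical names invented for illustration, and nothing here is amdgpu 
code. The model assumes that freeing a BO releases the physical page 
while leaving the page table entry in place, which is exactly the 
behaviour in question.

```python
# Toy model of the four steps. Every name here is hypothetical and
# none of it is amdgpu code.

class ToyGpu:
    def __init__(self):
        self.page_table = {}   # GPU virtual address -> physical page
        self.owner = {}        # physical page -> current owner
        self.queue = []        # jobs submitted but not yet executed

    def map_bo(self, va, page, owner):
        self.page_table[va] = page
        self.owner[page] = owner

    def submit(self, task, va):
        # Step 1: the job touches `va`, but no buffer list is recorded,
        # so nothing ties the job to the BO behind that address.
        self.queue.append((task, va))

    def free_bo(self, page):
        # Step 2: the page is released, yet the page table entry stays.
        del self.owner[page]

    def allocate(self, page, owner):
        # Step 3: another task picks up the same physical page.
        assert page not in self.owner
        self.owner[page] = owner

    def run(self):
        # Step 4: jobs execute against whatever the page table still says.
        conflicts = []
        for task, va in self.queue:
            page = self.page_table[va]
            if self.owner.get(page) != task:
                conflicts.append((task, va, self.owner.get(page)))
        self.queue.clear()
        return conflicts


def replay_sequence():
    gpu = ToyGpu()
    gpu.map_bo(va=0x1000, page=7, owner="task A")
    gpu.submit("task A", 0x1000)   # step 1
    gpu.free_bo(7)                 # step 2
    gpu.allocate(7, "task B")      # step 3
    return gpu.run()               # step 4
```

In this model replay_sequence() reports one conflict: the job from 
task A reaches page 7 through the stale mapping after task B has taken 
ownership of it.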

Obviously there's a user space bug in step 1, but the kernel must still 
prevent the conflicting memory accesses, and I don't see where it does.

amdgpu_gem_object_close takes a reservation of the BO and the page 
directory, but then simply backs off that reservation without adding a 
fence. I suspect that adding a fence at that point is necessary.

I believe that whenever we remove a BO from a VM, we must 
unconditionally add the most recent page directory fence(?) to the BO. 
Does that sound right?
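
The idea can be sketched with another toy Python model. Fence, Bo, Vm and 
the functions below are hypothetical stand-ins, not kernel structures or 
amdgpu functions. The model assumes that every submission leaves a fence 
on the page directory and that a BO's memory may only be reused once all 
fences attached to the BO have signaled.

```python
# Toy model of "add the page directory fence when removing a BO from a
# VM". All names are hypothetical; this is not amdgpu code.

class Fence:
    def __init__(self):
        self.signaled = False


class Bo:
    def __init__(self):
        self.fences = []       # stand-in for the BO's reservation object


class Vm:
    def __init__(self):
        self.last_pd_fence = None   # most recent page directory fence


def submit_cs(vm):
    # A CS that does not list the BO: only the page directory sees
    # the new fence (an assumption of this model).
    fence = Fence()
    vm.last_pd_fence = fence
    return fence


def close_bo(vm, bo, add_pd_fence):
    # With add_pd_fence=False this mirrors "back off the reservation";
    # with True it attaches the page directory fence to the BO first.
    if add_pd_fence and vm.last_pd_fence is not None:
        bo.fences.append(vm.last_pd_fence)


def can_reuse_memory(bo):
    # Memory is reusable once every fence on the BO has signaled.
    return all(f.signaled for f in bo.fences)


def memory_reusable_after_close(add_pd_fence):
    vm, bo = Vm(), Bo()
    job = submit_cs(vm)
    close_bo(vm, bo, add_pd_fence)
    before = can_reuse_memory(bo)   # while the CS is still pending
    job.signaled = True
    after = can_reuse_memory(bo)    # after the CS has finished
    return before, after
```

Without the fence, the model allows the memory to be reused while the CS 
is still pending; with it, reuse is held back until the CS has finished.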

Cheers,
Nicolai


