[PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal

Andrey Grodzovsky Andrey.Grodzovsky at amd.com
Wed Feb 10 22:01:37 UTC 2021

On 2/9/21 10:40 AM, Christian König wrote:
> Am 09.02.21 um 15:30 schrieb Andrey Grodzovsky:
>> [SNIP]
>>>> Question - Why can't we just set those PTEs to point to system memory 
>>>> (another RO dummy page)
>>>> filled with 1s ?
>>> Then writes are not discarded. E.g. the 1s would change to something else.
>>> Christian.
>> I see but, what about marking the mappings as RO and discarding the write 
>> access page faults continuously until the device is finalized ?
>> Regarding using an unused range behind the upper bridge as Daniel suggested, 
>> I wonder will this interfere with
>> the upcoming feature to support BARs movement  during hot plug - 
>> https://www.spinics.net/lists/linux-pci/msg103195.html ?
> In the picture in my head the address will never be used.
> But it doesn't even needs to be an unused range of the root bridge. That one 
> is usually stuffed full by the BIOS.
> For my BAR resize work I looked at quite a bunch of NB documentation. At least 
> for AMD CPUs we should always have an MMIO address which is never ever used. 
> That makes this platform/CPU dependent, but the code is just minimal.
> The really really nice thing about this approach is that you could unit test 
> and audit the code for problems even without *real* hotplug hardware. E.g. we 
> can swap the kernel PTEs and see how the whole power and display code reacts 
> to that etc....
> Christian.

Tried to play with hacking mmio tracer as Daniel suggested but just hanged the 
machine so...

Can you tell me how to dynamically obtain this kind of unused MMIO address ? I 
think given such address, writing
to which is dropped and reading from return all 1s, I can then do something like 
bellow, if that what u meant -

for (address = adev->rmmio; address < adev->rmmio_size; adress += PAGE_SIZE) {

     old_pte = get_locked_pte(init_mm, address)
     dummy_pte = pfn_pte(fake_mmio_address, 0);
     set_pte(&old_pte, dummy_pte)


flush_tlb ???

P.S I hope to obtain thunderbolt eGPU adapter soon so even if this won't work I 
still will be able to test how the driver handles all 1s.


>> Andrey
>>>> Andrey
>>>>>> Christian.
>>>>>>> It's a nifty idea indeed otherwise ...
>>>>>>> -Daniel
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>>>>>> But ugh ...
>>>>>>>>>>>>> Otoh validating an entire driver like amdgpu without such a trick
>>>>>>>>>>>>> against 0xff reads is practically impossible. So maybe you need to 
>>>>>>>>>>>>> add
>>>>>>>>>>>>> this as one of the tasks here?
>>>>>>>>>>>> Or I could just for validation purposes return ~0 from all reg 
>>>>>>>>>>>> reads in the code
>>>>>>>>>>>> and ignore writes if drm_dev_unplugged, this could already easily 
>>>>>>>>>>>> validate a big
>>>>>>>>>>>> portion of the code flow under such scenario.
>>>>>>>>>>> Hm yeah if your really wrap them all, that should work too. Since
>>>>>>>>>>> iommappings have __iomem pointer type, as long as amdgpu is sparse
>>>>>>>>>>> warning free, should be doable to guarantee this.
>>>>>>>>>> Problem is that ~0 is not always a valid register value.
>>>>>>>>>> You would need to audit every register read that it doesn't use the 
>>>>>>>>>> returned
>>>>>>>>>> value blindly as index or similar. That is quite a bit of work.
>>>>>>>>> Yeah that's the entire crux here :-/
>>>>>>>>> -Daniel

More information about the amd-gfx mailing list