[PATCH] drm/[amdgpu|radeon]: fix memset on io mem

Robin Murphy robin.murphy at arm.com
Fri Dec 18 14:42:35 UTC 2020


On 2020-12-18 06:14, Chen Li wrote:
[...]
>>> No, not performance. See standards like OpenGL, Vulkan as well as VA-API and
>>> VDPAU require that you can mmap() device memory and execute memset/memcpy on
>>> the memory from userspace.
>>>
>>> If your ARM-based board can't do that for some reason then you can't use the
>>> hardware with that board.
>>
>> If the VRAM lives in a prefetchable PCI bar then on most sane Arm-based systems
>> I believe it should be able to mmap() to userspace with the Normal memory type,
>> where unaligned accesses and such are allowed, as opposed to the Device memory
>> type intended for MMIO mappings, which has more restrictions but stricter
>> ordering guarantees.
>   
> Hi, Robin. I cannot understand why it allows unaligned accesses. A
> prefetchable PCI BAR should also be MMIO, and accesses will end up as Device
> memory, so why does this allow unaligned access?

Because even Device-GRE is a bit too restrictive to expose to userspace 
that's likely to expect it to behave as regular memory, so, for better 
or worse, we use MT_NORMAL_NC for pgprot_writecombine().

>> Regardless of what happens elsewhere though, if something is mapped *into the
>> kernel* with ioremap(), then it is fundamentally wrong per the kernel memory
>> model to reference that mapping directly without using I/O accessors. That is
>> not specific to any individual architecture, and Sparse should be screaming
>> about it already. I guess in this case the UVD code needs to pay more attention
>> to whether radeon_bo_kmap() ends up going via ttm_bo_ioremap() or not.
>>
>> (I'm assuming the initial fault was memset() with 0 trying to perform "DC ZVA"
>> on a Device-type mapping from ioremap() - FYI a stacktrace on its own without
>> the rest of the error dump showing what actually triggered it isn't overly
>> useful)
>>
>> Robin.
> Why might it be 'DC ZVA'? I'm not sure of the PC for the initial kernel fault
> in memset, but I captured the userspace crash PC: an stp (128-bit) or a NEON
> str (also 128-bit) to the render node (/dev/dri/renderD128).

As I said it was an assumption. I guessed at it being more likely to be 
an MMU fault than an external abort, and given the size and the fact 
that it's a variable initialisation guessed at it being slightly more 
likely to hit the ZVA special-case rather than being unaligned. Looking 
again, I guess starting at an odd-numbered 32-bit element might lead to 
an unaligned store of XZR, but either way it doesn't really matter - 
what it showed is it clearly *could* be an MMU fault because TTM seems 
to be a bit careless with iomem pointers.
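
To illustrate the difference in access pattern, here is a minimal userspace 
sketch, not kernel code. fill_io_style() is a hypothetical name made up for 
this example; in the kernel the existing memset_io() helper is what should 
be used on an ioremap()ed pointer. The sketch only models the idea: byte 
stores up to an 8-byte boundary, aligned 64-bit stores for the bulk, then 
byte stores for the tail, so nothing unaligned is ever issued and the 
compiler cannot substitute an optimised memset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model only: fill_io_style() is a hypothetical helper, not a
 * kernel API. Every store goes through a volatile pointer and is
 * naturally aligned for its size.
 */
static void fill_io_style(volatile void *dst, uint8_t value, size_t count)
{
	volatile uint8_t *p = dst;
	/* Replicate the byte into all eight lanes of a 64-bit word. */
	uint64_t pattern = value * 0x0101010101010101ULL;

	assert(dst != NULL);

	/* Head: single bytes until p reaches an 8-byte boundary. */
	while (count && ((uintptr_t)p & 7)) {
		*p++ = value;
		count--;
	}

	/* Bulk: aligned 64-bit stores. */
	while (count >= 8) {
		*(volatile uint64_t *)p = pattern;
		p += 8;
		count -= 8;
	}

	/* Tail: remaining bytes one at a time. */
	while (count--)
		*p++ = value;
}
```

Calling fill_io_style(base + 3, 0, 41) on a buffer touches exactly bytes 
3..43 and never issues a store wider than its alignment allows, whereas a 
plain memset() is free to use unaligned stores or DC ZVA for the same job.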

That said, if you're also getting external aborts from your host bridge 
not liking 128-bit transactions, then as Christian says you're probably 
going to have a bad time on this platform either way.

Robin.

