[PATCH] drm/[amdgpu|radeon]: fix memset on io mem
Robin Murphy
robin.murphy at arm.com
Fri Dec 18 14:42:35 UTC 2020
On 2020-12-18 06:14, Chen Li wrote:
[...]
>>> No, not performance. See standards like OpenGL, Vulkan as well as VA-API and
>>> VDPAU require that you can mmap() device memory and execute memset/memcpy on
>>> the memory from userspace.
>>>
>>> If your ARM-based board can't do that for some reason then you can't use the hardware
>>> with that board.
>>
>> If the VRAM lives in a prefetchable PCI bar then on most sane Arm-based systems
>> I believe it should be able to mmap() to userspace with the Normal memory type,
>> where unaligned accesses and such are allowed, as opposed to the Device memory
>> type intended for MMIO mappings, which has more restrictions but stricter
>> ordering guarantees.
>
> Hi, Robin. I don't understand why it allows unaligned accesses. A prefetchable PCI BAR should also be MMIO, and accesses will end up as Device memory, so why does this allow unaligned access?
Because even Device-GRE is a bit too restrictive to expose to userspace
that's likely to expect it to behave as regular memory, so, for better
or worse, we use MT_NORMAL_NC for pgprot_writecombine().
>> Regardless of what happens elsewhere though, if something is mapped *into the
>> kernel* with ioremap(), then it is fundamentally wrong per the kernel memory
>> model to reference that mapping directly without using I/O accessors. That is
>> not specific to any individual architecture, and Sparse should be screaming
>> about it already. I guess in this case the UVD code needs to pay more attention
>> to whether radeon_bo_kmap() ends up going via ttm_bo_ioremap() or not.
>>
>> (I'm assuming the initial fault was memset() with 0 trying to perform "DC ZVA"
>> on a Device-type mapping from ioremap() - FYI a stacktrace on its own without
>> the rest of the error dump showing what actually triggered it isn't overly
>> useful)
>>
>> Robin.
> Why might it be 'DC ZVA'? I'm not sure about the PC in the initial kernel fault in memset, but I captured the userspace crash PC: an stp (128-bit) or a NEON str (also 128-bit) to the render node (/dev/dri/renderD128).
As I said it was an assumption. I guessed at it being more likely to be
an MMU fault than an external abort, and given the size and the fact
that it's a variable initialisation guessed at it being slightly more
likely to hit the ZVA special-case rather than being unaligned. Looking
again, I guess starting at an odd-numbered 32-bit element might lead to
an unaligned store of XZR, but either way it doesn't really matter -
what it showed is it clearly *could* be an MMU fault because TTM seems
to be a bit careless with iomem pointers.
That said, if you're also getting external aborts from your host bridge
not liking 128-bit transactions, then as Christian says you're probably
going to have a bad time on this platform either way.
Robin.
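
The point above about ioremap() mappings needing I/O accessors can be
illustrated with a small userspace sketch. This is not the actual
radeon/TTM code: sketch_memset_io() and sketch_clear_mapping() are
hypothetical names, and the byte-wise volatile loop is only a stand-in
for the kernel's real memset_io(). In the kernel, the is_iomem flag
would come from something like ttm_bo_kmap_obj_virtual(), and the
pointer would carry the __iomem annotation so that Sparse can flag
direct dereferences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the kernel's memset_io(). Each store is a
 * single naturally aligned byte write through a volatile pointer, so
 * the compiler cannot merge the loop into wider or unaligned stores,
 * or replace it with a call to an optimised memset() that might use
 * cache-zeroing instructions such as "DC ZVA" on arm64.
 */
static void sketch_memset_io(volatile void *dst, int c, size_t len)
{
    volatile uint8_t *p = dst;

    while (len--)
        *p++ = (uint8_t)c;
}

/*
 * Hypothetical helper: clear a buffer mapping, choosing the accessor
 * based on whether the mapping is I/O memory (is_iomem == true) or
 * ordinary system memory.
 */
static void sketch_clear_mapping(void *vaddr, size_t len, bool is_iomem)
{
    assert(vaddr != NULL);

    if (is_iomem)
        sketch_memset_io(vaddr, 0, len);   /* I/O-safe path */
    else
        memset(vaddr, 0, len);             /* Normal memory: plain memset is fine */
}
```

A caller would pass the flag reported by whatever mapped the buffer,
e.g. sketch_clear_mapping(ptr, size, is_iomem), rather than assuming
every kmap result is safe for a plain memset().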