[PATCH] drm/[amdgpu|radeon]: fix memset on io mem

Chen Li chenli at uniontech.com
Thu Dec 17 13:37:31 UTC 2020

On Thu, 17 Dec 2020 18:25:11 +0800,
Christian König wrote:
> Am 17.12.20 um 02:07 schrieb Chen Li:
> > On Wed, 16 Dec 2020 22:19:11 +0800,
> > Christian König wrote:
> >> Am 16.12.20 um 14:48 schrieb Chen Li:
> >>> On Wed, 16 Dec 2020 15:59:37 +0800,
> >>> Christian König wrote:
> >>>> [SNIP]
> >>> Hi, Christian. I'm not sure why this change is a hack here. I cannot see the problem and will be grateful if you can give more explanation.
> >> __memset is supposed to work on those addresses, otherwise you can't use the
> >> e8860 on your arm64 system.
> > If __memset is supposed to work on those addresses, why is this commit (https://github.com/torvalds/linux/commit/ba0b2275a6781b2f3919d931d63329b5548f6d5f) needed? (I also notice drm/radeon didn't take this change, though.) Just out of curiosity.
> We generally accept those patches as cleanup in the kernel with the hope that we
> can find a way to work around the userspace restrictions.
What's the userspace restriction here? mmap device memory?
> But when you also have this issue in userspace then there isn't much we can do
> for you.
> >> Replacing the direct write in the kernel with calls to writel() or
> >> memset_io() will fix that temporarily, but you have a more general problem here.
> >   I cannot see what the more general problem is here :( Do you mean performance?
> No, not performance. See standards like OpenGL, Vulkan as well as VA-API and
> VDPAU require that you can mmap() device memory and execute memset/memcpy on the
> memory from userspace.
> If your ARM-based board can't do that for some reason, then you can't use the
> hardware with that board.

Good to know, thanks! BTW, have you ever seen or heard of boards like mine that cannot correctly mmap device memory from userspace?
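
For reference, here is a minimal userspace sketch of the distinction discussed above. It assumes the problem is that an optimized memset may use unaligned or otherwise special accesses that a device mapping on this board does not tolerate. io_style_fill() is a hypothetical helper, not a kernel or libc API. It only restricts itself to naturally aligned, fixed-width volatile stores, which is similar in spirit to what memset_io() is meant to guarantee inside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustration only: fill a region using byte stores up to an 8-byte
 * boundary, aligned 64-bit stores for the bulk, and byte stores for
 * the tail. io_style_fill() is a made-up name for this sketch.
 */
static void io_style_fill(volatile void *dst, uint8_t value, size_t count)
{
    volatile uint8_t *p = dst;
    /* Replicate the byte into all eight lanes of a 64-bit word. */
    uint64_t pattern = 0x0101010101010101ULL * value;

    /* Byte stores until the pointer is 8-byte aligned. */
    while (count && ((uintptr_t)p & 7)) {
        *p++ = value;
        count--;
    }

    /* Naturally aligned 64-bit stores for the bulk of the region. */
    while (count >= 8) {
        *(volatile uint64_t *)p = pattern;
        p += 8;
        count -= 8;
    }

    /* Byte stores for whatever is left. */
    while (count--)
        *p++ = value;
}
```

Userspace cannot be forced to use such a helper, since applications are free to call plain memset/memcpy on a mapping they get from OpenGL or Vulkan, which is the general problem described in the quoted reply.
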
> >>>> For amdgpu I suggest that we allocate the UVD message in GTT instead of VRAM
> >>>> since we don't have the hardware restriction for that on the new generations.
> >>>> 
> >>> Thanks, I will try to dig deeper. But what does "hardware restriction" mean here? I'm not familiar with the video driver stack or AMD GPUs, sorry.
> >> On older hardware (AGP days) the buffer had to be in VRAM (MMIO) memory, but on
> >> modern systems GTT (system memory) works as well.
> > IIUC, the e8860 can use amdgpu (I use radeon now) because its device ID 6822 is in amdgpu's table. But I cannot tell whether the e8860 has an IOMMU, and I cannot find an IOMMU in the lspci output, so the graphics translation table may not work here?
> That is not related to IOMMU. IOMMU is a feature of the CPU/motherboard. This is
> implemented using GTT, e.g. the VM page tables inside the GPU.
> And yes, it should work. I will prepare a patch for it.

I think you mean MMU :) See Wikipedia: https://en.wikipedia.org/wiki/Input%E2%80%93output_memory_management_unit

    In computing, an input–output memory management unit (IOMMU) is a memory management unit (MMU) that connects a direct-memory-access–capable (DMA-capable) I/O bus to the main memory. Like a traditional MMU, which translates CPU-visible virtual addresses to physical addresses, the IOMMU maps device-visible virtual addresses (also called device addresses or I/O addresses in this context) to physical addresses. Some units also provide memory protection from faulty or malicious devices.
    An example IOMMU is the graphics address remapping table (GART) used by AGP and PCI Express graphics cards on Intel Architecture and AMD computers.

GART appears to be another name for GTT (https://en.wikipedia.org/wiki/Graphics_address_remapping_table):

    The graphics address remapping table (GART), also known as the graphics aperture remapping table, or graphics translation table (GTT), is an I/O memory management unit (IOMMU) used by Accelerated Graphics Port (AGP) and PCI Express (PCIe) graphics cards.
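
To make the quoted definition concrete, here is a toy sketch of a remapping table that translates device-visible addresses to physical page addresses. All names (toy_gart, toy_gart_bind, toy_gart_translate), the 4 KiB page size and the 8-entry table are assumptions for illustration only. This is not how radeon or amdgpu implement GART/GTT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12                       /* assumed 4 KiB pages */
#define TOY_PAGE_SIZE   (1ULL << TOY_PAGE_SHIFT)
#define TOY_NUM_ENTRIES 8                        /* assumed tiny aperture */
#define TOY_INVALID     UINT64_MAX               /* marks an unbound entry */

/* One physical page base address per device-visible page. */
struct toy_gart {
    uint64_t page_base[TOY_NUM_ENTRIES];
};

/* Mark every entry as unbound. */
static void toy_gart_init(struct toy_gart *g)
{
    for (size_t i = 0; i < TOY_NUM_ENTRIES; i++)
        g->page_base[i] = TOY_INVALID;
}

/* Bind a page-aligned physical address to a table slot; -1 on bad input. */
static int toy_gart_bind(struct toy_gart *g, size_t index, uint64_t phys_base)
{
    if (index >= TOY_NUM_ENTRIES || (phys_base & (TOY_PAGE_SIZE - 1)))
        return -1;
    g->page_base[index] = phys_base;
    return 0;
}

/* Translate a device address; returns TOY_INVALID if the page is unbound. */
static uint64_t toy_gart_translate(const struct toy_gart *g, uint64_t dev_addr)
{
    uint64_t index = dev_addr >> TOY_PAGE_SHIFT;
    uint64_t offset = dev_addr & (TOY_PAGE_SIZE - 1);

    if (index >= TOY_NUM_ENTRIES || g->page_base[index] == TOY_INVALID)
        return TOY_INVALID;
    return g->page_base[index] + offset;
}
```

In this toy model, binding slot 1 to physical page 0x80004000 makes device address 0x1010 resolve to 0x80004010, while any address in an unbound slot fails to translate.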

> >>>> BTW: How does userspace work on arm64 then? The driver stack usually only works
> >>>> if mmio can be mapped directly.
> >>> I also posted two userspace issues on mesa, and you may be interested in them:
> >>>    https://gitlab.freedesktop.org/mesa/mesa/-/issues/3954
> >>>    https://gitlab.freedesktop.org/mesa/mesa/-/issues/3951
> >>> I pasted some userspace virtual memory maps there. (These two problems have bothered me for quite a long time.)
> >> I don't really see a solution for those problems.
> >> 
> >> See, it is perfectly valid for an application to memset/memcpy on mmap()ed MMIO
> >> space which comes from OpenGL or Vulkan.
> >> 
> >> So your CPU simply won't work with the hardware. We could work around that with
> >> a couple of hacks, but this is pretty much a general problem.
> >> 
> >> Regards,
> >> Christian.
> >   Thanks! Can you provide some details about these hacks? Should I post another
> > issue on the mailing list?
> Adjust the kernel and/or user space to never map VRAM to the CPU.
> This violates the OpenGL/Vulkan specification in some ways. So not sure if that
> will work or not.
> Regards,
> Christian.
Well, if I never map VRAM to the CPU, then render nodes like renderD128 can no longer be exposed to userspace, and OpenCL won't work either, right?

The last thing I still cannot figure out is why smplayer can play video with VA-API but not with VDPAU. Both of them should use renderD128 to render.

More information about the dri-devel mailing list