radeon ring 0 test failed on arm64
Christian König
christian.koenig at amd.com
Wed May 26 11:21:28 UTC 2021
Hi Robin,
On 26.05.21 at 12:59, Robin Murphy wrote:
> On 2021-05-26 10:42, Christian König wrote:
>> Hi Robin,
>>
>> On 25.05.21 at 22:09, Robin Murphy wrote:
>>> On 2021-05-25 14:05, Alex Deucher wrote:
>>>> On Tue, May 25, 2021 at 8:56 AM Peter Geis <pgwipeout at gmail.com>
>>>> wrote:
>>>>>
>>>>> On Tue, May 25, 2021 at 8:47 AM Alex Deucher
>>>>> <alexdeucher at gmail.com> wrote:
>>>>>>
>>>>>> On Tue, May 25, 2021 at 8:42 AM Peter Geis <pgwipeout at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Good Evening,
>>>>>>>
>>>>>>> I am stress testing the pcie controller on the rk3566-quartz64
>>>>>>> prototype SBC.
>>>>>>> This device has 1GB available at <0x3 0x00000000> for the PCIe
>>>>>>> controller, which makes a dGPU theoretically possible.
>>>>>>> While attempting to light off a HD7570 card I manage to get a
>>>>>>> modeset
>>>>>>> console, but ring0 test fails and disables acceleration.
>>>>>>>
>>>>>>> Note, we do not have UEFI, so all PCIe setup is from the Linux
>>>>>>> kernel.
>>>>>>> Any insight you can provide would be much appreciated.
>>>>>>
>>>>>> Does your platform support PCIe cache coherency with the CPU? I.e.,
>>>>>> does the CPU allow cache snoops from PCIe devices? That is required
>>>>>> for the driver to operate.
>>>>>
>>>>> Ah, most likely not.
>>>>> This issue has come up already as the GIC isn't permitted to snoop on
>>>>> the CPUs, so I doubt the PCIe controller can either.
>>>>>
>>>>> Is there no way to work around this or is it dead in the water?
>>>>
>>>> It's required by the pcie spec. You could potentially work around it
>>>> if you can allocate uncached memory for DMA, but I don't think that is
>>>> possible currently. Ideally we'd figure out some way to detect if a
>>>> particular platform supports cache snooping or not as well.
>>>
>>> There's device_get_dma_attr(), although I don't think it will work
>>> currently for PCI devices without an OF or ACPI node - we could
>>> perhaps do with a PCI-specific wrapper which can walk up and defer
>>> to the host bridge's firmware description as necessary.
>>>
>>> The common DMA ops *do* correctly keep track of per-device coherency
>>> internally, but drivers aren't supposed to be poking at that
>>> information directly.
>>
>> That sounds like you underestimate the problem. ARM has unfortunately
>> made the coherency for PCI an optional IP.
>
> Sorry to be that guy, but I'm involved a lot internally with our
> system IP and interconnect, and I probably understand the situation
> better than 99% of the community ;)
I need to apologize, I didn't realize who was answering :)
It just sounded to me as if you wanted to suggest to the end user that
this is fixable in software, and I really wanted to avoid even more
customers coming around asking how to do this.
> For the record, the SBSA specification (the closest thing we have to a
> "system architecture") does require that PCIe is integrated in an
> I/O-coherent manner, but we don't have any control over what people do
> in embedded applications (note that we don't make PCIe IP at all, and
> there is plenty of 3rd-party interconnect IP).
So basically it is not the fault of the ARM IP core, but people are just
stitching together PCIe interconnect IP with a core it is not
supposed to be used with.
Do I understand that correctly? That's an interesting puzzle piece in the picture.
>> So we are talking about a hardware limitation which potentially can't
>> be fixed without replacing the hardware.
>
> You expressed interest in "some way to detect if a particular platform
> supports cache snooping or not", by which I assumed you meant a
> software method for the amdgpu/radeon drivers to call, rather than,
> say, a website that driver maintainers can look up SoC names on. I'm
> saying that that API already exists (just may need a bit more work).
> Note that it is emphatically not a platform-level thing since
> coherency can and does vary per device within a system.
Well, I think this is not something an individual driver should mess
with. What the driver should do is just express that it needs coherent
access to all of system memory and, if that is not possible, fail to load
with a warning explaining why.
>
> I wasn't suggesting that Linux could somehow make coherency magically
> work when the signals don't physically exist in the interconnect - I
> was assuming you'd merely want to do something like throw a big
> warning and taint the kernel to help triage bug reports. Some drivers
> like ahci_qoriq and panfrost simply need to know so they can program
> their device to emit the appropriate memory attributes either way, and
> rely on the DMA API to hide the rest of the difference, but if you
> want to treat non-coherent use as unsupported because it would require
> too invasive changes that's fine by me.
Yes, exactly that please. I mean, I'm not sure how panfrost is doing it, but
at least the Vulkan userspace API specification requires devices to have
coherent access to system memory.
So even if I wanted to do this, it is simply not possible because the
application doesn't tell the driver which memory is accessed by the
device and which by the CPU.
Christian.
>
> Robin.