<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<br>
<br>
<div class="moz-cite-prefix">On 18.03.22 at 08:51, Kever
Yang wrote:<br>
</div>
<blockquote type="cite" cite="mid:7652b236-238c-4e8a-f1c5-e3b7f7f71be6@rock-chips.com">
<br>
On 2022/3/17 20:19, Peter Geis wrote:
<br>
<blockquote type="cite">On Wed, Mar 16, 2022 at 11:08 PM Kever
Yang <a class="moz-txt-link-rfc2396E" href="mailto:kever.yang@rock-chips.com">&lt;kever.yang@rock-chips.com&gt;</a> wrote:
<br>
<blockquote type="cite">Hi Peter,
<br>
<br>
On 2022/3/17 08:14, Peter Geis wrote:
<br>
<blockquote type="cite">Good Evening,
<br>
<br>
I apologize for raising this email chain from the dead, but
there have
<br>
been some developments that have introduced even more
questions.
<br>
I've looped the Rockchip mailing list into this too, as this
affects
<br>
rk356x, and likely the upcoming rk3588 if [1] is to be
believed.
<br>
<br>
TLDR for those not familiar: It seems the rk356x series (and
possibly
<br>
the rk3588) were built without any outer coherent cache.
<br>
This means (unless Rockchip wants to clarify here) devices
such as the
<br>
ITS and PCIe cannot utilize cache snooping.
<br>
This is based on the results of the email chain [2].
<br>
<br>
The new circumstances are as follows:
<br>
The RPi CM4 Adventure Team, as I've taken to calling them, has
been
<br>
attempting to get a dGPU working with the very broken
Broadcom
<br>
controller in the RPi CM4.
<br>
Recently they acquired a SoQuartz rk3566 module which is pin
<br>
compatible with the CM4, and have taken to trying it out as
well.
<br>
<br>
This is how I got involved.
<br>
It seems they found a trivial way to force the Radeon R600
driver to
<br>
use Non-Cached memory for everything.
<br>
This single line change, combined with using memset_io
instead of
<br>
memset, allows the ring tests to pass and the card probes
successfully
<br>
(minus the DMA limitations of the rk356x due to the 32 bit
<br>
interconnect).
<br>
I discovered using this method that we start having
unaligned io
<br>
memory access faults (bus errors) when running glmark2-drm
(running
<br>
glmark2 directly was impossible, as both X and Wayland
crashed too
<br>
early).
<br>
I traced this to using what I thought at the time was an
unsafe memcpy
<br>
in the mesa stack.
<br>
Rewriting this function to force aligned writes solved the
problem and
<br>
allows glmark2-drm to run to completion.
<br>
With some extensive debugging, I found about half a dozen
memcpy
<br>
functions in mesa that if forced to be aligned would allow
Wayland to
<br>
start, but with hilarious display corruption (see [3], [4]).
<br>
The CM4 team is convinced this is an issue with memcpy in
glibc, but
<br>
I'm not convinced it's that simple.
<br>
<br>
On my two hour drive in to work this morning, I got to
thinking.
<br>
If this were a memcpy fault, this would be universally
broken on arm64
<br>
which is obviously not the case.
<br>
So I started thinking, what is different here than with
systems known to work:
<br>
1. No IOMMU for the PCIe controller.
<br>
2. The Outer Cache Issue.
<br>
<br>
Robin:
<br>
My questions for you, since you're the smartest person I
know about
<br>
arm64 memory management:
<br>
Could cache snooping permit unaligned accesses to IO to be
safe?
<br>
Or
<br>
Is it the lack of an IOMMU that's causing the alignment
faults to become fatal?
<br>
Or
<br>
Am I insane here?
<br>
<br>
Rockchip:
<br>
Please update on the status for the Outer Cache errata for
ITS services.
<br>
</blockquote>
Our SoC design team has double-checked with the ARM GIC/ITS IP team
many
<br>
times, and the GITS_CBASER
<br>
of the GIC600 IP does not support hardware binding or configuration to a fixed
value, so
<br>
they insist this is an IP
<br>
limitation rather than a SoC bug, and that software should take care of
it :(
<br>
I will check again whether we can provide errata for this issue.
<br>
</blockquote>
Thanks. This is necessary as the mbi-alias provides an imperfect
<br>
implementation of the ITS and causes certain PCIe cards (e.g. the Intel X520
<br>
10G NIC) to misbehave.
<br>
<br>
<blockquote type="cite">
<blockquote type="cite">Please provide an answer to the errata
of the PCIe controller, in
<br>
regard to cache snooping and buffering, for both the rk356x
and the
<br>
upcoming rk3588.
<br>
</blockquote>
<br>
Sorry, what is this?
<br>
</blockquote>
Part of the ITS bug is it expects to be cache coherent with the
CPU
<br>
cluster by design.
<br>
Due to the rk356x being implemented without an outer accessible
cache,
<br>
the ITS and other devices that require cache coherency (PCIe for
<br>
example) crash in fun ways.
<br>
</blockquote>
Then this is still the ITS issue, not PCIe issue.
<br>
PCIe is a peripheral bus controller like USB and other devices; the
driver should maintain "cache coherency" where it is needed, and
there is no requirement for hardware cache coherency between PCIe
and the CPU.</blockquote>
<br>
Well, then I suggest re-reading the PCIe specification.<br>
<br>
Cache coherency is defined as mandatory there. Non-coherent access
is an optional feature.<br>
<br>
See section 2.2.6.5 in the
PCIe 2.0 specification for a good example.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote type="cite" cite="mid:7652b236-238c-4e8a-f1c5-e3b7f7f71be6@rock-chips.com">
<br>
We haven't seen any transfer errors on rk356x PCIe so far; we can
take a look if it's easy to reproduce.
<br>
<br>
Thanks,
<br>
- Kever
<br>
<br>
<br>
<blockquote type="cite">This means that rk356x cannot implement a
specification compliant ITS or PCIe.
<br>
From the rk3588 source dump it appears it was produced
without an
<br>
outer accessible cache, which means if true it also will be
unable to
<br>
use any PCIe cards that implement cache coherency as part of
their
<br>
design.
<br>
<br>
<blockquote type="cite">
<br>
Thanks,
<br>
- Kever
<br>
<blockquote type="cite">[1]
<a class="moz-txt-link-freetext" href="https://github.com/JeffyCN/mirrors/commit/0b985f29304dcb9d644174edacb67298e8049d4f">https://github.com/JeffyCN/mirrors/commit/0b985f29304dcb9d644174edacb67298e8049d4f</a><br>
[2]
<a class="moz-txt-link-freetext" href="https://lore.kernel.org/lkml/871rbdt4tu.wl-maz@kernel.org/T/">https://lore.kernel.org/lkml/871rbdt4tu.wl-maz@kernel.org/T/</a><br>
[3]
<a class="moz-txt-link-freetext" href="https://cdn.discordapp.com/attachments/926487797844541510/953414755970850816/unknown.png">https://cdn.discordapp.com/attachments/926487797844541510/953414755970850816/unknown.png</a><br>
[4]
<a class="moz-txt-link-freetext" href="https://cdn.discordapp.com/attachments/926487797844541510/953424952042852422/unknown.png">https://cdn.discordapp.com/attachments/926487797844541510/953424952042852422/unknown.png</a><br>
<br>
Thank you everyone for your time.
<br>
<br>
Very Respectfully,
<br>
Peter Geis
<br>
<br>
On Wed, May 26, 2021 at 7:21 AM Christian König
<br>
<a class="moz-txt-link-rfc2396E" href="mailto:christian.koenig@amd.com">&lt;christian.koenig@amd.com&gt;</a> wrote:
<br>
<blockquote type="cite">Hi Robin,
<br>
<br>
On 26.05.21 at 12:59, Robin Murphy wrote:
<br>
<blockquote type="cite">On 2021-05-26 10:42, Christian
König wrote:
<br>
<blockquote type="cite">Hi Robin,
<br>
<br>
On 25.05.21 at 22:09, Robin Murphy wrote:
<br>
<blockquote type="cite">On 2021-05-25 14:05, Alex
Deucher wrote:
<br>
<blockquote type="cite">On Tue, May 25, 2021 at 8:56
AM Peter Geis <a class="moz-txt-link-rfc2396E" href="mailto:pgwipeout@gmail.com">&lt;pgwipeout@gmail.com&gt;</a>
<br>
wrote:
<br>
<blockquote type="cite">On Tue, May 25, 2021 at
8:47 AM Alex Deucher
<br>
<a class="moz-txt-link-rfc2396E" href="mailto:alexdeucher@gmail.com">&lt;alexdeucher@gmail.com&gt;</a> wrote:
<br>
<blockquote type="cite">On Tue, May 25, 2021 at
8:42 AM Peter Geis <a class="moz-txt-link-rfc2396E" href="mailto:pgwipeout@gmail.com">&lt;pgwipeout@gmail.com&gt;</a>
<br>
wrote:
<br>
<blockquote type="cite">Good Evening,
<br>
<br>
I am stress testing the pcie controller on
the rk3566-quartz64
<br>
prototype SBC.
<br>
This device has 1GB available at &lt;0x3
0x00000000&gt; for the PCIe
<br>
controller, which makes a dGPU theoretically
possible.
<br>
While attempting to light off a HD7570 card
I manage to get a
<br>
modeset
<br>
console, but ring0 test fails and disables
acceleration.
<br>
<br>
Note, we do not have UEFI, so all PCIe setup
is from the Linux
<br>
kernel.
<br>
Any insight you can provide would be much
appreciated.
<br>
</blockquote>
Does your platform support PCIe cache
coherency with the CPU? I.e.,
<br>
does the CPU allow cache snoops from PCIe
devices? That is required
<br>
for the driver to operate.
<br>
</blockquote>
Ah, most likely not.
<br>
This issue has come up already as the GIC isn't
permitted to snoop on
<br>
the CPUs, so I doubt the PCIe controller can
either.
<br>
<br>
Is there no way to work around this or is it
dead in the water?
<br>
</blockquote>
It's required by the pcie spec. You could
potentially work around it
<br>
if you can allocate uncached memory for DMA, but I
don't think that is
<br>
possible currently. Ideally we'd figure out some
way to detect if a
<br>
particular platform supports cache snooping or not
as well.
<br>
</blockquote>
There's device_get_dma_attr(), although I don't
think it will work
<br>
currently for PCI devices without an OF or ACPI node
- we could
<br>
perhaps do with a PCI-specific wrapper which can
walk up and defer
<br>
to the host bridge's firmware description as
necessary.
<br>
<br>
The common DMA ops *do* correctly keep track of
per-device coherency
<br>
internally, but drivers aren't supposed to be poking
at that
<br>
information directly.
<br>
</blockquote>
That sounds like you underestimate the problem. ARM
has unfortunately
<br>
made the coherency for PCI an optional IP.
<br>
</blockquote>
Sorry to be that guy, but I'm involved a lot internally
with our
<br>
system IP and interconnect, and I probably understand
the situation
<br>
better than 99% of the community ;)
<br>
</blockquote>
I need to apologize, I didn't realize who was answering :)
<br>
<br>
It just sounded to me that you wanted to suggest to the
end user that
<br>
this is fixable in software and I really wanted to avoid
even more
<br>
customers coming around asking how to do this.
<br>
<br>
<blockquote type="cite">For the record, the SBSA
specification (the closest thing we have to a
<br>
"system architecture") does require that PCIe is
integrated in an
<br>
I/O-coherent manner, but we don't have any control over
what people do
<br>
in embedded applications (note that we don't make PCIe
IP at all, and
<br>
there is plenty of 3rd-party interconnect IP).
<br>
</blockquote>
So basically it is not the fault of the ARM IP-core, but
people are just
<br>
stitching together PCIe interconnect IP with a core it is
not
<br>
supposed to be used with.
<br>
<br>
Do I get that correctly? That's an interesting puzzle
piece in the picture.
<br>
<br>
<blockquote type="cite">
<blockquote type="cite">So we are talking about a
hardware limitation which potentially can't
<br>
be fixed without replacing the hardware.
<br>
</blockquote>
You expressed interest in "some way to detect if a
particular platform
<br>
supports cache snooping or not", by which I assumed you
meant a
<br>
software method for the amdgpu/radeon drivers to call,
rather than,
<br>
say, a website that driver maintainers can look up SoC
names on. I'm
<br>
saying that that API already exists (just may need a bit
more work).
<br>
Note that it is emphatically not a platform-level thing
since
<br>
coherency can and does vary per device within a system.
<br>
</blockquote>
Well, I think this is not something an individual driver
should mess
<br>
with. What the driver should do is just express that it
needs coherent
<br>
access to all of system memory and if that is not possible
fail to load
<br>
with a warning why it is not possible.
<br>
<br>
<blockquote type="cite">I wasn't suggesting that Linux
could somehow make coherency magically
<br>
work when the signals don't physically exist in the
interconnect - I
<br>
was assuming you'd merely want to do something like
throw a big
<br>
warning and taint the kernel to help triage bug reports.
Some drivers
<br>
like ahci_qoriq and panfrost simply need to know so they
can program
<br>
their device to emit the appropriate memory attributes
either way, and
<br>
rely on the DMA API to hide the rest of the difference,
but if you
<br>
want to treat non-coherent use as unsupported because it
would require
<br>
too invasive changes that's fine by me.
<br>
</blockquote>
Yes, exactly that please. I mean, I'm not sure how panfrost is
doing it, but
<br>
at least the Vulkan userspace API specification requires
devices to have
<br>
coherent access to system memory.
<br>
<br>
So even if I wanted to do this, it is simply not
possible because the
<br>
application doesn't tell the driver which memory is
accessed by the
<br>
device and which by the CPU.
<br>
<br>
Christian.
<br>
<br>
<blockquote type="cite">Robin.
<br>
</blockquote>
</blockquote>
</blockquote>
_______________________________________________
<br>
Linux-rockchip mailing list
<br>
<a class="moz-txt-link-abbreviated" href="mailto:Linux-rockchip@lists.infradead.org">Linux-rockchip@lists.infradead.org</a>
<br>
<a class="moz-txt-link-freetext" href="http://lists.infradead.org/mailman/listinfo/linux-rockchip">http://lists.infradead.org/mailman/listinfo/linux-rockchip</a>
<br>
</blockquote>
</blockquote>
</blockquote>
<br>
</body>
</html>