PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)
Icenowy Zheng
uwu at icenowy.me
Thu Jul 4 06:40:16 UTC 2024
在 2024-07-03星期三的 23:11 -0700,Christoph Hellwig写道:
> On Thu, Jul 04, 2024 at 10:00:52AM +0800, Icenowy Zheng wrote:
> > So I here want to ask a question as an individual hacker: what's
> > the
> > policy of linux-pci towards these non-coherent PCIe
> > implementations?
> >
> > If the sentences of Christian is right, these implementations are
> > just
> > out-of-spec, should them get purged out of the kernel, or at least
> > raising a warning that some HW won't work because of inconformant
> > implementation?
>
> Nothing in the PCIe specifications that mandates a programming model.
> Non-coherent DMA is extremely common in lower end devices, and
> despite
> all the issues that it causes well supported in Linux.
>
> What are you trying to solve?
Currently the DRM TTM subsystem (and GPU drivers using it) will assume
coherency and fail on these non-coherent systems with cryptic error
messages (like `[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx
test failed (-110)`) without mentioning coherency issues at all.
My original patchset tries to solve this problem by make the TTM
subsystem sensible of coherency status (and prevent CPU-side cached
mapping when non-coherent), but got argued by TTM maintainer and the
maintainer says TTM's ignorance on non-coherent systems is intentional.
>
More information about the dri-devel
mailing list