Why can't ttm_tt_swapout() handle uncached or WC BOs?
Daniel Vetter
daniel at ffwll.ch
Wed Sep 23 09:29:11 UTC 2020
On Wed, Sep 23, 2020 at 11:24 AM Hellstrom, Thomas
<thomas.hellstrom at intel.com> wrote:
>
> On Wed, 2020-09-23 at 13:17 +1000, Dave Airlie wrote:
> > On Fri, 18 Sep 2020 at 00:49, Christian König <
> > christian.koenig at amd.com> wrote:
> > > Am 17.09.20 um 16:44 schrieb Michel Dänzer:
> > > > On 2020-09-17 2:20 p.m., Christian König wrote:
> > > > > Hi guys,
> > > > >
> > > > > Michel once submitted a patch to fix triggering this BUG_ON in
> > > > > ttm_tt_swapout():
> > > > >
> > > > > > BUG_ON(ttm->caching_state != tt_cached);
> > > > >
> > > > > Now my question is does anybody know why we have that in the
> > > > > first
> > > > > place?
> > > > >
> > > > > The only problematic thing I can see is calling copy_highpage()
> > > > > and
> > > > > that one is just doing a kmap_atomic()/kunmap_atomic() on the
> > > > > source
> > > > > and destination.
> > > > >
> > > > > I can't see why it should be problematic for this temporary
> > > > > mapping
> > > > > to be cached instead of uncached or WC?
> > > > >
> > > > > Does anybody has any idea?
> > > >
> > > > One thing is that AFAIK some (ARM?) CPUs can get very upset when
> > > > there's both a cached and uncacheable mapping for the same
> > > > physical page.
> > >
> > > Good point, but I already considered this.
> > >
> > > If there is another mapping for that page left we are completely
> > > busted
> > > anyway since we are currently changing the underlying storage.
> > >
> >
> > It's not just ARM of course, x86 PAT also has many issues about
> > multiple mappings of the same backing page with different attributes.
> >
> > The only mappings might be in the linear mapping, since not all pages
> > we get here will necessarily be high pages I assume and therefore
> > could have a linear mapping. If we've changed the linear mapping to
> > non-cached, then create a cached kmap I'm not 100% that won't cause
> > issues.
> >
> > but this is a definite maze of twisty passages and I'd probably need
> > to sit down and break it a bit more.
> >
> > Dave.
>
> This is a problem that goes back way far (12+) years that the x86
> architecture is not allowed to do aliased mappings of pages, even
> through mappable GTTs: There are two aspects:
>
> 1) Create a WC mapping of a page with data in the cache. When the cache
> does a writeback, anything written through the WC mapping will get
> overwritten.
>
> 2) Flushing the WB mappings first might not help, since at that time
> there were some AMD processors (Athlons?) that were speculatively
> prefetching data on the WB mapping into the cache at any time, and even
> though it wasn't changed it did a writeback. Anything written through
> WC in-between the prefetch and the writeback got overwritten. That was
> a real and occuring problem at that time. AMD claimed it was not
> violating specs.
>
> So aliased mappings is probably at best very fragile.
Intel has an architectural guarantee (apparently) that clean
cachelines get dropped, never written back. We heavily rely on that
with all our cacheline flushing tricks for integrated i915.ko. Since
SoC keep being non-coherent for the gpu for perf reasons I don't
expect these kind of games to go away anytime soon (we even do these
cache flushes in userspace nowadays, to cut out the kernel).
For discrete it's a different story ofc.
-Daniel
>
> /Thomas
>
>
>
>
> ----------------------------------------------------------------------
> Intel Sweden AB
> Registered Office: Isafjordsgatan 30B, 164 40 Kista, Stockholm, Sweden
> Registration Number: 556189-6027
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the dri-devel
mailing list