[Freedreno] [PATCH 04/16] drm: msm: Flush the cache immediately after allocating pages
Jordan Crouse
jcrouse at codeaurora.org
Mon Nov 7 18:01:15 UTC 2016
On Mon, Nov 07, 2016 at 07:19:24AM -0500, Rob Clark wrote:
> On Mon, Nov 7, 2016 at 3:35 AM, Archit Taneja <architt at codeaurora.org> wrote:
> >
> >
> > On 11/06/2016 07:45 PM, Rob Clark wrote:
> >>
> >> On Fri, Nov 4, 2016 at 6:44 PM, Jordan Crouse <jcrouse at codeaurora.org>
> >> wrote:
> >>>
> >>> For reasons that are not entirely understood, using dma_map_sg()
> >>> for noncached/write-combined buffers doesn't always successfully
> >>> flush the cache after the memory is zeroed somewhere deep in the
> >>> bowels of the shmem code. My working theory is that the cache
> >>> flush done on the swiotlb bounce buffer address isn't always
> >>> flushing what we need.
> >>>
> >>> Instead of using dma_map_sg(), kmap and flush each page directly
> >>> at allocation time. We could use invalidate + clean, or just
> >>> invalidate, but on ARM64 a full flush is safer and not much
> >>> slower for what we are trying to do.
> >>>
> >>> Hopefully someday I'll understand the relationship between shmem,
> >>> kmap, vmap and the swiotlb bounce buffer more clearly, and we can
> >>> be smarter about when and how we invalidate the caches.
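
For illustration, the per-page flush boils down to something like this
(an untested sketch, not the actual patch - flush_new_pages() is a
made-up name, and __dma_flush_range() is the arm64-internal
clean+invalidate helper, so treat it as pseudocode on other
architectures):

#include <linux/highmem.h>
#include <linux/mm.h>
#include <asm/cacheflush.h>

/*
 * Flush each freshly allocated page through a temporary kernel
 * mapping so no dirty cache lines linger behind the uncached/
 * write-combined GPU mapping.
 */
static void flush_new_pages(struct page **pages, int npages)
{
	int i;

	for (i = 0; i < npages; i++) {
		void *addr = kmap_atomic(pages[i]);

		/* clean + invalidate out to the point of coherency */
		__dma_flush_range(addr, addr + PAGE_SIZE);
		kunmap_atomic(addr);
	}
}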
> >>
> >>
> >> Like I mentioned on irc, we definitely don't want to ever hit
> >> bounce buffers. I think the problem here is that dma-mapping
> >> assumes we only support 32-bit physical addresses, which is not
> >> true (at least as long as we have the iommu)..
> >>
> >> Archit hit a similar problem on the display side of things.
> >
> >
> > Yeah, the shmem-allocated pages sometimes ended up being 33-bit
> > addresses on db820c. The msm driver sets the dma mask to a default
> > of 32 bits. The dma mapping api gets unhappy whenever we get sg
> > chunks with 33-bit addresses, and tries to use swiotlb for them.
> > We eventually end up overflowing the swiotlb.
> >
> > Setting the mask to 33 bits worked as a temporary hack.
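
As a quick way to spot when that happens, something like this (purely a
debugging sketch - check_sg_under_mask() is a made-up name) would warn
whenever an sg chunk lands above the device's DMA mask, i.e. exactly
the chunks that dma_map_sg() would route through swiotlb:

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static void check_sg_under_mask(struct device *dev, struct sg_table *sgt)
{
	struct scatterlist *sg;
	int i;

	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
		phys_addr_t phys = sg_phys(sg);

		if (phys + sg->length - 1 > dma_get_mask(dev))
			dev_warn(dev, "sg chunk %d at %pa is above the DMA mask\n",
				 i, &phys);
	}
}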
>
> actually I think setting the mask is the correct thing to do, but it
> should probably go through the dma_set_mask() family of functions..
> and I suspect that the PA space for the iommu is larger than 33 bits.
> It is probably equal to the number of address lines that are wired
> up. Not quite sure what that is, but I don't think there are any
> devices where the iommu cannot map some physical pages.

The GPU/MMU combo has supported 48-bit PAs since at least the 4XX era. I'm
not sure whether a 48-bit mask would break the older targets - you won't
get any addressable memory above 32 bits on those devices anyway.

I'll try the dma_set_mask() trick with 48 bits and see how it pans out.
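
Something along these lines in the probe path, presumably (an untested
sketch - msm_set_dma_mask() is just an illustrative name, and whether
48 bits is safe on every target is exactly the open question above):

#include <linux/dma-mapping.h>

/*
 * Widen the default 32-bit DMA mask to the 48 bits the GPU MMU can
 * address, so shmem pages above 4G aren't bounced through swiotlb.
 * The 48-bit figure is an assumption based on the discussion above.
 */
static int msm_set_dma_mask(struct device *dev)
{
	return dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
}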
Jordan
--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project