amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code

Thu Jan 10 17:52:26 UTC 2019

On Thu, 10 Jan 2019 at 15:48, Christoph Hellwig <hch at lst.de> wrote:
>
> On Thu, Jan 10, 2019 at 03:00:31PM +0100, Christian König wrote:
> > >>  From the trace it looks like we hit the case where swiotlb tries
> >> to copy back data from a bounce buffer, but hits a dangling or NULL
> >> pointer.  So a couple questions for the submitter:
> >>
> >>   - does the system have more than 4GB memory and thus use swiotlb?
> >>     (check /proc/meminfo, and if something SWIOTLB appears in dmesg)
> >>   - does the device this happens on have a DMA mask smaller than
> >>     the available memory, that is should swiotlb be used here to start
> >>     with?
> >
> > Rather unlikely. The device is an AMD GPU, so we can address memory up to
> > 1TB.
>
> So we probably somehow got a false positive.
>
> For now I'd like the reporter to confirm that the dma_direct_unmap_page+0x92
> backtrace really is in the swiotlb code (I can't think of anything else,
> but I'd rather be sure).
I'm not sure what you want me to confirm. Could you elaborate?

>
> Second it would be great to print what the contents of io_tlb_start
> and io_tlb_end are, e.g. by doing a printk_once in is_swiotlb_buffer,
> maybe that gives a clue why we are hitting the swiotlb code here.

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 7c007ed7505f..042246dbae00 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -69,6 +69,7 @@ extern phys_addr_t io_tlb_start, io_tlb_end;

 static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 {
+    printk_once(KERN_INFO "io_tlb_start: %llu, io_tlb_end: %llu", io_tlb_start, io_tlb_end);
     return paddr >= io_tlb_start && paddr < io_tlb_end;
 }

Result on boot:
[   11.405558] io_tlb_start: 3782983680, io_tlb_end: 3850092544
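
To make those two decimal values easier to compare with dmesg or /proc/iomem, here is a minimal userspace sketch (not kernel code) that prints the range in hex, computes its size, and mirrors the half-open range check from is_swiotlb_buffer(). The helper names in_bounce_range, range_size_mib and print_range are hypothetical and exist only for this illustration; the constants are copied from the boot output above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the printk_once output above. */
#define IO_TLB_START 3782983680ULL
#define IO_TLB_END   3850092544ULL

/*
 * Userspace stand-in for the range check in is_swiotlb_buffer():
 * true if paddr lies in the half-open range [start, end).
 * (Hypothetical helper, for illustration only.)
 */
static inline bool in_bounce_range(uint64_t paddr, uint64_t start, uint64_t end)
{
    return paddr >= start && paddr < end;
}

/* Size of the half-open range [start, end) in MiB. */
static inline uint64_t range_size_mib(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return (end - start) >> 20;
}

/* Print the range in hex together with its size. */
static inline void print_range(uint64_t start, uint64_t end)
{
    printf("bounce buffer: 0x%llx-0x%llx (%llu MiB)\n",
           (unsigned long long)start,
           (unsigned long long)end,
           (unsigned long long)range_size_mib(start, end));
}
```

Calling print_range(IO_TLB_START, IO_TLB_END) from a main() shows a 64 MiB range that lies entirely below 4 GiB, so a swiotlb bounce buffer does appear to be allocated on this machine.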

Regards,

Sibren