[Intel-gfx] Failure with swiotlb

Eric Anholt eric at anholt.net
Mon Jan 4 22:11:56 CET 2010


On Mon, 4 Jan 2010 17:27:45 +0800, Zhenyu Wang <zhenyuw at linux.intel.com> wrote:
> On 2009.12.31 12:33:06 +0800, Zhenyu Wang wrote:
> > On 2009.12.30 10:26:27 +0000, David Woodhouse wrote:
> > > On Wed, 2009-12-30 at 11:02 +0800, Zhenyu Wang wrote:
> > > > We have a .31->.32 regression, as reported in
> > > > http://bugs.freedesktop.org/show_bug.cgi?id=25690
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=14627
> > > > 
> > > > It's triggered on non-VT-d machines (or machines that should have VT-d
> > > > but have no way to turn it on in the BIOS) with large memory, where swiotlb
> > > > is used for PCI dma ops. swiotlb uses a bounce buffer to copy between
> > > > CPU pages and the real pages used for DMA, but we can't make it really coherent,
> > > > as we don't call pci_dma_sync_single_for_cpu() and similar APIs. And on a GEM
> > > > domain change, we also can't flush pages for the bounce buffer. It looks like
> > > > our usual non-cache-coherent graphics device can't get along with swiotlb.
> > > > 
> > > > This patch tries to handle pci dma mapping only when real iommu
> > > > hardware is detected; the only case for that is VT-d. It falls back to the original
> > > > method of inserting physical pages directly in other cases. This fixes the
> > > > GPU hang on our Q965 with 8G memory in a 64-bit OS. Comments?
> > > 
> > > I don't understand. Why is swiotlb doing anything here anyway, when the
> > > device has a dma_mask of 36 bits?
> > > 
> > > Shouldn't dma_capable() return 1, causing swiotlb_map_page() to return
> > > the original address unmangled?
> > 
> > Good point. I didn't look into the swiotlb code, because my debugging showed it returned
> > a mangled dma address. So it looks like the real problem is that the 36-bit dma mask got
> > corrupted somehow, which matches the first report in fd.o bug 25690.
> > 
> > Looks like we should set up the dma mask in the drm/i915 driver too, as they both operate on
> > the graphics device. But I can't test that on our 8G mem machine until after new year.
> > 
> 
> Finally caught it! It's within drm_pci_alloc(), which tries to set up the dma mask
> for the pci_dev again! That function is used for the physical-address-based hardware status
> page on 965G (i915_init_phys_hws()), which is allocated with the pci coherent interface. But
> trying to set the mask again in an alloc function looks wrong to me; drivers should set up
> their own consistent dma mask according to the hw.
> 
> So the following patch tries to remove the mask setting in drm_pci_alloc(), which fixes
> the original problem, as the dma mask now has the right 36-bit setting on intel hw. I
> can't test whether the ati bits look correct. Dave?
> 
> As the intel hws page does support 36-bit physical addresses, there will be another patch
> to set up the pci consistent 36-bit mask for it. Any comments?

Looks like this patch doesn't set the dma mask that used to get set for
the drivers that were relying on it.  Once all the drivers are fixed to
set it up at load time, this seems like a good interface fix.
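
For anyone following the thread, here is a rough userspace model of the
failure mode discussed above. It is only a sketch, not kernel code: the
struct, the helper names and the MODEL_BIT_MASK macro are all made up for
illustration, and the 32-bit value used when the alloc helper narrows the
mask is an assumption (the thread only says the mask gets set again). The
check itself follows the usual "last byte of the buffer must fit under the
mask" rule that David's dma_capable() question refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: all-ones mask of n bits (n between 1 and 64). */
#define MODEL_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-in for the bit of device state we care about. */
struct model_dev {
	uint64_t dma_mask;
};

/* A buffer is directly reachable if its last byte fits under the mask. */
static bool model_dma_capable(const struct model_dev *dev,
			      uint64_t addr, uint64_t size)
{
	return addr + size - 1 <= dev->dma_mask;
}

/* A swiotlb-style mapper only bounces what the device cannot reach. */
static bool model_needs_bounce(const struct model_dev *dev,
			       uint64_t addr, uint64_t size)
{
	return !model_dma_capable(dev, addr, size);
}

/*
 * Models an alloc helper that sets the device mask as a side effect.
 * ASSUMPTION: the caller passes a 32-bit limit; the thread does not
 * say which value was actually used.
 */
static void model_alloc_narrowing_mask(struct model_dev *dev, uint64_t maxaddr)
{
	dev->dma_mask = maxaddr;
}

/* Walks through the sequence described in the thread. */
static void model_demo(void)
{
	struct model_dev gpu = { .dma_mask = MODEL_BIT_MASK(36) };
	uint64_t high_page = 6ULL << 30;	/* a page at 6 GiB on an 8G box */

	/* With the driver's 36-bit mask, the page maps through unmangled. */
	assert(!model_needs_bounce(&gpu, high_page, 4096));

	/* The alloc helper quietly narrows the mask... */
	model_alloc_narrowing_mask(&gpu, MODEL_BIT_MASK(32));

	/* ...and the same page now has to go through the bounce buffer. */
	assert(model_needs_bounce(&gpu, high_page, 4096));
}
```

Calling model_demo() from a small main() runs the sequence. In this model,
pages below 4 GiB are unaffected by the narrowed mask, which would be
consistent with the hang only showing up on large-memory machines.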