[PATCH] drm/radeon: Disable writeback by default on ppc

Alex Deucher alexdeucher at gmail.com
Wed Dec 4 15:56:38 PST 2013


On Wed, Dec 4, 2013 at 5:16 PM, Kleber Sacilotto de Souza
<klebers at linux.vnet.ibm.com> wrote:
> On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote:
>>
>> On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote:
>>>
>>> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
>>>>
>>>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
>>>>>
>>>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>>>>>
>>>>>> Weird.  I wonder if there is an issue with cache snoops on PPC.  We
>>>>>> currently use the gart in cached mode (GPU snoops CPU cache) with
>>>>>> cached pages.  I wonder if we need to use uncached pages on PPC.
>>>>>
>>>>> There is no such issue and no known bugs with DMA writes on those
>>>>> PCIe host bridges (and they do get hammered pretty bad here).
>>>>>
>>>>> This needs further investigation by the lab/hw guys to find out what's
>>>>> actually happening on the bus and the host bridge.
>>>>>
>>>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
>>>>> continuously writes to a spare scratch register (thus triggering the
>>>>> corresponding writeback DMA) and checks the memory location to compare
>>>>> the writeback value (using a debugfs file for example, or mmap).
>
>
> I was not able to reproduce the issue with this method, even after a weekend
> run.
>
> However, doing some more investigation it seems the problem is here, where
> we read the ring rptr:
>
> u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
>                                  struct radeon_ring *ring)
> {
>         u32 rptr;
>
>         if (rdev->wb.enabled)
>                 rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
>         else
>                 rptr = RREG32(ring->rptr_reg);
>
>         return rptr;
> }
>
> I realized that the DMA'ed rptr value has always the opposite byte order
> from the MMIO value. Since RREG32 already returns the register value on the
> CPU byte order, it seems we don't need to byte-swap the DMA'ed value. If I
> remove the le32_to_cpu() call and use the DMA'ed value directly, I don't get
> the IB scheduling failures and piglit results are the same as with writeback
> disabled.
>
> Is the adapter chipset swapping the bytes before doing the DMA to a
> big-endian host?

Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte
swapping for just about everything accessed by the CP (rptr writeback,
indirect buffers, etc.).  Looks like the DMA ring supports and enables
rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so
I think we can drop the swapping of the rptr writeback.

Alex

>
>
>
> --
> Kleber Sacilotto de Souza
> IBM Linux Technology Center
>


More information about the dri-devel mailing list