drm/radeon: "ring test failed" on PA-RISC Linux

Fri Sep 20 14:27:07 PDT 2013

On Tue, Sep 17, 2013 at 3:33 PM, Alex Ivanov <gnidorah at p0n4ik.tk> wrote:
> On 17.09.2013, at 18:24, Alex Deucher <alexdeucher at gmail.com> wrote:
>
>> On Tue, Sep 17, 2013 at 5:23 AM, Alex Ivanov <gnidorah at p0n4ik.tk> wrote:
>>> Alex,
>>>
>>> On 10.09.2013, at 16:37, Alex Deucher <alexdeucher at gmail.com> wrote:
>>>
>>>> The dummy page isn't really going to help much.  That page is just
>>>> used as a safety placeholder for gart entries that aren't mapped on
>>>> the GPU.  TTM (drivers/gpu/drm/ttm) actually does the allocation of
>>>> the backing pages for the gart.
>>>
>>>> You may want to look there.
>>>
>>> Ah, sorry. Indeed. Though, my idea with:
>>>
>>> On Tue, Sep 10, 2013 at 5:20 AM, Alex Ivanov <gnidorah at p0n4ik.tk> wrote:
>>>
>>>> Thanks! I'll try. Meanwhile I've tried switching from page_alloc() to
>>>> dma_alloc_coherent() in radeon_dummy_page_*(), which didn't help :(
>>>
>>> doesn't make sense for the TTM part either.
>>
>> After the driver is loaded, you can dump some info from debugfs:
>> r100_rbbm_info
>> r100_cp_ring_info
>> r100_cp_csq_fifo
>> These dump a bunch of registers and internal fifos so we can see
>> what the chip actually processed.
>>
>> Alex
>
> Reading r100_cp_ring_info leads to a kernel panic (KP):
>
> r100_debugfs_cp_ring_info():
> count = (rdp + ring->ring_size - wdp) & ring->ptr_mask;
> i = (rdp + j) & ring->ptr_mask;
>
>         for (j = 0; j <= count; j++) {
>                 i = (rdp + j) & ring->ptr_mask;
>                 --> Here at first iteration <--
>                 --> count = 262080, i = 0 <--
>                 seq_printf(m, "r[%04d]=0x%08x\n", i, ring->ring[i]);
>         }
>
> Reading radeon_ring_gfx (which I've additionally tried to read)
> throws an MCE:
>
> radeon_debugfs_ring_info():
> count = (ring->ring_size / 4) - ring->ring_free_dw;
> i = (ring->rptr + ring->ptr_mask + 1 - 32) & ring->ptr_mask;
>
>         for (j = 0; j <= (count + 32); j++) {
>                 --> Here at first iteration <--
>                 --> i = 262112, j = 0 <--
>                 seq_printf(m, "r[%5d]=0x%08x\n", i, ring->ring[i]);
>                 i = (i + 1) & ring->ptr_mask;
>         }
>
> I'm attaching debug output from a kernel built with these loops commented out.
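
The values reported in the two loops above can be reproduced with a small
user-space sketch of the same index arithmetic. It assumes a 1 MiB ring
(ring_size in bytes, ptr_mask = ring_size / 4 - 1) with rdp = rptr = 0 and
wdp = 64. None of these values are stated in the thread; they are simply
ones that yield the reported numbers. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ring geometry: 1 MiB of 32-bit dwords (not stated in the thread). */
#define RING_SIZE_BYTES (1024u * 1024u)
#define PTR_MASK        ((RING_SIZE_BYTES / 4u) - 1u)

/* Same expression as quoted from r100_debugfs_cp_ring_info(). */
static uint32_t r100_dump_count(uint32_t rdp, uint32_t wdp)
{
    return (rdp + RING_SIZE_BYTES - wdp) & PTR_MASK;
}

/* Same expression as quoted from radeon_debugfs_ring_info():
 * the dump starts 32 dwords before rptr, wrapping around the ring. */
static uint32_t ring_dump_start(uint32_t rptr)
{
    return (rptr + PTR_MASK + 1u - 32u) & PTR_MASK;
}

/* Check the sketch against the values reported in the thread.
 * rdp = rptr = 0 and wdp = 64 are assumptions chosen to match them. */
static void check_reported_values(void)
{
    assert(r100_dump_count(0, 64) == 262080);
    assert(ring_dump_start(0) == 262112);
}
```

Under these assumptions both first-iteration indices (0 and 262112) lie
inside a ring of 262144 dwords, so the index arithmetic alone would not
explain the fault; the CPU access to ring->ring[i] itself looks like the
more likely culprit.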

The register writes seem to be going through the register backbone correctly:

[0x00B] 0x15E0=0x00000000
[0x00C] 0x15E4=0xCAFEDEAD
[0x00D] 0x4274=0x0000000F
[0x00E] 0x42C8=0x00000007
[0x00F] 0x4018=0x0000001D
[0x010] 0x170C=0x80000000
[0x011] 0x3428=0x00020100
[0x012] 0x15E4=0xCAFEDEAD

You can see the 0xCAFEDEAD written to the scratch register via MMIO
from ring_test().  The CP fifo, however, seems to be full of garbage.
The CP is busy though, so it seems to be functional.  I guess it's
just fetching garbage rather than commands.
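
As a rough model of why the scratch register still reads 0xCAFEDEAD: the
ring test first writes that value over MMIO, then queues a command on the
ring asking the CP to overwrite it, and finally polls the register. The
sketch below is a toy simulation, not driver code. struct fake_gpu and the
fake_* functions are hypothetical, and 0xDEADBEEF as the value the CP is
expected to write is an assumption that does not come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCRATCH_MMIO_VALUE 0xCAFEDEADu /* written by the CPU via MMIO */
#define SCRATCH_CP_VALUE   0xDEADBEEFu /* assumed value written by the CP */

/* Hypothetical stand-in for the GPU: one scratch register plus a flag
 * saying whether the CP fetches the dwords the CPU actually wrote. */
struct fake_gpu {
    uint32_t scratch;
    bool cp_sees_cpu_writes;
};

/* Model of the CP consuming one "write scratch register" command. */
static void fake_cp_execute(struct fake_gpu *gpu, uint32_t value)
{
    if (gpu->cp_sees_cpu_writes)
        gpu->scratch = value; /* command decoded correctly */
    /* Otherwise the CP fetched garbage and never touches the scratch. */
}

/* Returns 0 on success, -1 if the scratch register never changes. */
static int fake_ring_test(struct fake_gpu *gpu)
{
    gpu->scratch = SCRATCH_MMIO_VALUE;      /* step 1: MMIO write */
    fake_cp_execute(gpu, SCRATCH_CP_VALUE); /* step 2: command via ring */
    return gpu->scratch == SCRATCH_CP_VALUE ? 0 : -1; /* step 3: check */
}
```

In this model a working MMIO path combined with a ring the CP cannot read
coherently leaves 0xCAFEDEAD in the scratch register, which matches the
register dump.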

Does doing a posted write when writing to the ring buffer help?

diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index a890756..b4f04d2 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -324,12 +324,14 @@ static int radeon_debugfs_ring_init(struct radeon_device *rdev, struct radeon_ri
  */
 void radeon_ring_write(struct radeon_ring *ring, uint32_t v)
 {
+       u32 tmp;
 #if DRM_DEBUG_CODE
        if (ring->count_dw <= 0) {
                DRM_ERROR("radeon: writing more dwords to the ring than expected!\n");
        }
 #endif
        ring->ring[ring->wptr++] = v;
+       tmp = ring->ring[ring->wptr - 1];
        ring->wptr &= ring->ptr_mask;
        ring->count_dw--;
        ring->ring_free_dw--;