drm/radeon: "ring test failed" on PA-RISC Linux

Mon Sep 9 10:43:01 PDT 2013

On Mon, Sep 9, 2013 at 12:44 PM, Alex Ivanov <gnidorah at p0n4ik.tk> wrote:
> Folks,
>
> We (people at linux-parisc @ vger.kernel.org mail list) are trying to make
> native video options of the latest PA-RISC servers and workstations
> (these are ATIs, most of which are based on R100/R300/R420 chips) work
> correctly on this platform (big endian pa-risc).
>
> However, we haven't had much success. DRM fails every time with
> "ring test failed" for both AGP & PCI.
>
> Maybe you would give us some suggestions that we could check?
>
> Topic started here:
> http://www.spinics.net/lists/linux-parisc/msg04908.html
> And continued there:
> http://www.spinics.net/lists/linux-parisc/msg04995.html
> http://www.spinics.net/lists/linux-parisc/msg05006.html
>
> Problems we've already resolved, without any sign of progress:
> - Verified that the microcode loads successfully.
> - "parisc AGP GART code writes IOMMU entries in the wrong byte order and
>   doesn't add the coherency information SBA code adds"
> - "our PCI BAR setup doesn't really work very well together with the Radeon
>   DRM address setup. DRM will generate addresses, which are even outside
>   of the connected LBA"
>
> Things planned for a check:
> "The drivers/video/aty uses
> an endian config bit DRM doesn't use, but I haven't tested whether
> this makes a difference and how it is connected to the overall picture."

I don't think that will make any difference.  radeon kms works fine on
other big endian platforms such as powerpc.

>
> "The Rage128 product revealed a weakness in some motherboard
> chipsets in that there is no mechanism to guarantee
> that data written by the CPU to memory is actually in a readable
> state before the Graphics Controller receives an
> update to its copy of the Write Pointer. In an effort to alleviate this
> problem, we've introduced a mechanism into the
> Graphics Controller that will delay the actual write to the Write Pointer
> for some programmable amount of time, in
> order to give the chipset time to flush its internal write buffers to
> memory.
> There are two register fields that control this mechanism:
> PRE_WRITE_TIMER and PRE_WRITE_LIMIT.
>
> In the radeon DRM codebase I didn't find anyone using/setting
> those registers. Maybe PA-RISC has some problem here?..."

I doubt it.  If you are using AGP, I'd suggest disabling it and first
trying to get things working using the on-chip gart rather than AGP.
Load radeon with agpmode=-1.  The on-chip gart always uses cache
snooped pci transactions and the driver assumes pci is cache coherent.
On AGP/PCI chips, the on-chip gart mechanism stores the gart table in
system ram.  On PCIE asics, the gart table is stored in vram.  The
gart page table maps system pages to a contiguous aperture in the
GPU's address space.  The ring lives in gart memory.  The GPU sees a
contiguous buffer and the gart mechanism handles the access to the
backing pages via the page table.  I'd suggest verifying that the
entries written to the gart page table are valid and then that the
information written to the ring buffer is valid before updating the
ring's wptr in radeon_ring_unlock_commit().  Changing the wptr is what
causes the CP to start fetching data from the ring.
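
To make the two checks above concrete, here is a rough userspace sketch,
not driver code.  It assumes 4 KiB gart pages and a table of 32-bit
little-endian entries that each hold a page-aligned bus address; both are
assumptions of this model, so check them against the gart code for your
asic.  All names below (put_le32, get_le32, gart_first_bad_entry,
gart_translate, ring_first_mismatch) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GART_PAGE_SHIFT 12u                     /* assumed 4 KiB pages */
#define GART_PAGE_SIZE  (1u << GART_PAGE_SHIFT)

/* Store a 32-bit value in little-endian byte order on any host. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Load a 32-bit little-endian value on any host. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Check 1: every table entry must decode to the page-aligned bus address
 * we meant to map.  Returns the index of the first bad entry, or -1.
 * An entry stored in the wrong byte order shows up here.
 */
static long gart_first_bad_entry(const uint8_t *table,
                                 const uint32_t *expected, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t e = get_le32(table + 4 * i);
        if (e != expected[i] || (e & (GART_PAGE_SIZE - 1)) != 0)
            return (long)i;
    }
    return -1;
}

/* Model of the lookup: aperture offset -> bus address of backing page. */
static uint32_t gart_translate(const uint8_t *table, uint32_t offset)
{
    assert(table != NULL);
    uint32_t page = offset >> GART_PAGE_SHIFT;
    return get_le32(table + 4 * page) | (offset & (GART_PAGE_SIZE - 1));
}

/*
 * Check 2: before the wptr is bumped, the dwords read back from the ring
 * must match what was queued.  Returns the first mismatching index, or -1.
 */
static long ring_first_mismatch(const volatile uint32_t *ring,
                                const uint32_t *queued, size_t ndw)
{
    for (size_t i = 0; i < ndw; i++) {
        if (ring[i] != queued[i])
            return (long)i;
    }
    return -1;
}
```

In the real driver the equivalent would be debug prints of the table
entries and of the ring contents just before the wptr update in
radeon_ring_unlock_commit().  A CPU-side read-back only shows what the CPU
sees, so a match here does not prove the GPU fetches the same data.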

Alex

>
> Thanks.
>
> -------- Forwarded message --------
> 04.08.2013, 15:06, "Alex Ivanov" <gnidorah at p0n4ik.tk>:
>
> 11.07.2013, 23:48, "Helge Deller" <deller at gmx.de>:
>
>>  adding linux parisc mailing list...:
>>
>>  On 07/11/2013 09:46 PM, Helge Deller wrote:
>>>   On 07/10/2013 11:29 PM, Alex Ivanov wrote:
>>>>   11.07.2013, 01:14, "Matt Turner" <mattst88 at gmail.com>:
>>>>>   On Wed, Jul 10, 2013 at 1:19 PM, Alex Ivanov <gnidorah at p0n4ik.tk> wrote:
>>>>>>    Thank you so much! Your guess looks to be right. After applying your
>>>>>>    patch there was no more KP and X just worked.
>>>>>   Nice! Does DRI work?
>>>>   Not on my side. Plus I can't visually get beyond 8-bit depth, although Xorg
>>>>   states 24-bit in its log.
>>>>   As for DRI, I'm experiencing
>>>>   "ring test failed (scratch(0x15E4)=0xCAFEDEAD)" with a firegl x3.
>>>   FWIW, I'm seeing the same failure on my FireGL X1:
>>>   80:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Radeon R300 NG [FireGL X1] (rev 80)
>>>
>>>   [drm] radeon: irq initialized.
>>>   [drm] Loading R300 Microcode
>>>   [drm] radeon: ring at 0x0000000060001000
>>>   [drm:r100_ring_test] *ERROR* radeon: ring test failed (scratch(0x15E4)=0xCAFEDEAD)
>>>   [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
>>>   radeon 0000:80:00.0: failed initializing CP (-22).
>>>   radeon 0000:80:00.0: Disabling GPU acceleration
>>>   [drm:r100_cp_fini] *ERROR* Wait for CP idle timeout, shutting down CP.
>>>   [drm] radeon: cp finalized
>>>   [drm] radeon: cp finalized
>
> I still have no clue why this happens. Broken SBA IOMMU / DRM code? Missing syncing primitives?
> Should we forward this to the dri-devel mailing list?
> -------- End of forwarded message --------