Radeon R700 multi-ring bug

Alex Deucher alexdeucher at gmail.com
Sat Apr 19 08:19:42 PDT 2014


On Sat, Apr 19, 2014 at 11:07 AM, Marek Olšák <maraeo at gmail.com> wrote:
> This test always reproduces the issue for me:
>
> piglit/bin/arb_vertex_buffer_object-vbo-subdata-many drawarrays -fbo -auto
>
> Some IBs are rejected, and the GPU sometimes hangs.
>
> It started to fail with this commit:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=6d434252e239bc872549e59c64eb3d0e5dab0655
>
> which is probably unrelated to the issue, but it makes the graphics IB
> a little bit bigger.
>
> Also, I think R700 is generally in bad shape. I haven't been able to
> run piglit with concurrency and without hangs, even though I have
> already disabled async DMA, geometry shaders, and pipelined buffer
> uploads.

See if disabling DPM (e.g. booting with radeon.dpm=0) helps.

Alex

>
> Marek
>
> On Sat, Apr 19, 2014 at 11:54 AM, Christian König
> <deathsimple at vodafone.de> wrote:
>> Hi Marek,
>>
>> I've noticed this before as well, and I agree that it looks like memory
>> corruption. I'm not sure whether the async DMA on the GPU or the CPU is
>> overwriting something, perhaps because of a race condition.
>>
>> Anyway, can you come up with a simple test case to reproduce the issue? For
>> me it occurred only randomly while I was working on UVD support for R7xx.
>> If you have something more reliable, I could dig into it with my RV710.
>>
>> Christian.
>>
>> On 19.04.2014 01:48, Marek Olšák wrote:
>>>
>>> Hi,
>>>
>>> If you submit a lot of graphics and DMA IBs interleaved, the graphics
>>> CS checker sometimes fails with this message:
>>>
>>> [ 3846.435661] Forbidden register 0x0014 in cs at 9
>>> [ 3846.435664] [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !
>>>
>>> This error is only raised for type-0 packets, but we don't use these
>>> packets on R700 at all. Somehow, the graphics CS checker received
>>> either the DMA IB or random garbage. My guess is there is memory
>>> corruption happening during IB uploading and/or IB checking in the
>>> kernel. Also, if you are unlucky, the GPU hangs instead.
>>>
>>> CS thread offloading was disabled in Mesa, so user space was
>>> single-threaded.
>>>
>>> There are two ways to work around this:
>>> - disable async DMA in Mesa
>>> - call usleep(1) after the RADEON_CS ioctl returns
>>>
>>> This is just a heads-up. In the worst case, we can disable async DMA
>>> for R700 in Mesa.
>>>
>>> Marek
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>>

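Marek's original report notes that the "Forbidden register" error is
only raised for type-0 packets. The sketch below shows how a command
stream header dword might be classified. The bit layout (type in bits
31:30, count in bits 29:16, register dword index in bits 15:0 for
type 0) is an assumption about the PM4 packet format and is not stated
anywhere in this thread, and the helper names are hypothetical, not
kernel or Mesa functions.

```c
#include <stdint.h>

/* Assumed PM4 header layout (not taken from the thread):
 *   bits 31:30  packet type (0..3)
 *   bits 29:16  count field
 *   bits 15:0   type-0 only: register offset in dwords
 * All helper names here are hypothetical. */

/* Extract the packet type from a header dword. */
static inline unsigned pkt_type(uint32_t header)
{
    return (header >> 30) & 0x3u;
}

/* Extract the 14-bit count field from a header dword. */
static inline unsigned pkt_count(uint32_t header)
{
    return (header >> 16) & 0x3FFFu;
}

/* For a type-0 header, convert the dword index to a byte offset. */
static inline unsigned pkt0_reg(uint32_t header)
{
    return (header & 0xFFFFu) << 2;
}
```

Under that assumed layout, any dword whose top two bits are zero
decodes as a type-0 packet. A stray value such as 0x00000005 would
decode as type 0 with register offset 0x0014, so arbitrary small
integers in the buffer could plausibly produce the kind of error
quoted in the report.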

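The second workaround in Marek's original report, calling usleep(1)
after the RADEON_CS ioctl returns, could be sketched roughly as
follows. The callback type, the wrapper, and the stub are all
hypothetical stand-ins so that the example is self-contained; they
are not Mesa or libdrm APIs, and the real submission path is not
shown in this thread.

```c
#define _DEFAULT_SOURCE /* exposes usleep() on glibc */
#include <unistd.h>

/* Hypothetical callback standing in for whatever code issues the
 * RADEON_CS ioctl in the real driver. */
typedef int (*submit_cs_fn)(void *ctx);

/* Submit one command stream, then sleep briefly before returning,
 * mirroring the usleep(1) workaround described in the report. */
static int submit_cs_with_delay(submit_cs_fn submit, void *ctx)
{
    int ret = submit(ctx); /* stands in for the RADEON_CS ioctl */

    usleep(1);             /* the delay that hid the problem */
    return ret;
}

/* Hypothetical stub used only to exercise the wrapper: it counts
 * calls and reports success. */
static int fake_submit(void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    return 0;
}
```

Since a delay this short only changes timing, it is better read as a
way to narrow down the suspected race than as a fix.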