[Mesa-dev] [PATCH] [rfc] r600: set vpm bit for loop start clause

Dave Airlie airlied at gmail.com
Tue Aug 7 02:34:55 UTC 2018


On 5 August 2018 at 19:47, Gert Wollny <gw.fossdev at gmail.com> wrote:
> Am Freitag, den 03.08.2018, 06:02 +0200 schrieb Roland Scheidegger:
>> Am 03.08.2018 um 05:10 schrieb Dave Airlie:
>> > From: Dave Airlie <airlied at redhat.com>
>> >
>> > This fixes some hangs with the arb_shader_image_load_store-
>> > atomicity tests
>> > on evergreen/cayman GPUs.
>> >
>> > I'm not 100% sure why (VPM hurts my brain), I'm running some piglit
>> > runs to see if it has any bad side effects.
>> > ---
>> >  src/gallium/drivers/r600/r600_shader.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/src/gallium/drivers/r600/r600_shader.c
>> > b/src/gallium/drivers/r600/r600_shader.c
>> > index 2229dc8fab3..1341bf2c48c 100644
>> > --- a/src/gallium/drivers/r600/r600_shader.c
>> > +++ b/src/gallium/drivers/r600/r600_shader.c
>> > @@ -10587,7 +10587,7 @@ static int tgsi_bgnloop(struct
>> > r600_shader_ctx *ctx)
>> >     /* LOOP_START_DX10 ignores the LOOP_CONFIG* registers, so
>> > it is not
>> >      * limited to 4096 iterations, like the other LOOP_*
>> > instructions. */
>> >     r600_bytecode_add_cfinst(ctx->bc, CF_OP_LOOP_START_DX10);
>> > -
>> > +   ctx->bc->cf_last->vpm = 1;
>> >     fc_pushlevel(ctx, FC_LOOP);
>> >
>> >     /* check stack depth */
>> >
>>
>> I think you might get incorrect derivatives afterwards (if some
>> pixels in a quad were invalid)? Or are the lanes in the alu still
>> active (I
>> don't really understand all the vpm stuff...) afterwards?
>>
>> I'm wondering if the tests are just bad? You can get lockups rather
>> easily if you're not extra careful with atomics in loops (because the
>> result of the atomic is undefined for helper invocations, and the hw
>> will never store the result).
>> There was recently some discussion about this on mesa-dev, also see
>> https://bugs.freedesktop.org/show_bug.cgi?id=106902
>
> I might add that I've also played around to get rid of these locks, at
> least to be able to run the gpu test set completely, so I did one
> thing, I limited the number of iterations by using an extra counter,
> and with that this particular test passed, which means that some
> instance of the shader is really running the loop with bogous data.
> This patch fixed the unmodified piglit for me.
>
> Another piglit,
>
> arb_shader_storage_buffer_object/
>    execution/ssbo-atomicCompSwap-int.shader_test
>
> still hangs for some time (on barts), but it has an extra protection
> against helper invocations. This test also fails when I put in such a
> savety counter to break the loop after 1000 iterations,  so I guess the
> failure is geniue, but failing this test should not result in a GPU
> lockup. I guess tests that might lockup a GPU are needed to be able to
> debug problems, but IMHO they should not be part of some of these
> general test sets.
>
> Because of these lockups I have no data to compare older runs. With
> this patch and the ssbo test fixed up I got
>
> [26236/26236] skip: 10688, pass: 15118, warn: 13, fail: 412, crash: 5
>
> for the full gpu set on BARTS in single thread mode, on CEDAR I got
> some other, unrelated  GPU lockups.
>
> Looking only at the texture piglits that failed, there were no
> regressions with respect to this patch.
>
> When running piglit multithreaded I still get lockups that bring down
> the graphics card so that it was no longer fully usable and a cold
> restart is needed. In these cases various piglits that run tests with
> atomics are running in parallel, and notable the process running
> arb_compute_shader-local-id could not be killed a all.

I might have sent a patch to hopefully handle deadlock with parallel
atomic counter tests


https://patchwork.freedesktop.org/series/47787/


>
> In any case, this patch looks like an enhacement to me, because it
> makes lockups less likely, and I didn't see regressions, so
>   Tested-By: Gert Wollny <gw.fossdev at gmail.com>

Thanks, I'm probably going to revise this so only certain loops get vpm
on them,

Dave.


More information about the mesa-dev mailing list