[Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman

Martin Andersson g02maran at gmail.com
Fri Apr 12 12:36:11 PDT 2013


I have made some progress with this issue.

Vadim, I did as you suggested and tried to mimic the output from the
shader analyser
tool. I used your patch as a base and then tried various ways to see
what would work.
After many tries (and lockups) I did managed to get the
ext_transform_feedback/order
test to pass.

It is a very ugly patch but it should illustrate what the problem (and
potential solution) is.

Your test program fails however because explicit break statements do
not work. It
should be possible to use the same code for the explicit breaks as for
the implicit
loop break.The reason it does not is that I detect the implicit break
with a hack and
it does notwork for explicit breaks.

The problem is that I need to detect the break statement when creating the
corresponding if statement. So that I can treat it differently than
other "regular" if
statements. Anyone knows how I could do that, or is this the wrong approach?

//Martin

On Thu, Apr 11, 2013 at 5:31 PM, Vadim Girlin <vadimgirlin at gmail.com> wrote:
> On 04/11/2013 02:08 AM, Marek Olšák wrote:
>>
>> Here's the output:
>>
>> creating vs ...
>> shader compilation status: OK
>> creating fs ...
>> shader compilation status: OK
>> thread #0 (0;0) : ref = 16608
>> thread #1 (1;0) : ref = 27873
>> thread #2 (0;1) : ref = 16608
>> thread #3 (1;1) : ref = 27877
>> results:
>>   thread 0 (0, 0): expected = 16608, observed = 27876, FAIL
>>   thread 1 (1, 0): expected = 27873, observed = 27873, OK
>>   thread 2 (0, 1): expected = 16608, observed = 27876, FAIL
>>   thread 3 (1, 1): expected = 27877, observed = 27877, OK
>>
>
> Thanks. According to these results, it looks like LOOP_START_DX10 for inner
> loop somehow reactivates the threads that were put into inactive-break state
> by the LOOP_BREAK in the outer loop. Also it seems LOOP_BREAK in the inner
> loop doesn't work as expected in this case. In other words, it looks weird.
>
> I can't explain why would this happen. It might be interesting to run these
> tests with llvm backend to see if there are any differences.
>
> Probably it might help if we'll implement LOOP_BREAK via EXECUTE_MASK_OP in
> the PRED_SET encoding as in my earlier patch, but without any stack push/pop
> operations and jumps (where it's possible), closer to what the catalyst
> (shader analyzer) does. I'm not sure if it will help though, and anyway
> we'll need stack operations in some cases, so I'm afraid this won't fix the
> issue completely.
>
> So far I have no other ideas.
>
> Vadim
>
>> Marek
>>
>>
>> On Wed, Apr 10, 2013 at 11:42 PM, Vadim Girlin
>> <vadimgirlin at gmail.com>wrote:
>>
>>> On 04/10/2013 01:53 PM, Marek Olšák wrote:
>>>
>>>> glsl-fs-loop-nested passes here.
>>>>
>>>> nstack is 3 and adding 4 to it doesn't help.
>>>>
>>>
>>> Ok, thanks.
>>>
>>> Also I wrote a simple test app that should reproduce the issue if it's
>>> really related to diverging control flow with nested loops and might more
>>> information about what's going wrong.
>>>
>>> The source is in the attachment and needs to be compiled with -lGL -lglut
>>> -lGLEW. The app renders four points and computes some value for each
>>> point
>>> in the loops similar to the transform feedback order test, but it doesn't
>>> use tfb. It should render four green or red squares depending on
>>> correctness of the result.
>>>
>>> Here is the correct output produced for me on evergreen:
>>>
>>>   thread 0 (0, 0): expected = 16608, observed = 16608, OK
>>>   thread 1 (1, 0): expected = 27873, observed = 27873, OK
>>>   thread 2 (0, 1): expected = 16608, observed = 16608, OK
>>>   thread 3 (1, 1): expected = 27877, observed = 27877, OK
>>>
>>> Please post the output if it fails on cayman.
>>>
>>> Vadim
>>>
>>>
>>>
>>>> Marek
>>>>
>>>>
>>>> On Wed, Apr 10, 2013 at 8:46 AM, Vadim Girlin <vadimgirlin at gmail.com>
>>>> wrote:
>>>>
>>>>   On 04/10/2013 03:58 AM, Marek Olšák wrote:
>>>>>
>>>>>
>>>>>   Hi Vadim,
>>>>>>
>>>>>>
>>>>>> your patch does not fix the test.
>>>>>>
>>>>>>
>>>>> Hmm, I'm out of ideas then. Thanks for testing.
>>>>>
>>>>> I've checked the shader dump few times but I don't see anything
>>>>> obviously
>>>>> wrong there, and the same code (except the minor ALU grouping changes
>>>>> due
>>>>> to the VLIW4/VLIW5 difference) works fine for me on evergreen.
>>>>>
>>>>> According to the Martin's observations it looks like if the threads
>>>>> that
>>>>> shouldn't execute the loop body were incorrectly left in the active
>>>>> state.
>>>>> LOOP_BREAK should put them into the inactive-break state, but something
>>>>> goes wrong. Do the other piglit tests with nested loops (e.g.
>>>>> glsl-fs-loop-nested) work on cayman? Though possibly there are no other
>>>>> tests with the diverging loops as in this case.
>>>>>
>>>>> I'll try to write a simpler test with the diverging loops to see if the
>>>>> issue is really caused by the incorrect control flow handling, and to
>>>>> figure out the exact instruction that results in the incorrect active
>>>>> state.
>>>>>
>>>>> Also probably it worth checking if the stack size is correct for that
>>>>> shader (latest mesa should print nstack value in the shader disassemble
>>>>> header, I think it should be 3 for that shader) and maybe try adding
>>>>> some
>>>>> constant, e.g. 4 to the bc->nstack in the r600_bytecode_build just to
>>>>> be
>>>>> sure that we reserve enough of stack space, though I don't think stack
>>>>> size
>>>>> is the cause of this issue.
>>>>>
>>>>> Vadim
>>>>>
>>>>>
>>>>>
>>>>>   Marek
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 9, 2013 at 11:30 PM, Vadim Girlin <vadimgirlin at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>    On 04/09/2013 10:58 AM, Martin Andersson wrote:
>>>>>>
>>>>>>>
>>>>>>>    On Tue, Apr 9, 2013 at 3:18 AM, Marek Olšák <maraeo at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>    Pushed, thanks. The transform feedback test still doesn't pass,
>>>>>>>> but
>>>>>>>> at
>>>>>>>>
>>>>>>>>> least
>>>>>>>>> the hardlocks are gone.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   Thanks, I have looked into the other issue as well
>>>>>>>>
>>>>>>>>
>>>>>>>> http://lists.freedesktop.org/******archives/mesa-dev/2013-**March/**<http://lists.freedesktop.org/****archives/mesa-dev/2013-March/**>
>>>>>>>> **036941.html<http://lists.**freedesktop.org/**archives/**
>>>>>>>>
>>>>>>>> mesa-dev/2013-March/**036941.**html<http://lists.freedesktop.org/**archives/mesa-dev/2013-March/**036941.html>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> <http://lists.**freedesktop.**org/archives/mesa-**<http://freedesktop.org/archives/mesa-**>
>>>>>>>> dev/2013-March/036941.html<htt**p://lists.freedesktop.org/**
>>>>>>>>
>>>>>>>> archives/mesa-dev/2013-March/**036941.html<http://lists.freedesktop.org/archives/mesa-dev/2013-March/036941.html>
>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> The problem arises when there are nested loops. If I rework the code
>>>>>>>> so there are
>>>>>>>> no nested loops the issue disappears. At least one pixel also needs
>>>>>>>> to
>>>>>>>> enter the
>>>>>>>> outer loop. The pixels that should enter the outer loop behaves
>>>>>>>> correctly. It is those
>>>>>>>> pixels that should not enter the outer loop that misbehaves. It does
>>>>>>>> not matter if they
>>>>>>>> also fails the test for the inner loop, they will still execute the
>>>>>>>> instruction inside. That
>>>>>>>> leads to the strange results for that test.
>>>>>>>>
>>>>>>>>
>>>>>>>>   Please test the attached patch.
>>>>>>>
>>>>>>>
>>>>>>> Vadim
>>>>>>>
>>>>>>>
>>>>>>>    The strangeness is easier to see if the NUM_POINTS in the
>>>>>>>
>>>>>>>> ext_transform_feedback/
>>>>>>>> order.c are run with smaller values,like 3, 6 and 9. Disable the
>>>>>>>> code
>>>>>>>> that fail the test
>>>>>>>> and print starting_x, shift_reg_final and iteration_count.
>>>>>>>>
>>>>>>>> Marek, since you implemented transform feedback for r600, do you
>>>>>>>> think
>>>>>>>> the issue
>>>>>>>> is with the tranform feedback code or the shader compiler or some
>>>>>>>> other
>>>>>>>> thing?
>>>>>>>>
>>>>>>>> //Martin
>>>>>>>> ______________________________******_________________
>>>>>>>> mesa-dev mailing list
>>>>>>>> mesa-dev at lists.freedesktop.org
>>>>>>>>
>>>>>>>> http://lists.freedesktop.org/******mailman/listinfo/mesa-dev<http://lists.freedesktop.org/****mailman/listinfo/mesa-dev>
>>>>>>>>
>>>>>>>> <h**ttp://lists.freedesktop.org/****mailman/listinfo/mesa-dev<http://lists.freedesktop.org/**mailman/listinfo/mesa-dev>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> <htt**p://lists.freedesktop.**org/**mailman/listinfo/mesa-**dev<http://lists.freedesktop.org/**mailman/listinfo/mesa-dev>
>>>>>>>>
>>>>>>>> <http://lists.freedesktop.**org/mailman/listinfo/mesa-dev<http://lists.freedesktop.org/mailman/listinfo/mesa-dev>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   ______________________________****_________________
>>>>>>>
>>>>>>> mesa-dev mailing list
>>>>>>> mesa-dev at lists.freedesktop.org
>>>>>>>
>>>>>>> http://lists.freedesktop.org/****mailman/listinfo/mesa-dev<http://lists.freedesktop.org/**mailman/listinfo/mesa-dev>
>>>>>>>
>>>>>>> <htt**p://lists.freedesktop.org/**mailman/listinfo/mesa-dev<http://lists.freedesktop.org/mailman/listinfo/mesa-dev>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-r600g-HACK-Fix-nested-loops-on-Cayman.patch
Type: application/octet-stream
Size: 6584 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130412/cd225f2f/attachment.obj>


More information about the mesa-dev mailing list