[Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

Thu Jul 2 09:40:05 PDT 2015

On 02/07/15 17:24, Ilia Mirkin wrote:
> On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
>> On 02/07/15 17:08, Ilia Mirkin wrote:
>>>
>>> On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca <jfonseca at vmware.com> wrote:
>>>>
>>>> On 02/07/15 16:34, Ilia Mirkin wrote:
>>>>>
>>>>>
>>>>> On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca <jfonseca at vmware.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On 01/07/15 22:30, bugzilla-daemon at freedesktop.org wrote:
>>>>>>> *Comment # 14
>>>>>>> <https://bugs.freedesktop.org/show_bug.cgi?id=91173#c14>
>>>>>>> on bug 91173 <https://bugs.freedesktop.org/show_bug.cgi?id=91173> from
>>>>>>> Ilia Mirkin <mailto:imirkin at alum.mit.edu> *
>>>>>>>
>>>>>>> Erm... ok...
>>>>>>>
>>>>>>> MOV R0.zw, c[A0.x + 9];
>>>>>>> MOV R1.x, c[0].w;
>>>>>>> ADD R0.x, c[A0.x + 9].y, R1;
>>>>>>> FLR R0.y, R0.x;
>>>>>>>
>>>>>>> vs
>>>>>>>
>>>>>>>       0: MAD TEMP[0].xy, IN[1], CONST[7].yyyy, CONST[7].xxxx
>>>>>>>       3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
>>>>>>>       7: FLR TEMP[0].y, CONST[0].wwww
>>>>>>>
>>>>>>> Could be that I'm matching the wrong shaders. But this seems highly
>>>>>>> suspect. Need to see if there's a good way of dumping mesa ir... I
>>>>>>> wonder if it doesn't notice the write-mask on the MOV R0.zw and thinks
>>>>>>> that R0 contains the value it wants.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nice detective work on this bug, Ilia.
>>>>>>
>>>>>>> Could be that I'm matching the wrong shaders.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I think it could be quite useful if there was a
>>>>>> "GL_MESAX_get_internal_representation" Mesa-specific extension to
>>>>>> extract a text representation of the currently bound GLSL, TGSI,
>>>>>> hardware-specific, etc., exclusively for debugging purposes.
>>>>>>
>>>>>> It doesn't even need to be advertised on non-debug builds of Mesa.  But
>>>>>> merely being able to see all the IRs next to each other at a given call
>>>>>> in a trace will probably save some time / grief for us developers in
>>>>>> similar situations.
>>>>>>
>>>>>>
>>>>>> I did something akin to this for NVIDIA proprietary drivers in
>>>>>>
>>>>>> https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
>>>>>>
>>>>>> but I don't think GetProgramBinary is appropriate for Mesa (only one
>>>>>> format).
>>>>>>
>>>>>>
>>>>>> Instead, for Mesa we could have something like
>>>>>>
>>>>>>       GLint n;
>>>>>>       GLint i;
>>>>>>       // this will trigger IRs being collected into an array internally
>>>>>>       glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);
>>>>>>
>>>>>>       for (i = 0; i < n; ++i) {
>>>>>>           GLint nameLength;
>>>>>>           char *name;
>>>>>>           GLint sourceLength;
>>>>>>           char *source;
>>>>>>           // first call queries the lengths, second call fetches the text
>>>>>>           glGetActiveInternalRepr(&nameLength, NULL, &sourceLength, NULL);
>>>>>>           name = malloc(nameLength);
>>>>>>           source = malloc(sourceLength);
>>>>>>           glGetActiveInternalRepr(NULL, name, NULL, source);
>>>>>>       }
>>>>>>
>>>>>> And this would need to be plumbed all the way through the drivers;
>>>>>> each layer would advertise additional IRs.
>>>>>>
>>>>>> And the information here would only be obtainable/valid immediately
>>>>>> after a draw call.
>>>>>>
>>>>>>
>>>>>> A completely different tack is that apitrace's glretrace would
>>>>>> advertise a unique environment variable (e.g., MESA_IR_DUMP_ALL=fd),
>>>>>> and all drivers/layers would write shader representations, and when
>>>>>> they are bound/unbound/destroyed, in a pre-established format:
>>>>>>
>>>>>> CREATE "GLSL/123"
>>>>>> ...
>>>>>> EOF
>>>>>>
>>>>>> CREATE TGSI/456
>>>>>> EOF
>>>>>>
>>>>>> BIND GLSL/123
>>>>>> BIND TGSI/456
>>>>>> BIND HW/789
>>>>>>
>>>>>> UNBIND GLSL/123
>>>>>> UNBIND TGSI/456
>>>>>> UNBIND HW/789
>>>>>>
>>>>>> DESTROY GLSL/123
>>>>>> DESTROY TGSI/456
>>>>>> DESTROY HW/789
>>>>>>
>>>>>>
>>>>>> I don't feel strongly either way, but I suspect that having a proper
>>>>>> extension, even if a little more work at the start, will be more robust
>>>>>> in the long term.  And less runtime overhead.  GL extensions also give
>>>>>> a mechanism to revise/deprecate this functionality in the future.
>>>>>
>>>>>
>>>>>
>>>>> This would still require fairly extensive changes as you'd have to
>>>>> track all the bindings together.
>>>>
>>>>
>>>>
>>>> Really? I don't think so.  Which alternative are you referring to?
>>>
>>>
>>> The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
>>> TGSI/HW as it might change based on other states. By the time you get
>>> it sufficiently robust, you might as well do the GL extension.
>>>
>>>>
>>>> Yet another option would be to provide a callback
>>>>
>>>>     typedef void (*GLircallbackMESA)(const char *name, const char *body);
>>>>
>>>>     void glGetActiveInternalReprMesa(GLircallbackMESA callback);
>>>>
>>>> and basically each layer would dump the IRs, and invoke the downstream
>>>> layers with the same callback.
>>>
>>>
>>> What "name" would the driver supply here? And how would you link
>>> things up together?
>>
>>
>> Taking llvmpipe as an example, since I'm more familiar with it:
>>
>>   - src/mesa/state_tracker would invoke with "state_tracker/tgsi/{vs,fs}"
>>     and "glsl-ir/{vs,fs}"
>>   - and invoke pipe_context::get_active_ir (callback) if the pipe driver
>>     implements it
>>   - src/gallium/drivers/llvmpipe would invoke with
>>     - "llvmpipe/tgsi/{vs,fs}" (which might differ from the state tracker
>>       due to the draw module)
>>     - "llvmpipe/llvm/{vs,fs,setup}_{full,partial}"
>>     - and maybe even "llvmpipe/x86/{vs,fs}"
>>
>> The idea is that this glGetActiveInternalReprMesa() call dumps what's active
>> _now_, which only makes sense immediately after draw calls. So the only
>> thing the drivers need to do is dump what they see bound.
>
> Ah OK. So I guess tilers will have to disable their render queues for
> this one. Which seems like a reasonable trade-off...

I don't see why.

This is a purely SW query. So I don't see why the HW needs to see any 
difference.

That said, glretrace already does glReadPixels when dumping state, so
one way or the other, when inspecting state in qapitrace, everything
will be flushed and synced.

Jose
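
A rough sketch of how the callback variant discussed above might chain
through the layers. None of this exists in Mesa: the mock_ entry point,
the two layer functions, and the placeholder IR bodies are hypothetical
stand-ins, and only the GLircallbackMESA typedef and the name strings are
taken from the thread. Each layer reports the IRs it currently sees bound,
then hands the same callback to the layer below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Callback type as proposed in the thread. */
typedef void (*GLircallbackMESA)(const char *name, const char *body);

/* Hypothetical stand-in for a pipe driver (llvmpipe in the example). */
static void driver_get_active_ir(GLircallbackMESA callback)
{
    callback("llvmpipe/tgsi/vs", "(placeholder TGSI as seen by the driver)");
    callback("llvmpipe/llvm/vs_full", "(placeholder LLVM IR)");
}

/* Hypothetical stand-in for the state tracker: it reports its own IRs,
 * then forwards the same callback to the driver below it. */
static void state_tracker_get_active_ir(GLircallbackMESA callback)
{
    callback("glsl-ir/vs", "(placeholder GLSL IR)");
    callback("state_tracker/tgsi/vs", "(placeholder TGSI)");
    driver_get_active_ir(callback);
}

/* Mock of the proposed entry point; a real one would live in the GL
 * dispatch layer and only be meaningful right after a draw call. */
void mock_glGetActiveInternalReprMesa(GLircallbackMESA callback)
{
    state_tracker_get_active_ir(callback);
}

/* Example consumer: the proposed callback has no user-data argument,
 * so this sketch records results in file-scope storage. */
#define MAX_IRS 8
static int ir_count;
static char ir_names[MAX_IRS][64];

static void record_ir(const char *name, const char *body)
{
    assert(name != NULL && body != NULL);
    if (ir_count < MAX_IRS) {
        snprintf(ir_names[ir_count], sizeof ir_names[0], "%s", name);
        ir_count++;
    }
    printf("=== %s ===\n%s\n", name, body);
}
```

A tool such as glretrace would call
mock_glGetActiveInternalReprMesa(record_ir) right after a draw call and
get the IRs in top-down layer order. Because the callback as proposed
carries no user-data pointer, the consumer here falls back on globals.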