[Mesa-dev] a newbie asking newbie questions

Tue Sep 17 05:13:55 PDT 2013

Hello,

Thank you for the very fast answers. Some more questions:

> It's not a preference question.  The registers are 8 floats wide.
> Vertex shaders get invoked 2 vertices at a time, with a register containing these values:
>
> .   +------+------+------+------+------+------+------+------+
> .   | v0.x | v0.y | v0.z | v0.w | v1.x | v1.y | v1.z | v1.w |
> .   +------+------+------+------+------+------+------+------+

This seems best to me: run two vertices in each invocation, in the hope that the
shader compiler will merge (multiple) float, vec2 and maybe even vec3 operations into
vec4 operations. Does it?

> while these 8 pixels in screen space:
> 
> .   +----+----+----+----+
> .   | p0 | p1 | p2 | p3 |
> .   +----+----+----+----+
> .   | p4 | p5 | p6 | p7 |
> .   +----+----+----+----+
>
> are loaded in fragment shader registers as:
>
> .   +------+------+------+------+------+------+------+------+
> .   | p0.x | p1.x | p4.x | p5.x | p2.x | p3.x | p6.x | p7.x |
> .   +------+------+------+------+------+------+------+------+
>
> Note how one register just holds a single channel ('.x' here) of a vector.  A vec4 would take up 4 registers, and to do value0.xyzw * value1.xyzw, you'd emit 4 MULs.

This is exactly what I was trying to ask about the fragment shader: n fragments are processed with one n-wide SIMD instruction (for i965, n=8).
Sigh, my e-mail communication leaves something to be desired.
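
To check my understanding of the layout described above, here is a minimal C sketch. The names in it (reg8, vec4_simd8, mul8, vec4_mul, lane_to_pixel) are made up for illustration and are not Mesa or hardware names. Each reg8 models one 8-float register holding a single channel for 8 fragments, so a vec4 multiply becomes four lane-wise MULs:

```c
#define SIMD_WIDTH 8

/* One register in this sketch: 8 floats, one per fragment. */
typedef struct { float lane[SIMD_WIDTH]; } reg8;

/* A vec4 per fragment, stored one channel per register: 4 registers. */
typedef struct { reg8 x, y, z, w; } vec4_simd8;

/* Lane index -> pixel index, copied from the quoted diagram:
 * lanes hold p0 p1 p4 p5 p2 p3 p6 p7. */
static const int lane_to_pixel[SIMD_WIDTH] = { 0, 1, 4, 5, 2, 3, 6, 7 };

/* Models a single 8-wide MUL: one multiply per lane. */
static reg8 mul8(reg8 a, reg8 b)
{
    reg8 r;
    for (int i = 0; i < SIMD_WIDTH; i++)
        r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

/* value0.xyzw * value1.xyzw for 8 fragments: four 8-wide MULs. */
static vec4_simd8 vec4_mul(vec4_simd8 a, vec4_simd8 b)
{
    vec4_simd8 r;
    r.x = mul8(a.x, b.x);
    r.y = mul8(a.y, b.y);
    r.z = mul8(a.z, b.z);
    r.w = mul8(a.w, b.w);
    return r;
}
```

In this model a scalar float operation costs the same single 8-wide instruction as one channel of a vec4 operation, which is how I read the explanation.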
Some questions:
 1) Do the fragments need to be in a 4x2 block, or can they be two separate 2x2 blocks?
 2) For tiny triangles, with fragment shaders that do not require dFdx, dFdy or fwidth, can the fragments be totally scattered?

Along the same lines, for non-dependent texture lookups, are there code paths where the derivatives are computed
analytically, so that selecting the correct LOD does not require processing fragments in 2x2 (or larger) blocks? Or does
the i965 hardware sampler interface not allow this kind of madness?
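
To make the question concrete, here is a small C sketch of what I mean. It assumes the texture coordinate is an affine function of window position (s = a*x + b*y + c, so no perspective correction), and every name in it is hypothetical. Under that assumption the finite differences taken inside a 2x2 block and the "analytic" derivatives (just the plane coefficients) agree, so in principle the 2x2 block would not be needed for such lookups:

```c
/* Hypothetical plane equation for one interpolated coordinate:
 * s(x, y) = a*x + b*y + c. */
typedef struct { float a, b, c; } plane_eq;

/* Values of s at the four fragments of a 2x2 block:
 * top-left, top-right, bottom-left, bottom-right. */
typedef struct { float tl, tr, bl, br; } quad;

static float plane_eval(plane_eq p, float x, float y)
{
    return p.a * x + p.b * y + p.c;
}

/* Evaluate s for the 2x2 block whose top-left fragment is (x0, y0). */
static quad quad_eval(plane_eq p, float x0, float y0)
{
    quad q;
    q.tl = plane_eval(p, x0,        y0);
    q.tr = plane_eval(p, x0 + 1.0f, y0);
    q.bl = plane_eval(p, x0,        y0 + 1.0f);
    q.br = plane_eval(p, x0 + 1.0f, y0 + 1.0f);
    return q;
}

/* Finite differences: these need the neighbouring fragments. */
static float quad_dfdx(quad q) { return q.tr - q.tl; }
static float quad_dfdy(quad q) { return q.bl - q.tl; }

/* Analytic derivatives: for an affine s they are the coefficients,
 * so no neighbouring fragments are required. */
static float analytic_dfdx(plane_eq p) { return p.a; }
static float analytic_dfdy(plane_eq p) { return p.b; }
```

With perspective-correct interpolation s is no longer affine, so the two would only agree approximately; whether the hardware sampler accepts explicit derivatives at all is exactly what I am asking.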

>> On a related note, where are the beans about the dispatch table?
> I don't know this one (or particularly what you're asking, I guess).

In docs/index.html, under "Developer Topics --> GL Dispatch" in the side panel, there is text (broken into the sections "1. Complexity of GL Dispatch", "2. Overview of Mesa's Implementation" and "3. Optimizations") describing how different GL contexts for the same hardware can do different things for the same GL function, and that Mesa has stubs which in turn call the "real" function. The document goes on to discuss various ways the function tables are filled and accessed across separate threads. My questions are:
 0) Is that text still accurate? In particular, the directory src/glapi is gone from Mesa (at least in what I git cloned), and I thought that was where the dispatch code lived.
 1) Where and how does the i965 driver fill that table, if it exists?

Along similar lines, I see that some of the code in src/mesa/main performs checks on various API calls and at times has conditions that depend on the context type. That seems to contradict the idea that different contexts have different dispatch tables [sort of, since the table entries might just be the driver magic, whereas the stub validates and then calls the driver magic].
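
Here is a rough C sketch of the two styles I am contrasting, so it is clear what I am asking. None of these names come from Mesa; the structs, the LegacyCall entry point and the context kinds are all invented for illustration. In the first style each context kind gets its own table entry; in the second, one shared entry checks the context type when it is called:

```c
/* Hypothetical context kinds; not Mesa's actual enum. */
enum ctx_kind { CTX_COMPAT, CTX_CORE };

struct context;

/* Hypothetical dispatch table: one function pointer per entry point. */
struct dispatch_table {
    int (*LegacyCall)(struct context *ctx, unsigned arg);
};

struct context {
    enum ctx_kind kind;
    const struct dispatch_table *dispatch;
    unsigned driver_calls;   /* how many calls reached the "driver" */
};

/* Current context; a real implementation would keep this per thread. */
static struct context *current_ctx;

/* Stands in for the hardware-specific "driver magic". */
static void driver_legacy_call(struct context *ctx, unsigned arg)
{
    (void)arg;
    ctx->driver_calls++;
}

/* Style 1: a different table entry per context kind. */
static int legacy_call_compat(struct context *ctx, unsigned arg)
{
    driver_legacy_call(ctx, arg);
    return 0;
}

static int legacy_call_core(struct context *ctx, unsigned arg)
{
    (void)ctx; (void)arg;
    return -1;               /* entry point not available here */
}

/* Style 2: one shared entry that checks the context type at call time. */
static int legacy_call_shared(struct context *ctx, unsigned arg)
{
    if (ctx->kind == CTX_CORE)
        return -1;
    driver_legacy_call(ctx, arg);
    return 0;
}

static const struct dispatch_table compat_table = { legacy_call_compat };
static const struct dispatch_table core_table   = { legacy_call_core };
static const struct dispatch_table shared_table = { legacy_call_shared };

/* Public stub: forwards through the current context's table. */
static int sketch_LegacyCall(unsigned arg)
{
    return current_ctx->dispatch->LegacyCall(current_ctx, arg);
}
```

Both styles give the same visible behaviour, which is why I am unsure which one (or what mix of the two) the current code actually uses.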

-Kevin
---------------------------------------------------------------------
Intel Finland Oy
Registered Address: PL 281, 00181 Helsinki 
Business Identity Code: 0357606 - 4 
Domiciled in Helsinki 
