[Mesa-dev] [Mesa3d-dev] ARB draw buffers + texenv program

Wed Apr 14 14:05:21 PDT 2010

Moving this to the new mesa-dev list...

Roland Scheidegger wrote:
> On 14.04.2010 00:38, Dave Airlie wrote:
>> On Wed, Apr 14, 2010 at 8:33 AM, Roland Scheidegger <sroland at vmware.com> wrote:
>>> On 13.04.2010 20:28, Alex Deucher wrote:
>>>> On Tue, Apr 13, 2010 at 2:21 PM, Corbin Simpson
>>>> <mostawesomedude at gmail.com> wrote:
>>>>> On Tue, Apr 13, 2010 at 6:42 AM, Roland Scheidegger <sroland at vmware.com> wrote:
>>>>>> On 13.04.2010 02:52, Dave Airlie wrote:
>>>>>>> On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul <brianp at vmware.com> wrote:
>>>>>>>> Dave Airlie wrote:
>>>>>>>>> Just going down the r300g piglit failures and noticed fbo-drawbuffers
>>>>>>>>> failed, I've no idea
>>>>>>>>> if this passes on Intel hw, but it appears the texenvprogram really
>>>>>>>>> needs to understand the
>>>>>>>>> draw buffers. The attached patch fixes it here for me on r300g anyone
>>>>>>>>> want to test this on Intel
>>>>>>>>> with the piglit test before/after?
>>>>>>>> The piglit test passes as-is with Mesa/swrast and NVIDIA.
>>>>>>>>
>>>>>>>> It fails with gallium/softpipe both with and w/out your patch.
>>>>>>>>
>>>>>>>> I think that your patch is on the right track.  But multiple render targets
>>>>>>>> are still a bit of an untested area in the st/mesa code.
>>>>>>>>
>>>>>>>> One thing: the patch introduces a dependency on buffer state in the
>>>>>>>> texenvprogram code so in state.c we should check for the _NEW_BUFFERS flag.
>>>>>>>>
>>>>>>>> Otherwise, I'd like to debug the softpipe failure a bit further to see
>>>>>>>> what's going on.  Perhaps you could hold off on committing this for a bit...
>>>>>>> Well Eric pointed out to me the fun line in the spec
>>>>>>>
>>>>>>> (3) Should gl_FragColor be aliased to gl_FragData[0]?
>>>>>>>
>>>>>>>       RESOLUTION: No.  A shader should write either gl_FragColor, or
>>>>>>>       gl_FragData[n], but not both.
>>>>>>>
>>>>>>>       Writing to gl_FragColor will write to all draw buffers specified
>>>>>>>       with DrawBuffersARB.
>>>>>>>
>>>>>>> So I was really just masking the issue with this. From what I can see
>>>>>>> softpipe messes up and I'm not sure where we should be fixing this.
>>>>>>> swrast does okay, its just whether we should be doing something in gallium
>>>>>>> or in the drivers is open.
>>>>>> Hmm yes looks like that's not really well defined. I guess there are
>>>>>> several options here:
>>>>>> 1) don't do anything at the state tracker level, and assume that if a
>>>>>> fragment shader only writes to color 0 but has several color buffers
>>>>>> bound the color is meant to go to all outputs. Looks like that's what
>>>>>> nv50 is doing today. If a shader writes to FragData[0] but not others,
>>>>>> in gallium that would mean that output still gets replicated to all
>>>>>> outputs, but since the spec says unwritten outputs are undefined that
>>>>>> would be just fine (for OpenGL - not sure about other APIs).
>>>>>> 2) Use some explicit means to distinguish FragData[] from FragColor in
>>>>>> gallium. For instance, could use different semantic name (like
>>>>>> TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective
>>>>>> outputs). Or could have a flag somewhere (not quite sure where) saying
>>>>>> if color output is to be replicated to all buffers.
>>>>>> 3) Translate away the single color output in state tracker to multiple
>>>>>> outputs.
>>>>>>
>>>>>> I don't like option 3) though. Means we need to recompile if the
>>>>>> attached buffers change. Moreover, it seems both new nvidia and AMD
>>>>>> chips (r600 has MULTIWRITE_ENABLE bit) handle this just fine in hw.
>>>>>> I don't like option 1) neither, that kind of implicit behavior might be
>>>>>> ok but this kind of guesswork isn't very nice imho.
>>>>> Whatever's easiest, just document it. I'd be cool with:
>>>>>
>>>>> DECL IN[0], COLOR, PERSPECTIVE
>>>>> DECL OUT[0], COLOR
>>>>> MOV OUT[0], IN[0]
>>>>> END
>>>>>
>>>>> Effectively being a write to all color buffers, however, this one from
>>>>> progs/tests/drawbuffers:
>>>>>
>>>>> DCL IN[0], COLOR, LINEAR
>>>>> DCL OUT[0], COLOR
>>>>> DCL OUT[1], COLOR[1]
>>>>> IMM FLT32 {     1.0000,     0.0000,     0.0000,     0.0000 }
>>>>>  0: MOV OUT[0], IN[0]
>>>>>  1: SUB OUT[1], IMM[0].xxxx, IN[0]
>>>>>  2: END
>>>>>
>>>>> Would then double-write the second color buffer. Unpleasant. Language
>>>>> like this would work, I suppose?
>>>>>
>>>>> """
>>>>> If only one color output is declared, writes to the color output shall
>>>>> be redirected to all bound color buffers. Otherwise, color outputs
>>>>> shall be bound to their specific color buffer.
>>>>> """
>>>> Also, keep in mind that writing to multiple color buffers uses
>>>> additional memory bandwidth, so for performance, we should only do so
>>>> when required.
>>> Do apps really have several color buffers bound but only write to one,
>>> leaving the state of the others undefined in the process? Sounds like a
>>> poor app to begin with to me.
>>> Actually, I would restrict that language above further, so only color
>>> output 0 will get redirected to all buffers if it's the only one
>>> written. As said though I'd think some explicit bits somewhere are
>>> cleaner. I'm not yet sure that the above would really work for all APIs,
>>> it is possible some say other buffers not written to are left as is
>>> instead of undefined.
>> Who knows, the GL API allows for it, I don't see how we can
>> arbitrarily decide to restrict it.
>>
>> I could write an app that uses multiple fragment programs, and
>> switches between them, with two outputs buffers bound, though I'm
>> possibly constructing something very arbitary.
> I fail to see the problem. If you have two color buffers bound but only
> write to one of them then the implementation is allowed to do anything
> it wants with the other one as far as I can tell.
> 
> 
>> The ARB_draw_buffers explicitly states that Data0 != Color.
> Yes. I wonder though are there other differences somewhere (I couldn't
> find any) that one gets replicated the other not?
> 
> Anyway, it looks like noone likes that implicit option.
> Hence let's make it explicit in gallium.
> Not quite sure how yet - this seems to be some sort of shader state. We
> could use new semantic for that special replicated output, or redefine
> the existing ones (use generic ones for data outputs and color only for
> the replicated one). Or maybe we should just make that a tgsi shader
> property like those for pixel centers?

Here's a key question: on NV and AMD with the "MULTIWRITE_ENABLE" 
flag, does that option perform faster than using N shader instructions 
to write to the N buffers?

If so, that would point to a new shader semantic.

Otherwise, if there's no gain, I think I'd rather solve the problem 
for all drivers in the state tracker.  We'd only have to generate a 
new fragment shader variant when the _number_ of attached color 
buffers changes, not when the buffer pointers change.  I think this is 
do-able.

-Brian