[Mesa-dev] [RFC] [BRANCH] Floating point textures and rendering for Mesa, softpipe and llvmpipe

Fri Aug 27 05:49:50 PDT 2010

I created a new branch called "floating" which includes an apparently
successful attempt at full support of floating-point textures and
render targets.

I believe it is fundamentally correct, but should be considered a
prototype and almost surely contains some oversights.

Specifically, the following is included:
1. GL_ARB_half_float_pixel is advertised
2. Gallium luminance/alpha/intensity floating point formats are added
3. GL_ARB_texture_float is finished and renamed from GL_MESAX_texture_float
4. Colors supplied to the GL are no longer clamped when set, but are
still clamped when queried
5. Fragment, vertex and read color (non-)clamping is implemented
6. GL_ARB_color_buffer_float is implemented and advertised (this is
the tricky one)
7. Vertex and fragment clamping is pushed through Gallium using a
temporary interface
8. draw and draw_llvm get proper vertex color clamping support
9. For softpipe, blending is reworked to work with both fixed and
floating point targets, and fragment clamping is properly supported
10. llvmpipe is converted to use float tiles instead of uint8_t tiles
11. For llvmpipe, blending is reworked to work with both fixed and
floating point targets, and fragment clamping is properly supported

The Mesa parts should be the right approach, but might have minor issues.
The softpipe code should also be good.
The gallium interface is instead provisional and the llvmpipe code
shouldn't be merged in the current form.

As an immediate result, Unigine demos that failed due to framebuffer
incompleteness or lack of fp extensions now work.
However, softpipe is unusable with them and llvmpipe is also slow and
very glitchy.
Furthermore, shader compilation takes a lot of time: expect loading to
take 5-10 minutes even on a fast machine, and possibly much more on
slower machines.
Unigine Tropics seems to work best, followed by Unigine Sanctuary.
Unigine Heaven works too but with really heavy glitches.

More details follow.

* How to test this

- floattex in mesa-demos
- ARB_color_buffer_float testsuite on the piglit mailing list
- Philip Rideout's bloom/hdr demo at http://prideout.net/archive/bloom/#HDR
- Stephane Metz's "HDR in OpenGL" at http://www.smetz.fr/?page_id=83
- Unigine demos
- Games with HDR support

* ARB_half_float_pixel

This one just requires support for half float datatypes in user
memory, which we already had.
Hence, it is simply turned on in mesa/st.
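
For readers unfamiliar with the half float datatype, here is a minimal
sketch of decoding one 16-bit half into a 32-bit float. It is not the
code Mesa uses; the name half_to_float is made up for this
illustration, and IEEE 754 binary16/binary32 layouts are assumed.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative only: decode an IEEE 754 binary16 value to a float. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp  = (h >> 10) & 0x1f;   /* 5-bit exponent, bias 15 */
    uint32_t man  = h & 0x3ff;          /* 10-bit mantissa */
    uint32_t bits;
    float f;

    if (exp == 0x1f) {
        /* Infinity or NaN: all-ones exponent in the float as well. */
        bits = sign | 0x7f800000u | (man << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112) << 23) | (man << 13);
    } else if (man != 0) {
        /* Denormal half: normalize it, since it fits a normal float. */
        uint32_t shift = 0;
        while (!(man & 0x400)) {
            man <<= 1;
            shift++;
        }
        man &= 0x3ff;                   /* drop the now-implicit bit */
        bits = sign | ((113 - shift) << 23) | (man << 13);
    } else {
        /* Signed zero. */
        bits = sign;
    }

    memcpy(&f, &bits, sizeof f);        /* reinterpret bits as float */
    return f;
}
```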

* ARB_texture_float

Currently Mesa implements MESAX_texture_float, which as far as I can
tell conforms to ARB_texture_float, except for the fact that we did
not clamp texels sampled from the fixed function pipeline.
Hence, I added that and renamed the extension to the ARB name (I doubt
anyone relied exclusively on the MESAX_* name).

Support in mesa/st was missing though. To add it, I added new
luminance/alpha/intensity formats to Gallium and converted the Mesa
internal formats to/from them.

It is then enabled if any floating point formats are supported.

* ARB_color_buffer_float

This is the hard part, and despite the name doesn't actually require
floating-point support.

The first part of the extension adds GLX support for floating point
formats (as GLX_ARB_fbconfig_float): this is currently ignored, since
I suspect no one cares about GLX pbuffers.
Support for computing floatMode from FBOs and the RGBA_FLOAT_MODE_ARB
query for it is added.

The main part of this extension is however the clamping controls.
Before this extension, many parts of the GL are specified to clamp
colors: obviously, this is undesirable for floating-point rendering.

Hence, three clamping controls are added by the extension:
- Vertex color clamping, which is equivalent to adding a saturate in
any instruction that writes to COLOR or BCOLOR in the vertex shader
- Fragment color clamping, which is equivalent to adding a saturate in
any instruction that writes to COLOR in the fragment shader, but has
many other implications
- Read color clamping, which controls whether ReadPixels and friends
(but not GetTexImage) clamp the returned colors.

Fragment color clamping has other very broad implications.
First of all, it controls computation within fixed function fragment
shading, which is handled in the texenv programs.
Second, it controls whether the results of queries for colors are
clamped. This is because the extension specifies that functions like
glClearColor no longer clamp the inputs. For compatibility, the
extension still clamps the queries, unless fragment clamping is
disabled.
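
The set/query asymmetry described above can be sketched as follows.
This is only an illustration, not the code in the branch: the struct
color_state and the set_clear_color/get_clear_color helpers are
hypothetical stand-ins for the GL context and the ClearColor
set/query paths.

```c
#include <stdbool.h>

/* Clamp a value to the [0, 1] range. */
static float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Hypothetical slice of context state. */
struct color_state {
    float clear_color[4];
    bool fragment_clamp;    /* resolved fragment clamping state */
};

/* Setting the color stores it unclamped. */
static void set_clear_color(struct color_state *s,
                            float r, float g, float b, float a)
{
    s->clear_color[0] = r;
    s->clear_color[1] = g;
    s->clear_color[2] = b;
    s->clear_color[3] = a;
}

/* Querying clamps only while fragment clamping is enabled. */
static void get_clear_color(const struct color_state *s, float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = s->fragment_clamp ? clamp01(s->clear_color[i])
                                   : s->clear_color[i];
}
```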

These clamping controls were already present but essentially unused
and ClampColorARB was not exposed.
These controls now set _NEW_* flags, and derived state is computed to
resolve FIXED_ONLY_ARB to TRUE or FALSE.
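
A minimal sketch of what resolving FIXED_ONLY_ARB means. The enum and
function names here are invented for illustration and are not the
identifiers used in the branch. It assumes a single "framebuffer is
floating point" flag, which sidesteps the mixed MRT case.

```c
#include <stdbool.h>

/* Hypothetical mirror of the three values ClampColorARB accepts. */
enum clamp_mode {
    CLAMP_FALSE,
    CLAMP_TRUE,
    CLAMP_FIXED_ONLY
};

/* Resolve the user-visible mode to a plain on/off derived state. */
static bool resolve_clamp(enum clamp_mode mode, bool framebuffer_is_float)
{
    switch (mode) {
    case CLAMP_TRUE:
        return true;
    case CLAMP_FALSE:
        return false;
    case CLAMP_FIXED_ONLY:
        /* Clamp only when rendering to fixed point buffers. */
        return !framebuffer_is_float;
    }
    return true; /* unreachable for valid modes; clamp to be safe */
}
```

The derived boolean has to be recomputed whenever either the clamp
mode or the bound framebuffer changes, which is why dirty flags are
needed for both.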

Further, blending is clarified.
In particular, if the specific destination buffer is floating point,
blending clamps nothing; otherwise, the blend source, destination
input, factors and blend results are all clamped.
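
The blending rule above, for one channel and a simple
"src * src_factor + dst * dst_factor" equation, can be sketched like
this. It is an illustration under those assumptions, not the softpipe
or llvmpipe code, and blend_channel is a made-up name.

```c
#include <stdbool.h>

/* Clamp a value to the [0, 1] range. */
static float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Blend one channel; clamping depends only on the destination type. */
static float blend_channel(float src, float dst,
                           float src_factor, float dst_factor,
                           bool dst_is_float)
{
    float result;

    if (!dst_is_float) {
        /* Fixed point destination: clamp every input to the blend. */
        src = clamp01(src);
        dst = clamp01(dst);
        src_factor = clamp01(src_factor);
        dst_factor = clamp01(dst_factor);
    }

    result = src * src_factor + dst * dst_factor;

    /* Fixed point destination: clamp the result too. */
    return dst_is_float ? result : clamp01(result);
}
```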

A naive look may make one think that since blending already clamps the
source, fragment clamping has no effect with fixed point destination
buffers.
This is not correct however, because polygon smoothing coverage
multiplication and the alpha test happen between fragment clamping and
either source clamping by blend or clamping due to writing if blending
is disabled.
However, vertex and fragment clamping are indeed totally equivalent to
setting the saturate bit on shader output instructions for
COLOR/BCOLOR semantics.
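
A small worked example of why fragment clamping still matters with a
fixed point destination. The helpers below are hypothetical and model
only the alpha channel: the first applies coverage and then the write
to a fixed point buffer, the second applies a GREATER alpha test.

```c
#include <stdbool.h>

/* Clamp a value to the [0, 1] range. */
static float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Alpha as it ends up in a fixed point buffer with blending disabled. */
static float write_fixed_alpha(float frag_alpha, float coverage,
                               bool fragment_clamp)
{
    if (fragment_clamp)
        frag_alpha = clamp01(frag_alpha);  /* fragment clamping */
    frag_alpha *= coverage;                /* polygon smooth coverage */
    return clamp01(frag_alpha);            /* clamp on write */
}

/* Alpha test with the GREATER function against a reference value. */
static bool alpha_test_greater(float frag_alpha, float ref,
                               bool fragment_clamp)
{
    if (fragment_clamp)
        frag_alpha = clamp01(frag_alpha);
    return frag_alpha > ref;
}
```

With a shader alpha of 1.5 and a coverage of 0.5, the written alpha is
0.5 with fragment clamping and 0.75 without it. Likewise, an alpha of
1.5 passes a GREATER test against 1.0 only when fragment clamping is
disabled.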

The specification is unclear on whether clamping before fog
application in fragment programs using ARB_fog_* is now controlled by
fragment clamping or not.
Currently, and even with this patchset, we always clamp it.
If this is not correct, we will have to keep two versions of
ARB_fragment_program programs, with and without such clamping.

* Gallium and ARB_color_buffer_float

The current Gallium interface for ARB_color_buffer_float is temporary
and I'm not sure what the best choice is.
Basically, we need to push the fragment and vertex clamping bits.
Also, we need to clarify whether clamping happens or not, and how
blending is done.
Right now, I think clamping may actually be broken since softpipe and
llvmpipe never clamp colors, and mesa/st doesn't seem to provide any
clamping bits.

This patchset adds two booleans in the rasterizer state for clamping.
This is conceptually wrong though, because vertex clamping happens
before the geometry shader, while fragment clamping happens after the
fragment shader, and thus they don't belong in rasterization.

Options are:
1. Put the bits in some CSOs
2. Add separate functions to set them
3. Have the Mesa generic code keep two shader versions, and hack the
saturate bit on output instructions
4. Have the Gallium state tracker keep two shader versions, and hack
the saturate bit on output instructions
5. Add linkage CSOs and put a color clamping bit there
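
To make options 3 and 4 concrete, here is a rough sketch of deriving a
clamped shader variant by setting the saturate bit on color outputs.
The instruction representation is entirely hypothetical and much
simpler than Mesa's or TGSI's; it only shows the shape of the
transformation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical output semantics an instruction may write to. */
enum out_semantic {
    SEM_POSITION,
    SEM_COLOR,
    SEM_BCOLOR,
    SEM_GENERIC
};

/* Hypothetical, heavily simplified output-writing instruction. */
struct instr {
    enum out_semantic dst;
    bool saturate;
};

/*
 * Copy n instructions from src to dst, setting the saturate bit on
 * every write to COLOR or BCOLOR. The original is left untouched so
 * both variants can be kept. Returns the number of instructions
 * whose saturate bit was newly set.
 */
static size_t make_clamped_variant(const struct instr *src,
                                   struct instr *dst, size_t n)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
        if ((src[i].dst == SEM_COLOR || src[i].dst == SEM_BCOLOR) &&
            !src[i].saturate) {
            dst[i].saturate = true;
            changed++;
        }
    }
    return changed;
}
```

The cost of this approach is that each shader may need to be compiled
twice, once per clamping state.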

Input on what you think the best choice is would be greatly appreciated.

Note that if it is decided that ARB_fog_* alters clamping, then we
either have to keep multiple shader versions, or add fog support to
Gallium, so that the separate clamping controls happen before it.

All Radeons including r600 and nVidia cards before nv50 have fixed
function fog support, so it might make sense to add fog to Gallium
(the comment in the Mesa source that "no hardware wants to use fixed
function fog" is blatantly wrong).

In addition, we change Gallium so that color inputs like blend and
clear colors are now never clamped.
Blending is clarified to behave as specified by OpenGL (hopefully
Direct3D requires the same).

Also, two cap bits are added, and if they are not supported, color
clamping is supposed to be *always* performed (this may or may not be
a change from the current interface).

* Draw

Draw and draw_llvm have been changed to perform vertex color clamping
if specified.
Previously they never did so, which was broken.

The SSE and PPC paths are now permanently disabled, until someone
either adds vertex clamping there, or removes them from the source
tree.

* Softpipe

Softpipe has been changed to blend properly with appropriate clamping.
Also, it now properly clamps fragment color: it was broken before
because it never did so.

* LLVMpipe

LLVMpipe has the fundamental problem that it uses an intermediate
tiled format with byte components for all surface formats.

To resolve this, this patchset switches it to use float components for
all formats.
Note that this is still a bad choice: the surface's own format should
be used if it is natively supported by the CPU, and otherwise a
suitable more precise format should be chosen.
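
The reason byte tiles cannot work for floating point surfaces can be
sketched with a round trip through an 8-bit normalized component. The
conversion helpers are written for this illustration and are not the
llvmpipe tile code.

```c
#include <stdint.h>

/* Clamp a value to the [0, 1] range. */
static float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Store a float color component in an 8-bit normalized tile entry. */
static uint8_t float_to_unorm8(float x)
{
    /* Out-of-range values cannot be represented, so they are clamped. */
    return (uint8_t)(clamp01(x) * 255.0f + 0.5f);
}

/* Load an 8-bit normalized tile entry back as a float. */
static float unorm8_to_float(uint8_t v)
{
    return (float)v / 255.0f;
}
```

A value such as 1.5 comes back from the tile as 1.0, and negative
values come back as 0.0, so anything outside [0, 1] written to a float
render target is lost before it reaches the surface.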

After this drastic change, the modifications imitate the changes to softpipe.

Note that this code must not be merged as is, since using floats will
likely degrade performance and memory usage unacceptably.
The tile code should instead be rewritten to support all possible tile formats.

Also, while this patch blends in floating point, it should blend in
the destination format, since that gets correct clamping for free and
is faster, since the destination format is often smaller.

* Unsolved stuff

- Unclear specification about fragment program fog
- We assume that fragment clamping with FIXED_ONLY does *not* change
depending on the color buffer in MRT cases, if there are fixed point
and floating point buffers. The specification is contradictory on
this.
- Do display lists need anything special?
- Do we want/need to add GLX support?

* Merging this branch

To merge this, the following would be necessary:
1. Is there some way to regenerate glapi without creating a massively huge diff?
2. Decide on a Gallium interface and implement it
3. Review the patchset and make sure it actually conforms to the
specification. Due to the trickiness of it, I almost surely made
mistakes.
4. Rewrite the tiling code in llvmpipe to support arbitrary formats