[Mesa-dev] [PATCH 5/6] mesa: Implement a new GL_MESA_tile_raster_order extension.
Nicolai Hähnle
nhaehnle at gmail.com
Sat Jul 29 07:42:54 UTC 2017
On 28.07.2017 18:26, Eric Anholt wrote:
> Nicolai Hähnle <nhaehnle at gmail.com> writes:
>
>> On 28.07.2017 04:30, Eric Anholt wrote:
>>> The intent is to use this extension on vc4 to allow X11 to do overlapping
>>> CopyArea() within a pixmap without first blitting the pixmap to a
>>> temporary. With associated glamor patches, improves x11perf
>>> -copywinwin100 performance on a Raspberry Pi 3 from ~4700/sec to
>>> ~5130/sec, and is an even larger boost to uncomposited window movement
>>> performance (most copywinwin100 copies don't overlap).
>>> ---
>>> docs/specs/MESA_tile_raster_order.spec | 101 ++++++++++++++++++++++++++
>>> docs/specs/enums.txt | 5 ++
>>> include/GL/glext.h | 6 ++
>>> src/mapi/glapi/gen/MESA_tile_raster_order.xml | 13 ++++
>>> src/mapi/glapi/gen/Makefile.am | 1 +
>>> src/mesa/main/context.c | 2 +
>>> src/mesa/main/enable.c | 27 +++++++
>>> src/mesa/main/extensions_table.h | 1 +
>>> src/mesa/main/mtypes.h | 13 ++++
>>> src/mesa/state_tracker/st_atom_rasterizer.c | 5 ++
>>> src/mesa/state_tracker/st_context.c | 1 +
>>> src/mesa/state_tracker/st_extensions.c | 1 +
>>> 12 files changed, 176 insertions(+)
>>> create mode 100644 docs/specs/MESA_tile_raster_order.spec
>>> create mode 100644 src/mapi/glapi/gen/MESA_tile_raster_order.xml
>>>
>>> diff --git a/docs/specs/MESA_tile_raster_order.spec b/docs/specs/MESA_tile_raster_order.spec
>>> new file mode 100644
>>> index 000000000000..b6362fc7e5a6
>>> --- /dev/null
>>> +++ b/docs/specs/MESA_tile_raster_order.spec
>>> @@ -0,0 +1,101 @@
>>> +Name
>>> +
>>> + MESA_tile_raster_order
>>> +
>>> +Name Strings
>>> +
>>> + GL_MESA_tile_raster_order
>>> +
>>> +Contact
>>> +
>>> + Eric Anholt, Broadcom (eric at anholt.net)
>>> +
>>> +Status
>>> +
>>> + Proposal
>>> +
>>> +Version
>>> +
>>> + Last modified date: 26 July 2017
>>> +
>>> +Number
>>> +
>>> + TBD
>>> +
>>> +Dependencies
>>> +
>>> + GL_ARB_texture_barrier is required
>>> +
>>> +Overview
>>> +
>>> + This extension extends the sampling-from-the-framebuffer behavior provided
>>> + by GL_ARB_texture_barrier to allow setting the rasterization order of the
>>> + scene, so that overlapping blits can be implemented. This can be used for
>>> + scrolling or window movement within in 2D scenes, without first copying to
>>> + a temporary.
>>> +
>>> +IP Status
>>> +
>>> + None
>>> +
>>> +Issues
>>> +
>>> + 1. Should this extension also affect BlitFramebuffer?
>>> +
>>> + NOT RESOLVED: BlitFramebuffer could use the same underlying
>>> + functionality to provide defined results for 1:1 overlapping blits,
>>> + but one could use the coordinates being copied to just produce the
>>> + right result automatically, rather than requiring the state flags to
>>> + be adjusted.
>>> +
>>> +New Procedures and Functions
>>> +
>>> + None
>>> +
>>> +New Tokens
>>> +
>>> + None
>>> +
>>> +Additions to Chapter 9 of the OpenGL 4.4 Specification (Per-Fragment
>>> +Operations and the Frame Buffer)
>>> +
>>> + Modify Section 9.3.1 Rendering Feedback Loops, p. 289
>>> +
>>> + Add the following exception to the list after "There is only a single read
>>> + and write...":
>>> +
>>> + - TILE_RASTER_ORDER_FIXED_MESA is enabled, and reads are only performed
>>> + from texels offset from the current fragment shader invocation in the
>>> + direction specified by TILE_RASTER_ORDER_INCREASING_X_MESA and
>>> + TILE_RASTER_ORDER_INCREASING_Y_MESA (where those being disabled mean
>>> + negative texel offsets), e.g. using "texelFetch2D(sampler,
>>> + ivec2(gl_FragCoord.xy + vec2(dx, dy)), 0);".
>>
>> I don't think we can implement any of this on our hardware, but
>> regardless, I'm not sure that the spec language makes sense even for a
>> tiled renderer.
>>
>> Consider what happens when you're doing an overlapping blit in a
>> maximally stupid way, by creating a pair of triangles for each pixel and
>> then shuffling those triangles randomly. Is your hardware really capable
>> of re-ordering all the generated fragments according to the prescribed
>> raster order, even though they all originate from different primitives?
>>
>> I suspect that a lot of hardware would even be challenged along the
>> diagonal of a single pair of triangles covering the blit rectangle.
>
> Yeah, the naming here is because I don't think the extension can be
> implemented on a non-tiled rasterizer. I'm relying on the tiles to work
> around the fact that pixels in a triangle will be executed in an
> arbitrary order, and then you stuff the results of all the fragments
> back in memory once all the prims have been rasterized..
This makes sense. A non-tiled renderer could still have different
rasterization orders, but it only really helps as long as you have only
a single pixel pipe. Our architecture has multiple pixel pipes assigned
to different tiles of the screen, and no way of enforcing an order
between them, so I don't think we have any hope of implementing this.
Luckily it's not important for us.
> As far as having the blit be composed of multiple primitives, you have
> to emit your primitives following the raster order, or you'd have writes
> before reads happening and you wouldn't get the rendering you want. (A
> tiled renderer might hide that from you anyway, though, as long as a
> flush didn't implicitly happen between your primitives).
Right, that makes sense.
> Note, I had been reading this case in the texture_barrier extension:
>
> - If a texel has been written, then in order to safely read the result
> a texel fetch must be in a subsequent Draw separated by the command
>
> void TextureBarrier(void);
>
> TextureBarrier() will guarantee that writes have completed and caches
> have been invalidated before subsequent Draws are executed."
>
> as being sort of separate from the rest of the list, an "in addition,
> ..." instead of "as long as you've done this, you're entirely safe".
The text above the list says that any of the items are sufficient for
correctness, but I think the language in this bullet itself is
misleading if you parse it in spec-lawyer mode together with your
addition. It should really be:
- If a texel has been written, then a texel fetch safely reads
the result if it occurs in a subsequent Draw separated by the
command
void TextureBarrier(void);
...
In other words, there may be *other* conditions that allow a texel fetch
to safely read the result of a previously written texel, and the
extension proposal here adds the first such other condition.
Perhaps you could add that edit to the extension language.
> Maybe my paragraph should be something like:
>
> - TILE_RASTER_ORDER_FIXED_MESA is enabled, and there is only a
> single write of each texel, and primitives are emitted in the
> order of TILE_RASTER_ORDER_INCREASING_X/Y_MESA (where those being
> disabled mean negative texel offsets), and reads are only
> performed from texels offset from the current fragment shader
> invocation in the direction specified by
> TILE_RASTER_ORDER_INCREASING_X/Y_MESA, e.g. using
> "texelFetch2D(sampler, ivec2(gl_FragCoord.xy + vec2(dx, dy)),
> 0);".
I agree that that's better.
Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
More information about the mesa-dev
mailing list