[Mesa-dev] [PATCH 5/6] mesa: Implement a new GL_MESA_tile_raster_order extension.

Nicolai Hähnle nhaehnle at gmail.com
Sat Jul 29 07:42:54 UTC 2017


On 28.07.2017 18:26, Eric Anholt wrote:
> Nicolai Hähnle <nhaehnle at gmail.com> writes:
> 
>> On 28.07.2017 04:30, Eric Anholt wrote:
>>> The intent is to use this extension on vc4 to allow X11 to do overlapping
>>> CopyArea() within a pixmap without first blitting the pixmap to a
>>> temporary.  With associated glamor patches, improves x11perf
>>> -copywinwin100 performance on a Raspberry Pi 3 from ~4700/sec to
>>> ~5130/sec, and is an even larger boost to uncomposited window movement
>>> performance (most copywinwin100 copies don't overlap).
>>> ---
>>>    docs/specs/MESA_tile_raster_order.spec        | 101 ++++++++++++++++++++++++++
>>>    docs/specs/enums.txt                          |   5 ++
>>>    include/GL/glext.h                            |   6 ++
>>>    src/mapi/glapi/gen/MESA_tile_raster_order.xml |  13 ++++
>>>    src/mapi/glapi/gen/Makefile.am                |   1 +
>>>    src/mesa/main/context.c                       |   2 +
>>>    src/mesa/main/enable.c                        |  27 +++++++
>>>    src/mesa/main/extensions_table.h              |   1 +
>>>    src/mesa/main/mtypes.h                        |  13 ++++
>>>    src/mesa/state_tracker/st_atom_rasterizer.c   |   5 ++
>>>    src/mesa/state_tracker/st_context.c           |   1 +
>>>    src/mesa/state_tracker/st_extensions.c        |   1 +
>>>    12 files changed, 176 insertions(+)
>>>    create mode 100644 docs/specs/MESA_tile_raster_order.spec
>>>    create mode 100644 src/mapi/glapi/gen/MESA_tile_raster_order.xml
>>>
>>> diff --git a/docs/specs/MESA_tile_raster_order.spec b/docs/specs/MESA_tile_raster_order.spec
>>> new file mode 100644
>>> index 000000000000..b6362fc7e5a6
>>> --- /dev/null
>>> +++ b/docs/specs/MESA_tile_raster_order.spec
>>> @@ -0,0 +1,101 @@
>>> +Name
>>> +
>>> +    MESA_tile_raster_order
>>> +
>>> +Name Strings
>>> +
>>> +    GL_MESA_tile_raster_order
>>> +
>>> +Contact
>>> +
>>> +    Eric Anholt, Broadcom (eric at anholt.net)
>>> +
>>> +Status
>>> +
>>> +    Proposal
>>> +
>>> +Version
>>> +
>>> +    Last modified date: 26 July 2017
>>> +
>>> +Number
>>> +
>>> +    TBD
>>> +
>>> +Dependencies
>>> +
>>> +    GL_ARB_texture_barrier is required
>>> +
>>> +Overview
>>> +
>>> +    This extension extends the sampling-from-the-framebuffer behavior provided
>>> +    by GL_ARB_texture_barrier to allow setting the rasterization order of the
>>> +    scene, so that overlapping blits can be implemented.  This can be used for
>>> +    scrolling or window movement within in 2D scenes, without first copying to
>>> +    a temporary.
>>> +
>>> +IP Status
>>> +
>>> +    None
>>> +
>>> +Issues
>>> +
>>> +    1.  Should this extension also affect BlitFramebuffer?
>>> +
>>> +        NOT RESOLVED: BlitFramebuffer could use the same underlying
>>> +        functionality to provide defined results for 1:1 overlapping blits,
>>> +        but one could use the coordinates being copied to just produce the
>>> +        right result automatically, rather than requiring the state flags to
>>> +        be adjusted.
>>> +
>>> +New Procedures and Functions
>>> +
>>> +    None
>>> +
>>> +New Tokens
>>> +
>>> +    None
>>> +
>>> +Additions to Chapter 9 of the OpenGL 4.4 Specification (Per-Fragment
>>> +Operations and the Frame Buffer)
>>> +
>>> +    Modify Section 9.3.1 Rendering Feedback Loops, p. 289
>>> +
>>> +    Add the following exception to the list after "There is only a single read
>>> +    and write...":
>>> +
>>> +    - TILE_RASTER_ORDER_FIXED_MESA is enabled, and reads are only performed
>>> +      from texels offset from the current fragment shader invocation in the
>>> +      direction specified by TILE_RASTER_ORDER_INCREASING_X_MESA and
>>> +      TILE_RASTER_ORDER_INCREASING_Y_MESA (where those being disabled mean
>>> +      negative texel offsets), e.g. using "texelFetch2D(sampler,
>>> +      ivec2(gl_FragCoord.xy + vec2(dx, dy)), 0);".
>>
>> I don't think we can implement any of this on our hardware, but
>> regardless, I'm not sure that the spec language makes sense even for a
>> tiled renderer.
>>
>> Consider what happens when you're doing an overlapping blit in a
>> maximally stupid way, by creating a pair of triangles for each pixel and
>> then shuffling those triangles randomly. Is your hardware really capable
>> of re-ordering all the generated fragments according to the prescribed
>> raster order, even though they all originate from different primitives?
>>
>> I suspect that a lot of hardware would even be challenged along the
>> diagonal of a single pair of triangles covering the blit rectangle.
> 
> Yeah, the naming here is because I don't think the extension can be
> implemented on a non-tiled rasterizer.  I'm relying on the tiles to work
> around the fact that pixels in a triangle will be executed in an
> arbitrary order, and then you stuff the results of all the fragments
> back in memory once all the prims have been rasterized..

This makes sense. A non-tiled renderer could still have different 
rasterization orders, but it only really helps as long as you have only 
a single pixel pipe. Our architecture has multiple pixel pipes assigned 
to different tiles of the screen, and no way of enforcing an order 
between them, so I don't think we have any hope of implementing this. 
Luckily it's not important for us.


> As far as having the blit be composed of multiple primitives, you have
> to emit your primitives following the raster order, or you'd have writes
> before reads happening and you wouldn't get the rendering you want.  (A
> tiled renderer might hide that from you anyway, though, as long as a
> flush didn't implicitly happen between your primitives).

Right, that makes sense.


> Note, I had been reading this case in the texture_barrier extension:
> 
>      - If a texel has been written, then in order to safely read the result
>        a texel fetch must be in a subsequent Draw separated by the command
> 
>          void TextureBarrier(void);
> 
>        TextureBarrier() will guarantee that writes have completed and caches
>        have been invalidated before subsequent Draws are executed."
> 
> as being sort of separate from the rest of the list, an "in addition,
> ..." instead of "as long as you've done this, you're entirely safe".

The text above the list says that any of the items are sufficient for 
correctness, but I think the language in this bullet itself is 
misleading if you parse it in spec-lawyer mode together with your 
addition. It should really be:

     - If a texel has been written, then a texel fetch safely reads
       the result if it occurs in a subsequent Draw separated by the
       command

         void TextureBarrier(void);

       ...

In other words, there may be *other* conditions that allow a texel fetch 
to safely read the result of a previously written texel, and the 
extension proposal here adds the first such other condition.

Perhaps you could add that edit to the extension language.


> Maybe my paragraph should be something like:
> 
>      - TILE_RASTER_ORDER_FIXED_MESA is enabled, and there is only a
>        single write of each texel, and primitives are emitted in the
>        order of TILE_RASTER_ORDER_INCREASING_X/Y_MESA (where those being
>        disabled mean negative texel offsets), and reads are only
>        performed from texels offset from the current fragment shader
>        invocation in the direction specified by
>        TILE_RASTER_ORDER_INCREASING_X/Y_MESA, e.g. using
>        "texelFetch2D(sampler, ivec2(gl_FragCoord.xy + vec2(dx, dy)),
>        0);".

I agree that that's better.

Cheers,
Nicolai
-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


More information about the mesa-dev mailing list