Mesa (broadwell): 35 new commits

Kenneth Graunke kwg at kemper.freedesktop.org
Thu Feb 20 23:56:05 UTC 2014


URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=beb53a72f578e0bb99f684ee0c14d5bbdf565638
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Mon Jan 20 23:56:38 2014 -0800

    stash - vp fixes?
    
    doesn't seem to actually fix anything.
    
    oh, this might only be relevant when I turn on the viewport extents test

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e57d936bf4464035c8f980cdfd6dc08f673595bf
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Jan 28 22:02:56 2014 -0800

    Also emit VF_INSTANCING in the no-elements case.
    
    I can't imagine why this would matter since there's no actual data being
    pulled, but...for safety?
    
    Not observed to help anything

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=02977fabb1dfc39e80d9838b9afd57932a9c9778
Author: Eric Anholt <eric at anholt.net>
Date:   Thu Feb 13 09:30:41 2014 -0800

    i965: Drop the blitter-based glBlitFramebuffer() path.
    
    Now that meta can do single-copy blits even from renderbuffers and depth,
    we don't need this any more.

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=956d97bdd3533c11e602dfa530fa2406ee58bca0
Author: Eric Anholt <eric at anholt.net>
Date:   Thu Feb 13 09:31:38 2014 -0800

    i965: Drop blorp for glBlitFramebuffer() except for the stencil case.
    
    I still need to do stencil for meta's blit path.

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=22332a89c94a4d4fa9afced0104fc59a7bfe4088
Author: Eric Anholt <eric at anholt.net>
Date:   Mon Feb 10 23:44:54 2014 -0800

    meta: Add support for integer blits.

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e05b1339d46dd9f310085e39baa5ce7a88ff7385
Author: Eric Anholt <eric at anholt.net>
Date:   Mon Feb 10 15:24:07 2014 -0800

    meta: Add support for doing MSAA to MSAA blits.

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e9a5853971d94703f93dd5464e77f6430da60627
Author: Eric Anholt <eric at anholt.net>
Date:   Mon Feb 10 16:23:50 2014 -0800

    meta: Save and restore a bunch of MSAA state.
    
    We're disabling GL_MULTISAMPLE, so we didn't need to worry about a lot of
    that state.  But to do MSAA to MSAA blits, we need to start handling more
    state.
    
    v2: Fix pasteo caught by Kenneth.

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7d6909bceec9fc5618de6a665eceb1dcdc649da2
Author: Eric Anholt <eric at anholt.net>
Date:   Mon Feb 10 11:20:11 2014 -0800

    meta: Try to do blending of sRGB values in linear colorspace.
    
    Blending of values would occur when doing GL_LINEAR filtering with
    scaling, and in an upcoming commit when doing MSAA resolves.
    
    !UPSTREAM: [citation needed]
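
As a rough illustration of why the colorspace matters here, the sketch below averages two sRGB-encoded values in linear space using the standard sRGB transfer functions. It is not the meta shader code from this commit; the function names `srgb_to_linear`, `linear_to_srgb`, and `blend_srgb` are hypothetical and chosen for this example only.

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded channel in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(l):
    """Encode one linear-light channel in [0, 1] back to sRGB."""
    if l <= 0.0031308:
        return l * 12.92
    return 1.055 * (l ** (1.0 / 2.4)) - 0.055


def blend_srgb(a, b, t=0.5):
    """Interpolate two sRGB-encoded values, doing the math in linear space.

    This mirrors what a GL_LINEAR filter or an MSAA resolve would have to do
    to blend correctly: decode, mix, then re-encode.
    """
    mixed = (1.0 - t) * srgb_to_linear(a) + t * srgb_to_linear(b)
    return linear_to_srgb(mixed)
```

Averaging black and white this way gives a result noticeably brighter than the 0.5 that a naive average of the encoded values would produce, which is the visible difference the commit is concerned with.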

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6e22d1dacfca83775ae5413df31612f94b2a4238
Author: Eric Anholt <eric at anholt.net>
Date:   Fri Feb 7 14:00:31 2014 -0800

    meta: Add support for doing multisample resolves.

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=926deed02bcd4e75dc79e20665670a5c08c97f4a
Author: Eric Anholt <eric at anholt.net>
Date:   Tue Feb 18 15:14:30 2014 -0800

    i965: Fix miptree matching for multisampled, non-interleaved miptrees.
    
    We haven't been executing this code before the meta-blit case, because
    we've been flagging the miptree as validated at texstorage time, and never
    having to revalidate.

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=f104100094a319fcd16cb5d30d7591c37a69e1ed
Author: Eric Anholt <eric at anholt.net>
Date:   Wed Feb 5 10:54:51 2014 -0800

    fix stretch blit

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=1336ccb7dd63b64ef5d8bb0a7f57d6291b0f3a97
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Mon Dec 30 22:07:20 2013 -0800

    i965: Enable Broadwell support.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Eric Anholt <eric at anholt.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=808952a09543b60e59c5ad9238d8403fa9f1f15b
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Jan 29 13:45:27 2014 -0800

    i965/fs: Implement FS_OPCODE_[UN]PACK_HALF_2x16_SPLIT[_XY] opcodes.
    
    I'd neglected to port these to Broadwell.  Most of this code is copy
    and pasted from Gen7, but instead of using F32TO16/F16TO32, we just
    use MOV with HF register types.
    
    Fixes fs-packHalf2x16 and fs-unpackHalf2x16 tests (both the ARB
    extension and ES 3.0 variants).
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Eric Anholt <eric at anholt.net>
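
For readers unfamiliar with the packHalf2x16/unpackHalf2x16 built-ins that these opcodes implement, here is a minimal CPU-side sketch of their bit layout (first component in the low 16 bits). It uses Python's `struct` half-float format rather than anything from the driver, and the function names are hypothetical.

```python
import struct


def pack_half_2x16(x, y):
    """Convert two floats to IEEE half precision and pack them into 32 bits.

    The first component lands in the 16 least significant bits, the second
    in the 16 most significant bits.
    """
    lo, = struct.unpack('<H', struct.pack('<e', x))
    hi, = struct.unpack('<H', struct.pack('<e', y))
    return (hi << 16) | lo


def unpack_half_2x16(u):
    """Split a 32-bit value into two half floats and widen them to float."""
    x, = struct.unpack('<e', struct.pack('<H', u & 0xFFFF))
    y, = struct.unpack('<e', struct.pack('<H', (u >> 16) & 0xFFFF))
    return x, y
```

Note that `struct.pack('<e', ...)` raises `OverflowError` for values outside the half-float range, so this sketch only covers representable inputs.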

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=850e372fc7ff3377d7ffdf825d5ebcdd72beee1b
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Jan 29 14:16:27 2014 -0800

    i965: Drop bogus F32TO16/F16TO32 instructions on Broadwell - use MOV.
    
    Broadwell removed the F32TO16 and F16TO32 instructions.  However, it has
    native support for HF values, so the conversions are just MOVs.
    
    Fixes vs-packHalf2x16 and vs-unpackHalf2x16 tests (both the ARB
    extension and ES 3.0 variants).
    
    v2: Emulate F32TO16's align16 zeroing bug, since Chad's front end code
        relies on it happening.  We can probably refactor this code to be
        better later.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Eric Anholt <eric at anholt.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=3663bbe773187dee341556ef29e58b1143ef2f5c
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Feb 19 17:20:11 2014 -0800

    i965: Create a hardware context before initializing state module.
    
    brw_init_state() calls brw_upload_initial_gpu_state().  If hardware
    contexts are enabled (brw->hw_ctx != NULL), this will upload some
    initial invariant state for the GPU.  Without hardware contexts, we
    rely on this state being uploaded via atoms that subscribe to the
    BRW_NEW_CONTEXT bit.
    
    Commit 46d3c2bf4ddd227193b98861f1e632498fe547d8 accidentally moved
    the call to brw_init_state() before creating a hardware context.
    This meant brw_upload_initial_gpu_state would always early return.
    However, on Gen6+ we had stopped uploading the initial GPU state via
    state atoms, so it was never uploaded at all.
    
    Fixes a regression since 46d3c2bf4ddd227193b98861f1e632498fe547d8.
    
    Cc: "10.0 10.1" <mesa-stable at lists.freedesktop.org>
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
    Reviewed-by: Eric Anholt <eric at anholt.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e3823147a5f5e9c6234d8e89a55b79e8e9eb164c
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Mon Jan 27 15:49:56 2014 -0800

    i965/fs: Implement scratch read/write support for Broadwell.
    
    To make sure that both the Gen4 and Gen7 style messages work, I
    initially disabled the SHADER_OPCODE_GEN7_SCRATCH_READ optimization,
    ran Piglit, re-enabled it, and ran Piglit again.  Both worked fine.
    
    Fixes 40 Piglit tests (most of the varying-packing category).
    
    v2: Move num_regs assertion from gen8_fs_generator to
        gen8_set_dp_scratch_message() (suggested by Eric).
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Eric Anholt <eric at anholt.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=29a69744034c37ebe1ba088fbc8bbd39b7a17875
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Mon Jan 27 15:44:18 2014 -0800

    i965: Add Gen8 assembly support for DP Scratch messages.
    
    The new accessors will make it easy to do Gen7-style scratch messages.
    
    v2: Move num_regs assertion from gen8_fs_generator into
        gen8_set_dp_scratch_message() (suggested by Eric).
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Eric Anholt <eric at anholt.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a5e54c91a3b73551609efea1f6f31eaae26281ea
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Feb 4 22:18:03 2014 -0800

    i965: Store absolute thread count in max_wm_threads on Broadwell.
    
    In the past, 3DSTATE_PS took an absolute number of threads.  Conversely,
    on Broadwell you always program 64, and it implicitly scales based on
    the GT-level with no special programming.  So, I stored 64 in
    brw_device_info::max_wm_threads.
    
    However, I didn't realize that we also use max_wm_threads to compute the
    size of the scratch space buffer.  In that case, we really need the
    absolute number of threads.
    
    This patch hardcodes 3DSTATE_PS to use the value it expects, and changes
    max_wm_threads back to a (completely fake) absolute thread count (once
    again copied from Haswell).
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Eric Anholt <eric at anholt.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=dca84b4b5b23b68b3ea9da53d1775fa22cd1aff4
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Thu Jan 30 15:30:19 2014 -0800

    i965: Use MOV, not OR for setting URB write channel enables on Gen8+.
    
    On Broadwell, g0.5 contains the "Scratch Space Pointer"; using OR
    puts some bits of that into "ignored" sections of our message header.
    
    While this doesn't hurt, it's also not terribly /useful/.  Using MOV
    is sufficient to set the only interesting bits in this part of the
    message header.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Eric Anholt <eric at anholt.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e643c7d036d322c2898c9e65e466d75d0c708dc2
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Sun Jan 26 00:20:21 2014 -0800

    i965: Implement a CS stall workaround on Broadwell.
    
    According to the latest documentation, any PIPE_CONTROL with the
    "Command Streamer Stall" bit set must also have another bit set,
    with five different options:
    
       - Render Target Cache Flush
       - Depth Cache Flush
       - Stall at Pixel Scoreboard
       - Post-Sync Operation
       - Depth Stall
    
    I chose "Stall at Pixel Scoreboard" since we've used it effectively
    in the past, but the choice is fairly arbitrary.
    
    Implementing this in the PIPE_CONTROL emit helpers ensures that the
    workaround will always take effect when it ought to.
    
    Apparently, this workaround may be necessary on older hardware as well;
    for now I've only added it to Broadwell as it's absolutely necessary
    there.  Subsequent patches could add it to older platforms, provided
    someone tests it there.
    
    v2: Only flag "Stall at Pixel Scoreboard" when none of the other bits
        are set (suggested by Ian Romanick).
    
    v3: Prefix the function with "gen8" (requested by Eric).
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Ian Romanick <ian.d.romanick at intel.com> (v2)
    Reviewed-by: Eric Anholt <eric at anholt.net>
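
The v2 rule described above (only add "Stall at Pixel Scoreboard" when a CS stall is requested and none of the five companion bits is already set) can be sketched as follows. This is an illustration of the logic only: the flag names and bit positions are hypothetical, not the driver's real PIPE_CONTROL definitions, and the post-sync operation is modelled as a single flag for simplicity.

```python
# Hypothetical bit assignments, for illustration only. The real
# PIPE_CONTROL flag values are defined in the driver headers.
RENDER_TARGET_FLUSH = 1 << 0
DEPTH_CACHE_FLUSH = 1 << 1
STALL_AT_SCOREBOARD = 1 << 2
POST_SYNC_OP = 1 << 3
DEPTH_STALL = 1 << 4
CS_STALL = 1 << 5

# Any one of these satisfies the "must also have another bit set" rule.
COMPANION_BITS = (RENDER_TARGET_FLUSH | DEPTH_CACHE_FLUSH |
                  STALL_AT_SCOREBOARD | POST_SYNC_OP | DEPTH_STALL)


def add_cs_stall_workaround(flags):
    """Return flags with the workaround applied.

    If a CS stall is requested without any companion bit, add
    "Stall at Pixel Scoreboard". Otherwise return flags unchanged.
    """
    if (flags & CS_STALL) and not (flags & COMPANION_BITS):
        flags |= STALL_AT_SCOREBOARD
    return flags
```

Applying the check inside a single helper like this is what lets every PIPE_CONTROL emission pick up the workaround without each caller having to remember it.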

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=741782b5948bb3d01d699f062a37513c2e73b076
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Sat Jan 25 13:02:08 2014 -0800

    i965: support instanced GS on gen7
    
    v3:
     * Properly prevent dual object mode execution when
       the invocation count > 1
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=008338bc4e2d9cc5931b9968d019619c09392389
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Sat Jan 25 12:55:24 2014 -0800

    i965: support gl_InvocationID for gen7
    
    v2:
     * Make gl_InvocationID a system value
    
    v3:
     * Properly shift from R0.1 into DST.4 by adding
       GS_OPCODE_GET_INSTANCE_ID
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Acked-by: Paul Berry <stereotype441 at gmail.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=d09901993539385c015c6389310c186cba9bb263
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Sat Jan 25 12:43:26 2014 -0800

    glsl: add gl_InvocationID variable for ARB_gpu_shader5
    
    v2:
     * Make gl_InvocationID a system value
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=22388e2208a9a321240ec505f513fa5de5af8946
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Sat Jan 25 12:37:40 2014 -0800

    main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support
    
    v3:
     * Add check for ARB_gpu_shader5
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=86d6b5546b75ac7d5eedc26c14f579a4bfb40406
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Sat Jan 25 12:34:24 2014 -0800

    mesa: initialize gl_geometry_program Invocations field
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=313402048fdad05d3401340129b9e412878d8957
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Sat Jan 25 02:17:21 2014 -0800

    glsl/linker: produce gl_shader_program Geom.Invocations
    
    Grab the parsed invocation count, check for consistency
    during linking, and finally save the result in
    gl_shader_program Geom.Invocations.
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=02dc74fbd72d82a21506a5984a92e5db08fcfc5c
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Sun Feb 2 17:55:36 2014 -0800

    glsl: parse invocations layout qualifier for ARB_gpu_shader5
    
    _mesa_glsl_parse_state in_qualifier->invocations will store the
    invocations count.
    
    v3:
     * Use in_qualifier to allow the primitive to be specified
       separately from the invocations count (merge_qualifiers)
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=738c9c3c543b985b025a4a60fcc9c2e212e2d821
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Tue Feb 4 11:32:56 2014 -0800

    glsl: Generate error for invalid input layout declarations
    
    Fixes various piglit tests:
    spec/glsl-1.50/compiler/incorrect-in-layout-qualifier-*.geom
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=0c558f9ee6cfc412037dc56ad4c3686e0f116852
Author: Jordan Justen <jordan.l.justen at intel.com>
Date:   Sun Feb 2 17:49:15 2014 -0800

    glsl: convert GS input primitive to use ast_type_qualifier
    
    We introduce a new merge_in_qualifier ast_type_qualifier
    which allows specialized handling of merging input layout
    qualifiers.
    
    By merging layout qualifiers into state->in_qualifier, we
    allow multiple input qualifiers. For example, the primitive
    type can be specified separately from the
    invocations count (ARB_gpu_shader5).
    
    state->gs_input_prim_type is moved into state->in_qualifier->prim_type
    
    state->gs_input_prim_type_specified is still processed separately
    so we can determine when the input primitive is specified. This
    is important since certain scenarios are not supported until after
    the primitive type has been specified in the shader code.
    
    v4:
     * Merge with compute shader input layout qualifiers
    
    Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
    Reviewed-by: Anuj Phogat <anuj.phogat at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=5bc0b2f4321fe623e37535aa1ff1848aa5a2dec1
Author: Eric Anholt <eric at anholt.net>
Date:   Thu Feb 20 09:51:23 2014 -0800

    i965: Fix extra return value after winsys rb update refactor.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75172
    Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=9245206cbfaaa4e18c1f3715eebb5f281070d772
Author: Eric Anholt <eric at anholt.net>
Date:   Fri Feb 14 16:06:31 2014 -0800

    i965/vs: Use samplers for UBOs in the VS like we do for non-UBO pulls.
    
    Improves performance of a dolphin emulator trace I had lying around by
    3.60131% +/- 0.995887% (n=128).
    
    Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=9e3cab8881626edd72d222f35c5d2a5fd9661bce
Author: Eric Anholt <eric at anholt.net>
Date:   Fri Feb 14 15:29:01 2014 -0800

    i965/fs: Add an optimization pass to remove redundant flags movs.
    
    We generate steaming piles of these for the centroid workaround, and this
    quickly cleans them up.
    
    total instructions in shared programs: 1591228 -> 1590047 (-0.07%)
    instructions in affected programs:     26111 -> 24930 (-4.52%)
    GAINED:                                0
    LOST:                                  0
    
    (Improved apps are l4d2, csgo, and dolphin)
    
    Reviewed-by: Matt Turner <mattst88 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=b2b2a2c06c20f3ca592af6e96222deab67ea239c
Author: Roland Scheidegger <sroland at vmware.com>
Date:   Thu Feb 20 03:09:17 2014 +0100

    gallivm: add smallfloat to float conversion not relying on cpu denorm handling
    
    The previous code relied on cpu denorm support for converting small float
    formats (such as r11g11b10_float and r16_float) to floats, otherwise denorms
    are flushed to zero. We worked around that in llvmpipe blend code by
    reenabling denorms, but this did nothing for texture sampling. Now it would
    be possible to reenable it there too but I'm not really a fan of messing
    with fpu flags (and it seems we can't actually do it reliably with llvm in
    any case looking at some bug reports). (Not to mention if you actually have
    a lot of denorms in there, you can expect some order-of-magnitude slowdown
    with x86 cpus.)
    So instead use code which adjusts exponents etc. directly hence not relying
    on cpu denorm support for the rescaling mul.
    (We still need the fpu flag handling as we can't do float-to-smallfloat
    without using cpu denorms at least for now - I actually wanted to keep
    both the old and new code and using one or the other depending on from where
    it's called but that didn't work out as the parameter would have to be passed
    through more layers than I'd like.)
    
    Reviewed-by: Zack Rusin <zackr at vmware.com>
    Reviewed-by: Si Chen <sichen at vmware.com>
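
To make the idea of "adjusting exponents directly" concrete, the sketch below widens a 16-bit half float to a 32-bit float using only integer operations, so denormal inputs never pass through a floating-point multiply. It illustrates the general technique for r16_float-style values only; it is not the gallivm code from this commit, and `half_bits_to_float` is a hypothetical name.

```python
import struct


def half_bits_to_float(h):
    """Widen a 16-bit half-float bit pattern to a Python float.

    All work is done on integer fields (sign, exponent, mantissa), so the
    result does not depend on the CPU's denormal handling.
    """
    sign = (h >> 15) & 0x1
    exp = (h >> 10) & 0x1F
    mant = h & 0x3FF

    if exp == 0:
        if mant == 0:
            # Signed zero.
            bits = sign << 31
        else:
            # Half denormal: shift the mantissa up until the implicit
            # leading one appears, lowering the exponent to match.
            shift = 0
            while (mant & 0x400) == 0:
                mant <<= 1
                shift += 1
            mant &= 0x3FF
            exp32 = 127 - 15 + 1 - shift
            bits = (sign << 31) | (exp32 << 23) | (mant << 13)
    elif exp == 0x1F:
        # Infinity or NaN: maximum exponent, mantissa carried over.
        bits = (sign << 31) | (0xFF << 23) | (mant << 13)
    else:
        # Normal number: rebias the exponent from 15 to 127.
        bits = (sign << 31) | ((exp - 15 + 127) << 23) | (mant << 13)

    return struct.unpack('<f', struct.pack('<I', bits))[0]
```

Every half denormal is representable as a normal 32-bit float, which is why the renormalizing loop is sufficient and no denormal output is ever produced.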

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=0206f0b3d4923411036711d9e7b31e33cd793a4e
Author: Leo Liu <leo.liu at amd.com>
Date:   Wed Feb 19 12:17:51 2014 -0500

    st/omx/enc: add multi scaling buffers for performance improvement
    
    Signed-off-by: Leo Liu <leo.liu at amd.com>
    Reviewed-by: Christian König <christian.koenig at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=754fa3a0d22596eab4991b7b5dd7cf383bd2f138
Author: Christian König <christian.koenig at amd.com>
Date:   Wed Feb 19 18:49:17 2014 +0100

    st/omx/dec/h264: fix prevFrameNumOffset handling
    
    Signed-off-by: Christian König <christian.koenig at amd.com>



