Mesa (master): 26 new commits

Mon Dec 12 08:09:46 UTC 2016

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ec0a0a60cc2773624f6c72b11c4d37519397a59d
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Wed Nov 30 12:26:49 2016 +0100

    radeonsi: shrink the GSVS ring to account for the reduced item sizes
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6fdef7d26569c1c8bfebcd5d16749ef094b01982
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Wed Nov 30 12:25:45 2016 +0100

    radeonsi: shrink each vertex stream to the actually required size
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=2f2e941e2d9d6155e0661f452343e7a80f2439c4
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 17:41:59 2016 +0100

    radeonsi: use a single descriptor for the GSVS ring
    
    We can hardcode all of the fields for swizzling in the geometry shader.
    
    The advantage is that we use fewer descriptor slots and we no longer have to
    update any of the (ring) descriptors when the geometry shader changes.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=18616e7551fcecb9445597d78446df6e1df98fbb
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Wed Nov 30 11:33:25 2016 +0100

    radeonsi: pack GS output components for each vertex stream contiguously
    
    Note that the memory layout of one vertex stream inside one "item" (= memory
    written by one GS wave) on the GSVS ring is:
    
      t0v0c0 ... t15v0c0 t0v1c0 ... t15v1c0 ... t0vLc0 ... t15vLc0
      t0v0c1 ... t15v0c1 t0v1c1 ... t15v1c1 ... t0vLc1 ... t15vLc1
                            ...
      t0v0cL ... t15v0cL t0v1cL ... t15v1cL ... t0vLcL ... t15vLcL
      t16v0c0 ... t31v0c0 t16v1c0 ... t31v1c0 ... t16vLc0 ... t31vLc0
      t16v0c1 ... t31v0c1 t16v1c1 ... t31v1c1 ... t16vLc1 ... t31vLc1
                            ...
      t16v0cL ... t31v0cL t16v1cL ... t31v1cL ... t16vLcL ... t31vLcL
    
                            ...
    
      t48v0c0 ... t63v0c0 t48v1c0 ... t63v1c0 ... t48vLc0 ... t63vLc0
      t48v0c1 ... t63v0c1 t48v1c1 ... t63v1c1 ... t48vLc1 ... t63vLc1
                            ...
      t48v0cL ... t63v0cL t48v1cL ... t63v1cL ... t48vLcL ... t63vLcL
    
    where tNN indicates the thread number, vNN the vertex number (in the order of
    EMIT_VERTEX), and cNN the output component (vL and cL are the last vertex and
    component, respectively).
    
    The vertex streams are laid out sequentially.
    
    The swizzling by 16 threads is hard-coded in the way the VGT generates the
    offset passed into the GS copy shader, and the jump every 16 threads is
    calculated from VGT_GSVS_RING_OFFSET_n and VGT_GSVS_RING_ITEMSIZE in a way
    that makes it difficult to deviate from this layout (at least that's what
    I've experimentally confirmed on VI after first trying to go the simpler
    route of just interleaving the vertex streams).
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>
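
The layout diagram in this commit message can be read as an indexing rule. The following is a minimal sketch of that rule for a single vertex stream within one item; the function name, its parameters, and the choice of "one component value" as the unit are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>

/* Illustrative helper, not taken from the driver: slot index of one output
 * value within a single vertex stream of one GSVS ring item, read directly
 * off the diagram in the commit message. The unit is "one component value";
 * the byte size of a slot is deliberately left out. */
static unsigned
gsvs_slot_index(unsigned thread, unsigned vertex, unsigned component,
                unsigned num_vertices, unsigned num_components)
{
   assert(thread < 64);
   assert(vertex < num_vertices);
   assert(component < num_components);

   const unsigned group = thread / 16;            /* which block of 16 threads */
   const unsigned lane = thread % 16;             /* position inside that block */
   const unsigned row = 16 * num_vertices;        /* one component row of a block */
   const unsigned block = row * num_components;   /* all component rows of a block */

   return group * block + component * row + vertex * 16 + lane;
}
```

With 4 vertices and 3 components, t0v1c0 lands at slot 16, t0v0c1 at slot 64, and t16v0c0 at slot 192, which matches the row order shown in the diagram.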

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=edf034ac142f2ae10befdf331b170373ff456495
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Fri Dec 2 21:20:49 2016 +0100

    radeonsi: do not write non-existent components through the GSVS ring
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=af976f12a56d11face02fe74ef0f112ec26d4c69
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 16:27:10 2016 +0100

    radeonsi: only write values belonging to the stream when emitting GS vertex
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=bdf1bf1cb5422a944205ea30b2eb203a73bdd736
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 16:25:21 2016 +0100

    radeonsi: generate an explicit switch instruction over vertex streams
    
    SimplifyCFG generates a switch instruction anyway when all four streams
    are present, but is simultaneously not smart enough to eliminate some
    redundant jumps that it generates.
    
    The generated assembly is still a bit silly, probably because the
    control flow annotation doesn't know how to handle a switch with uniform
    condition.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=bae929f96ee57ec55d15fae87bf80c45a8bd7e4d
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 16:03:56 2016 +0100

    radeonsi: fetch only outputs of current vertex stream from the GSVS ring
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=dfb69cac3354ecb338987b02f635de0bfdcf37d2
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 15:55:00 2016 +0100

    radeonsi: only export from GS copy shader for vertex stream 0
    
    When running the copy shader for vertex streams != 0, the SX does not need
    any data from us (there is no rasterization for the higher vertex streams,
    only streamout).
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=21f2bb22a3077dce5ce8e93a0bebc9a9b7fdb82d
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 15:53:19 2016 +0100

    radeonsi: do not export VS outputs from vertex streams != 0
    
    This affects GS copy shaders. When an output is meant for vertex
    stream != 0, then we don't have to make it available to the pixel
    shader.
    
    There is a minor inefficiency here because the GLSL varying packing pass
    does not group varyings of the same vertex stream together, but it
    shouldn't be important in practice.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=fc0e009aa7c4b865f4fbd3a46a1cd5259f121f0e
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 14:23:05 2016 +0100

    radeonsi: pull iteration over vertex streams into GS copy shader logic
    
    The iteration is not needed for normal vertex shaders.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=180ae18ec591822ff05b0c141b15c801dccd0c56
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 15:36:08 2016 +0100

    radeonsi: group streamout writes by vertex stream
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=d89592836ab90bda998f0739ab1a5c8dbee36cb7
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 15:09:14 2016 +0100

    radeonsi: load the streamout buf descriptors closer to their use
    
    LLVM can still decide to hoist the loads since they're marked invariant.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=564f17f0d7c9faf683148c8d178ceadc4fb9c1aa
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 13:23:21 2016 +0100

    radeonsi: extract writing of a single streamout output
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=b41dd002354af06208056ad435a4ab6f0052b4c2
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 12:59:48 2016 +0100

    radeonsi: separate the call to si_llvm_emit_streamout from exports
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=5ad6e56ca321669f124b10ad2889d5ff33bf0050
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 15:43:44 2016 +0100

    radeonsi: plumb the output vertex_stream through to si_shader_output_values
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=2985708fa07e54cbc854d3ef7a9b3227d0e6db5c
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 12:55:19 2016 +0100

    radeonsi: rename members of si_shader_output_values
    
    Be a bit more verbose and avoid confusion in future patches.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=88509518b01d7c1d7436a790bf9be5cf3c41a528
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Dec 6 21:03:03 2016 +0100

    radeonsi: fix an off-by-one error in the bounds check for max_vertices
    
    The spec actually says that calling EmitStreamVertex is undefined when
    you exceed max_vertices. But we do need to avoid trampling over memory
    outside the GSVS ring.
    
    Cc: mesa-stable at lists.freedesktop.org
    Reviewed-by: Edward O'Callaghan <funfunctor at folklore1984.net>
    Reviewed-by: Michel Dänzer <michel.daenzer at amd.com>
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>
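
As a rough illustration of the kind of bounds check this commit is about, the sketch below assumes that the ring holds exactly max_vertices vertices per thread, so that valid slots are 0 through max_vertices - 1. The function names are hypothetical, and the commit message does not say what the original comparison looked like.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names. Assumes room for max_vertices vertices per thread,
 * i.e. valid slots are 0 .. max_vertices - 1. */
static bool
can_emit_vertex(unsigned next_vertex, unsigned max_vertices)
{
   /* A "<=" comparison here would admit one slot too many. */
   return next_vertex < max_vertices;
}

/* Counts how many of `attempts` emit calls actually store a vertex. */
static unsigned
count_stored_vertices(unsigned attempts, unsigned max_vertices)
{
   unsigned stored = 0;

   for (unsigned i = 0; i < attempts; i++) {
      if (can_emit_vertex(stored, max_vertices))
         stored++;
   }

   /* The stored count must never exceed the assumed capacity. */
   assert(stored <= max_vertices);
   return stored;
}
```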

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7655bccce80c9690ecb850304d15238ef1e0d622
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 16:33:31 2016 +0100

    radeonsi: do not kill GS with memory writes
    
    Vertex emits beyond the specified maximum number of vertices are supposed to
    have no effect, which is why we used to always kill GS that reached the limit.
    
    However, if the GS also writes to memory (SSBO, atomics, shader images), then
    we must keep going and only skip the vertex emit itself.
    
    Cc: mesa-stable at lists.freedesktop.org
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>
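
The decision described in this commit message can be summarized as a small policy function. This is only a sketch of the stated behaviour; the enum, the function, and both parameters are hypothetical names, and the driver expresses this in generated shader code rather than in a C helper like this one.

```c
#include <stdbool.h>

/* Hypothetical names, used only to illustrate the described behaviour. */
enum gs_emit_action {
   GS_EMIT_STORE,   /* write the vertex to the ring */
   GS_EMIT_SKIP,    /* drop the vertex, keep the thread running */
   GS_EMIT_KILL     /* drop the vertex and terminate the thread */
};

/* can_emit:      the emit is still within the declared vertex limit
 * writes_memory: the GS writes to SSBOs, atomics, or shader images */
static enum gs_emit_action
gs_emit_policy(bool can_emit, bool writes_memory)
{
   if (can_emit)
      return GS_EMIT_STORE;

   /* Past the limit the emit has no effect, but later memory writes
    * must still happen, so such shaders cannot simply be killed. */
   return writes_memory ? GS_EMIT_SKIP : GS_EMIT_KILL;
}
```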

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7b5b3d63c5f33bbd49f4b11c282603baa9371c10
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Mon Nov 28 20:30:41 2016 +0100

    radeonsi: update all GSVS ring descriptors for new buffer allocations
    
    Fixes GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_geometry_instanced.
    
    Cc: mesa-stable at lists.freedesktop.org
    Reviewed-by: Edward O'Callaghan <funfunctor at folklore1984.net>
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=2eaacba7f28da49c2d248fa2df9feeca32f3480c
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 12:38:48 2016 +0100

    st/glsl_to_tgsi: plumb the GS output stream qualifier through to TGSI
    
    Allow drivers to emit GS outputs in a smarter way.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=cc34a6f0bd1e1c01ef8eb4f7be2bc8bde859ca1f
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Fri Dec 2 21:20:16 2016 +0100

    tgsi/scan: collect information about output usagemasks
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=cf8e9778fc784e7bd923b89351f0a551570cd8d2
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 14:22:20 2016 +0100

    tgsi/scan: collect information about output vertex streams
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=81d0dc5e551fdc7da4cef6be482f8d2ce78f6999
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Tue Nov 29 14:21:55 2016 +0100

    gallium: extract individual streamout output structure
    
    So that we can pass pointers to individual array entries around.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=04811354c87e40b0bd5e970fa413ea056ed94173
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Wed Dec 7 11:27:25 2016 +0100

    tgsi: add Stream{X,Y,Z,W} fields to tgsi_declaration_semantic
    
    This is for geometry shader outputs. Without it, drivers have no way of
    knowing which stream each output is intended for, and have to
    conservatively write all outputs to all streams.
    
    Separate stream numbers for each component are required due to output
    packing.
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>
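
To show why per-component stream numbers are useful once outputs are packed, here is a minimal sketch that derives, for one output, the mask of components belonging to a given stream. The struct is a hypothetical stand-in and not the actual TGSI token layout; the 2-bit field width assumes at most four vertex streams.

```c
#include <assert.h>

/* Hypothetical stand-in for a declaration that carries one stream number
 * per component; not the actual TGSI token layout. */
struct example_output_decl {
   unsigned stream_x : 2;
   unsigned stream_y : 2;
   unsigned stream_z : 2;
   unsigned stream_w : 2;
};

/* Returns a 4-bit mask (bit 0 = X ... bit 3 = W) of the components of
 * this output that belong to `stream`. */
static unsigned
components_for_stream(struct example_output_decl decl, unsigned stream)
{
   assert(stream < 4);

   const unsigned streams[4] = {
      decl.stream_x, decl.stream_y, decl.stream_z, decl.stream_w
   };
   unsigned mask = 0;

   for (unsigned c = 0; c < 4; c++) {
      if (streams[c] == stream)
         mask |= 1u << c;
   }
   return mask;
}
```

For a packed output whose X and Y components go to stream 0 and whose Z and W components go to stream 1, a driver could use such a mask to write only X and Y when emitting to stream 0, instead of conservatively writing all four components to every stream.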

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=173d80b40159669b303ea19e8b6abd24d7fce39b
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date:   Wed Nov 30 10:38:55 2016 +0100

    glsl: remember per-component vertex streams for packed varyings
    
    Reviewed-by: Marek Olšák <marek.olsak at amd.com>