[Mesa-dev] [PATCH 8/8] i965: Enable EXT_shader_samples_identical
Jason Ekstrand
jason at jlekstrand.net
Thu Nov 19 13:50:54 PST 2015
On Wed, Nov 18, 2015 at 9:29 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>
> On Nov 18, 2015 5:02 PM, "Jason Ekstrand" <jason at jlekstrand.net> wrote:
>>
>> On Wed, Nov 18, 2015 at 4:06 PM, Kenneth Graunke <kenneth at whitecape.org>
>> wrote:
>> > On Wednesday, November 18, 2015 03:46:54 PM Ian Romanick wrote:
>> >> From: Ian Romanick <ian.d.romanick at intel.com>
>> >>
>> >> Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
>> >> ---
>> >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 1 +
>> >> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 16 ++++++++++++++++
>> >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 1 +
>> >> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++++++++++
>> >> src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
>> >> 5 files changed, 30 insertions(+)
>> >>
>> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> >> index 1f71f66..4af1234 100644
>> >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> >> @@ -2550,6 +2550,7 @@ fs_visitor::nir_emit_texture(const fs_builder
>> >> &bld, nir_tex_instr *instr)
>> >> switch (instr->op) {
>> >> case nir_texop_txf:
>> >> case nir_texop_txf_ms:
>> >> + case nir_texop_samples_identical:
>> >> coordinate = retype(src, BRW_REGISTER_TYPE_D);
>> >> break;
>> >> default:
>> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> >> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> >> index a7bd9ce..6688f6a 100644
>> >> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> >> @@ -259,6 +259,22 @@ fs_visitor::emit_texture(ir_texture_opcode op,
>> >> lod = fs_reg(0u);
>> >> }
>> >>
>> >> + if (op == ir_samples_identical) {
>> >> + fs_reg dst = vgrf(glsl_type::get_instance(dest_type->base_type,
>> >> 1, 1));
>> >> +
>> >> + if (mcs.file == BRW_IMMEDIATE_VALUE) {
>> >> + fs_reg tmp = vgrf(glsl_type::uint_type);
>> >> +
>> >> + bld.MOV(tmp, mcs);
>> >> + bld.CMP(dst, tmp, src_reg(0u), BRW_CONDITIONAL_EQ);
>> >
>> > Seems a little strange to emit assembly to do the comparison when
>> > you've already determined that the value is a compile time constant.
>> >
>> > Why not just:
>> >
>> > bld.MOV(dst, fs_reg(mcs.ud == 0u ? ~0u : 0u));
>>
>> Actually, getting an immediate here means we don't have an MCS and we
>> have no idea if the samples are identical, so we should return false
>> always.
>>
>> >> + } else {
>> >> + bld.CMP(dst, mcs, src_reg(0u), BRW_CONDITIONAL_EQ);
>>
>> We should also consider handling the clear color case. In this case,
>> we'll get 0xff for 2x and 0xffffffff for 4x or 8x. Do we know the
>> number of samples in the shader? We should be able to get that from
>> the sampler or something but then we would have to pass that through
>> the key and that would get gross.
>
> First off, I realized that the numbers I have there are wrong. It's 0xff for
> 2x and 4x and 0xffffffff for 8x. However, I also just realized that the 8
> 8-bit values you get for 8x MSAA range from 0 to 7 but take up 4 bits each.
> This means that no valid 8x MSAA MCS value can have 0xff as its bottom 8
> bits unless it's the clear color. This means that a simple AND with 0xff
> will get us a clear color check on all but 16x. Unfortunately, 16x has a
> 64-bit MCS value and, unless the hardware provides us with some extra
> guarantees, 0xff would be valid in the bottom 8 bits.
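As a sanity check on the 8x claim above, here is a minimal sketch, assuming
the layout described in this mail (4-bit fields, each holding a value from 0
to 7). The helper names are hypothetical and not part of the patch; the code
only enumerates the two lowest fields to confirm that their combined byte can
never be 0xff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the proposed clear-color test on the bottom 8 bits.
static bool low_byte_is_all_ones(uint32_t mcs)
{
   return (mcs & 0xffu) == 0xffu;
}

// Assumption taken from the mail: each sample gets a 4-bit field holding a
// value in 0..7. Only the two lowest fields reach the bottom byte, so try
// every combination of those two and report whether any of them would be
// mistaken for the clear color.
static bool valid_8x_value_can_alias_clear()
{
   for (uint32_t s0 = 0; s0 <= 7; s0++) {
      for (uint32_t s1 = 0; s1 <= 7; s1++) {
         if (low_byte_is_all_ones(s0 | (s1 << 4)))
            return true;
      }
   }
   return false;
}
```

Under that assumption the largest low byte a valid value can produce is 0x77,
so the 0xff test would only fire on the clear color. It says nothing about
16x, where the fields may legitimately hold 0xf.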
>
> Going off into the world of speculation just a bit, can we make some
> assumptions about the hardware? Suppose for a moment that they used a fairly
> greedy algorithm for determining which plane to store a value in:
>
> 1) If all samples are affected, store in slice zero
> 2) If not, store in the first available empty or completely overwritten
> slice.
>
> Such an algorithm would make sense and have the nice property of tending to
> pack the samples in the earlier slices, thus decreasing the possibility of
> ever touching slice 15. This is good for cache locality. It also has
> another property that would be very useful for us, namely that it only
> touches slice 15 if all 16 samples have different colors. In particular, it
> would mean that you can never have two samples that both lie in slice 15
> and, more specifically, 0xff would also be invalid for 16x.
>
> Unfortunately, that's entirely based on my speculation as to how the
> hardware works. Given that we don't actually know, it's not documented, and
> that we're not liable to ever find anyone willing to give us those kinds of
> details, we're not likely to find out without a very clever experiment.
So, Chad has convinced me that my speculation is quite possibly bogus
and *very* hard to actually test, so just checking the bottom 8 bits
probably won't work.
Something else that came out of that conversation is that, for 2x
MSAA, we may get bogus data in all but the bottom 4 bits. In other
words, just blindly checking for zero is probably a bad idea. It'll
work because the extension spec lets us return false negatives, but it
isn't a good idea in general. If we really want the implementation to
be solid, we need to mask off all but the bottom n * log2(n) bits
where n = number of samples.
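
To make the masking concrete, here is a rough sketch of it on the CPU side,
not the actual i965 code. It takes the n * log2(n) formula above literally;
the helper names are made up, the MCS is modelled as a plain 64-bit integer,
and the real field widths on hardware may differ from what the formula gives.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of low MCS bits assumed to carry valid data,
// n * log2(n) for n samples (n is expected to be a power of two).
static unsigned mcs_valid_bits(unsigned num_samples)
{
   unsigned log2n = 0;
   while ((1u << log2n) < num_samples)
      log2n++;
   return num_samples * log2n;
}

// Mask covering the bottom n * log2(n) bits. The 64-bit case is handled
// separately because shifting a 64-bit value by 64 is undefined.
static uint64_t mcs_valid_mask(unsigned num_samples)
{
   const unsigned bits = mcs_valid_bits(num_samples);
   return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

// True when all valid bits are zero, i.e. every sample points at slice 0.
// Anything above the mask is treated as potentially bogus and ignored.
static bool samples_identical(uint64_t mcs, unsigned num_samples)
{
   return (mcs & mcs_valid_mask(num_samples)) == 0;
}
```

With this, a 2x value that has garbage in the upper bits but zeros in the
valid ones still reports identical samples, instead of producing a false
negative the way a plain compare against zero would.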
> OK, enough hardware speculation for one night...
> --Jason
>
>> One other thought, 16x MSAA will break all this because it gives you an
>> ivec4 value from the MCS (if I remember correctly). Not sure if we've
>> landed 16x MSAA yet though.
>> --Jason
>>
>> >> + }
>> >> +
>> >> + this->result = dst;
>> >> + return;
>> >> + }
>> >> +
>> >> if (coordinate.file != BAD_FILE) {
>> >> /* FINISHME: Texture coordinate rescaling doesn't work with
>> >> non-constant
>> >> * samplers. This should only be a problem with GL_CLAMP on
>> >> Gen7.
>> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> >> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> >> index 3c2674d..41c3c10 100644
>> >> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> >> @@ -1615,6 +1615,7 @@ vec4_visitor::nir_emit_texture(nir_tex_instr
>> >> *instr)
>> >> switch (instr->op) {
>> >> case nir_texop_txf:
>> >> case nir_texop_txf_ms:
>> >> + case nir_texop_samples_identical:
>> >> coordinate = get_nir_src(instr->src[i].src,
>> >> BRW_REGISTER_TYPE_D,
>> >> src_size);
>> >> coord_type = glsl_type::ivec(src_size);
>> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> >> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> >> index fda3d7c..2190a86 100644
>> >> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> >> @@ -909,6 +909,17 @@ vec4_visitor::emit_texture(ir_texture_opcode op,
>> >> unreachable("TXB is not valid for vertex shaders.");
>> >> case ir_lod:
>> >> unreachable("LOD is not valid for vertex shaders.");
>> >> + case ir_samples_identical: {
>> >> + if (mcs.file == BRW_IMMEDIATE_VALUE) {
>> >> + const src_reg temp = src_reg(this, glsl_type::uint_type);
>> >> +
>> >> + emit(MOV(dst_reg(temp), mcs));
>> >> + emit(CMP(dest, temp, src_reg(0u), BRW_CONDITIONAL_EQ));
>> >
>> > Ditto.
>> >
>> > bld.MOV(dst, src_reg(mcs.ud == 0u ? ~0u : 0u));
>> >
>> >> + } else {
>> >> + emit(CMP(dest, mcs, src_reg(0u), BRW_CONDITIONAL_EQ));
>> >> + }
>> >> + return;
>> >> + }
>> >> default:
>> >> unreachable("Unrecognized tex op");
>> >> }
>> >> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
>> >> b/src/mesa/drivers/dri/i965/intel_extensions.c
>> >> index 386b63c..2e2459c 100644
>> >> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
>> >> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
>> >> @@ -333,6 +333,7 @@ intelInitExtensions(struct gl_context *ctx)
>> >> ctx->Extensions.ARB_texture_compression_bptc = true;
>> >> ctx->Extensions.ARB_texture_view = true;
>> >> ctx->Extensions.ARB_shader_storage_buffer_object = true;
>> >> + ctx->Extensions.EXT_shader_samples_identical = true;
>> >>
>> >> if (can_do_pipelined_register_writes(brw)) {
>> >> ctx->Extensions.ARB_draw_indirect = true;
>> >>
>> >