[Mesa-stable] [Mesa-dev] [PATCH] intel/fs: Use a pure vertical stride for large register strides
Andres Gomez
agomez at igalia.com
Sat Nov 11 01:06:37 UTC 2017
On Fri, 2017-11-10 at 16:26 -0800, Jason Ekstrand wrote:
> On Fri, Nov 10, 2017 at 4:12 PM, Andres Gomez <agomez at igalia.com>
> wrote:
> > Jason, taking this into account, I'll leave this patch out of 17.2
> > as long as we don't have another one that fixes this regression (?)
>
> This patch doesn't regress anything, it just isn't sufficient to fix
> the bug on little-core.
OK, so I clearly misunderstood Matt's message.
Thanks for the prompt answer! ☺
>
> --Jason
>
> > I noticed that the patch bisected by Mark is a different one so I'm
> > not sure I'm understanding the status, though.
> >
> > Let me know what you think.
> >
> > On Thu, 2017-11-09 at 17:01 -0800, Jason Ekstrand wrote:
> > > On Thu, Nov 9, 2017 at 2:23 PM, Matt Turner <mattst88 at gmail.com>
> > > wrote:
> > > > On Thu, Nov 2, 2017 at 3:54 PM, Jason Ekstrand
> > > > <jason at jlekstrand.net> wrote:
> > > > > Register strides higher than 4 are uncommon but they can happen.
> > > > > For instance, if you have a 64-bit extract_u8 operation, we turn
> > > > > that into UB -> UQ MOV with a source stride of 8. Our previous
> > > > > calculation would try to generate a stride of <32;8,8>:ub which
> > > > > is invalid because the maximum horizontal stride is 4. To solve
> > > > > this problem, we instead use a stride of <8;1,0>. As noted in
> > > > > the comment, this does not work as a destination but that's ok
> > > > > as very few things actually generate that stride.
> > > >
> > > > Please put the tests you fixed in the commit message. It's not
> > > > okay to leave that out for all the reasons that I'm sure you know.
> > >
> > > I didn't because the test passes before and after the patch. I guess
> > > I could have included that information though.
> > >
> > > > Looks like this doesn't work on CHV, BXT, GLK :(
> > > >
> > > > KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks now fails on
> > > > CHV, BXT, GLK with:
> > > >
> > > > mov(8) g21<1>UQ g19<8,1,0>UB { align1 1Q };
> > > > ERROR: Source and destination horizontal stride must equal and
> > > > a multiple of a qword when the execution type is 64-bit
> > > > ERROR: Vstride must be Width * Hstride when the execution type
> > > > is 64-bit
> > > >
> > > > Modulo the typo in the first error, I think both of these are
> > > > correct. I don't think we can extract_u8 to a 64-bit type on
> > > > Atom :(
> > >
> > > That's unfortunate... Quickly racking my brain, I don't see a slick
> > > way to implement that opcode. How would you feel about some late
> > > opt_algebraic lowering?
> > >
> > > > This is filed as
> > > > https://bugs.freedesktop.org/show_bug.cgi?id=103628
> > >
> > > _______________________________________________
> > > mesa-stable mailing list
> > > mesa-stable at lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-stable
> > --
> > Br,
> >
> > Andres
--
Br,
Andres
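
For readers following the stride discussion quoted above, here is a
rough sketch of the region arithmetic. It is not the Mesa code: the
function names are hypothetical, and it assumes the usual reading of a
<vstride;width,hstride> region, where channel ch reads element
(ch / width) * vstride + (ch % width) * hstride. The limit of 4 on the
horizontal stride and the two 64-bit checks are taken from the patch
description and the validator errors in the thread; the checks are a
simplified model of those messages, not the real validator.

```python
MAX_HSTRIDE = 4  # largest horizontal stride, per the patch description


def channel_offsets(vstride, width, hstride, exec_size):
    """Element offset read by each channel of a <vstride;width,hstride> region."""
    return [(ch // width) * vstride + (ch % width) * hstride
            for ch in range(exec_size)]


def source_region(stride, width):
    """Pick (vstride, width, hstride) for a source with the given element stride.

    Hypothetical helper: falls back to a pure vertical stride when the
    stride cannot be expressed as a horizontal stride.
    """
    if stride > MAX_HSTRIDE:
        # One element per row, rows `stride` elements apart.
        return (stride, 1, 0)
    return (width * stride, width, stride)


def atom_64bit_errors(src_region, src_size, dst_hstride, dst_size):
    """Simplified model of the two errors quoted for CHV, BXT and GLK.

    Sizes are in bytes per element. Returns the names of the rules broken.
    """
    vstride, width, hstride = src_region
    errors = []
    src_bytes = hstride * src_size
    dst_bytes = dst_hstride * dst_size
    # Horizontal strides must match and be a multiple of a qword (8 bytes).
    if src_bytes != dst_bytes or src_bytes % 8 != 0:
        errors.append("hstride")
    # Vstride must be Width * Hstride.
    if vstride != width * hstride:
        errors.append("vstride")
    return errors


# The UB -> UQ MOV from the thread: source stride of 8, SIMD8.
region = source_region(8, 8)
print(region)
print(channel_offsets(*region, exec_size=8))
print(atom_64bit_errors(region, src_size=1, dst_hstride=1, dst_size=8))
```

Under these assumptions the <8;1,0> region reads the same elements that
an hstride of 8 would have read, which is why it works as a source, and
it trips both of the 64-bit checks, which matches the two errors
reported for the mov(8) above.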