[PATCH i-g-t] lib/rendercopy: Fix Xe2 pixel shader

Zbigniew Kempczyński zbigniew.kempczynski at intel.com
Fri Mar 1 06:55:05 UTC 2024


On Thu, Feb 29, 2024 at 07:06:43PM +0200, Juha-Pekka Heikkila wrote:
> I spent all evening reading the spec and comparing it with the code, but I
> couldn't find the place in the spec that points to this change either. I
> suspect it relates to the SIMD mode change compared to the older shaders.
> I agree with the change since the code works, but I'll still try to figure
> out where this is coming from.

Lionel suggested in an offline discussion that there's an alternative:
switching from barycentric to pixel positions (subspans are passed
in r1, or in r0.10-13 on Xe2). In the end it was faster for me to dump
and examine the r6 content. Compared to Xe and previous platforms,
something changed in how the VUE is unpacked to r6.

Right now I'm preparing a series with xe_render_copy which exercises
copying using different positions, so detecting that a shader is not
fully correct should be easier.

Thank you for the review.

--
Zbigniew

> 
> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila at gmail.com>
> 
> On 26.2.2024 13.45, Zbigniew Kempczyński wrote:
> > On Xe2 the start and coefficient for barycentric positions are passed
> > in a different location of the grf register (r6). I wasn't able to find
> > this information explicitly in the documentation, so I dumped
> > all registers involved in the operation and deduced the meaning of the
> > r6 values. This means I may be wrong, but tests which were previously
> > failing on render-copy are now working fine.
> > 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski at intel.com>
> > Cc: Juha-Pekka Heikkila <juhapekka.heikkila at gmail.com>
> > Cc: Swati Sharma <swati2.sharma at intel.com>
> > ---
> >   lib/i915/shaders/ps/gen20_render_copy.asm | 4 ++--
> >   lib/rendercopy_gen9.c                     | 4 ++--
> >   2 files changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/lib/i915/shaders/ps/gen20_render_copy.asm b/lib/i915/shaders/ps/gen20_render_copy.asm
> > index 330417966d..48057f441e 100644
> > --- a/lib/i915/shaders/ps/gen20_render_copy.asm
> > +++ b/lib/i915/shaders/ps/gen20_render_copy.asm
> > @@ -1,7 +1,7 @@
> >   L0:
> > -(W)     mad (16|M0)              acc0.0<1>:f   r6.3<0;0>:f      r1.0<1;1>:f       r6.0<0>:f
> > +(W)     mad (16|M0)              acc0.0<1>:f   r6.0<0;0>:f      r1.0<1;1>:f       r6.6<0>:f
> >   (W)     mad (16|M0)              r113.0<1>:f   acc0.0<1;1>:f    r1.0<1;1>:f       r6.1<0>:f
> > -(W)     mad (16|M0)              acc0.0<1>:f   r6.7<0;0>:f      r1.0<1;1>:f       r6.4<0>:f
> > +(W)     mad (16|M0)              acc0.0<1>:f   r6.3<0;0>:f      r1.0<1;1>:f       r6.4<0>:f
> >   (W)     mad (16|M0)              r114.0<1>:f   acc0.0<1;1>:f    r2.0<1;1>:f       r6.5<0>:f
> >   (W)     send.smpl (16|M0)        r12      r113  null:0  0x0            0x04420001           {F@1,$0} // wr:2+0, rd:4; simd16 sample:u+v+r+ai+mlod using sampler index 0
> >   (W)     send.rc (16|M0)          null     r12   null:0  0x0            0x08031400           {EOT,$0} // wr:4+0, rd:0; full-precision render target write SIMD16; last render target to surface 0
> > diff --git a/lib/rendercopy_gen9.c b/lib/rendercopy_gen9.c
> > index a4220d78da..dd5f1dd448 100644
> > --- a/lib/rendercopy_gen9.c
> > +++ b/lib/rendercopy_gen9.c
> > @@ -138,9 +138,9 @@ static const uint32_t gen12p71_render_copy[][4] = {
> >   };
> >   static const uint32_t xe2_render_copy[][4] = {
> > -	{ 0x8010005b, 0x200002a0, 0x020a0634, 0x06040105 },
> > +	{ 0x8010005b, 0x200002a0, 0x020a0604, 0x06640105 },
> >   	{ 0x8010005b, 0x710402a8, 0x020a2001, 0x06140105 },
> > -	{ 0x8010005b, 0x200002a0, 0x020a0674, 0x06440105 },
> > +	{ 0x8010005b, 0x200002a0, 0x020a0634, 0x06440105 },
> >   	{ 0x8010005b, 0x720402a8, 0x020a2001, 0x06540205 },
> >   	{ 0x80122031, 0x0c240000, 0x20027114, 0x00800000 },
> >   	{ 0x8010c031, 0x00000004, 0x58000c24, 0x00c40000 },
> 
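
For anyone cross-checking the binary in rendercopy_gen9.c against the asm
change, here is a minimal sketch that XORs the old and new encodings of the
first four instructions. The dword values are copied from the diff above; the
array names xe2_old/xe2_new and both helpers are hypothetical and are not part
of IGT.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* First four rows of xe2_render_copy before the patch (from the diff). */
static const uint32_t xe2_old[4][4] = {
	{ 0x8010005b, 0x200002a0, 0x020a0634, 0x06040105 },
	{ 0x8010005b, 0x710402a8, 0x020a2001, 0x06140105 },
	{ 0x8010005b, 0x200002a0, 0x020a0674, 0x06440105 },
	{ 0x8010005b, 0x720402a8, 0x020a2001, 0x06540205 },
};

/* The same rows after the patch (from the diff). */
static const uint32_t xe2_new[4][4] = {
	{ 0x8010005b, 0x200002a0, 0x020a0604, 0x06640105 },
	{ 0x8010005b, 0x710402a8, 0x020a2001, 0x06140105 },
	{ 0x8010005b, 0x200002a0, 0x020a0634, 0x06440105 },
	{ 0x8010005b, 0x720402a8, 0x020a2001, 0x06540205 },
};

/* Hypothetical helper: bits that differ in one dword of one instruction. */
static uint32_t dword_diff(int row, int dw)
{
	assert(row >= 0 && row < 4 && dw >= 0 && dw < 4);
	return xe2_old[row][dw] ^ xe2_new[row][dw];
}

/* Hypothetical helper: print every changed dword, return how many changed. */
static int print_changed_dwords(void)
{
	int changed = 0;

	for (int row = 0; row < 4; row++) {
		for (int dw = 0; dw < 4; dw++) {
			uint32_t x = dword_diff(row, dw);

			if (!x)
				continue;
			printf("row %d dw %d: 0x%08" PRIx32 " -> 0x%08" PRIx32
			       " (xor 0x%08" PRIx32 ")\n",
			       row, dw, xe2_old[row][dw], xe2_new[row][dw], x);
			changed++;
		}
	}
	return changed;
}
```

Calling print_changed_dwords() from a main() reports three changed dwords,
with XOR masks 0x00000030 and 0x00600000 in the first mad and 0x00000040 in
the third. That is consistent with the asm changes (r6.3 -> r6.0, r6.0 -> r6.6
and r6.7 -> r6.3) if the subregister number sits at bit 4 of dword 2 and at
bit 20 of dword 3. That field placement is read off the diff itself, not taken
from the ISA documentation.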

