[Beignet] [PATCH 3/5] add third coord in backend
Zhigang Gong
zhigang.gong at linux.intel.com
Wed May 8 00:33:45 PDT 2013
> -----Original Message-----
> From:
> beignet-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.org
> [mailto:beignet-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.
> org] On Behalf Of Xing, Homer
> Sent: Wednesday, May 08, 2013 2:00 PM
> To: beignet at lists.freedesktop.org
> Subject: Re: [Beignet] [PATCH 3/5] add third coord in backend
>
> > If this is a 2D surface, then we'd better use another way to identify it
> > rather than allocate a register for the nonexistent coord. It wastes one
> > register and it will generate useless instructions in both Sampler and
> > TypedWrite. Any thoughts?
>
> Hi, do you remember that last week we discussed the problem that
> TypedWriteInstruction only supports two coords, but I want
> TypedWriteInstruction to express both the two-coord case and the
> three-coord case?
> In the end we agreed we'd better reuse TypedWriteInstruction.
>
> Letting the third coord be nonexistent in the 2D case reuses
> TypedWriteInstruction for both the 2D and 3D cases.
> Otherwise I am afraid I would have to create TypedWriteInstruction3D for
> the 3D case, and leave TypedWriteInstruction for the 2D case.
[Gong, Zhigang] No, you don't have to create new functions. Just add one new
parameter to specify the coords' dimension in the IR layer, and specify the
type in extra.function in the Gen instruction selection layer. That should be
good enough.
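
A minimal sketch of that idea follows. All names here (TypedWriteSketch,
makeTypedWrite, selectHeaderMovs) are hypothetical and only illustrate the
shape of the change; they are not the actual Beignet IR or instruction
selection classes, and the printed "mov" strings are placeholders rather
than real Gen assembly:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR-level typed-write instruction.
// Illustrative only; not the real Beignet TypedWriteInstruction.
struct TypedWriteSketch {
  uint8_t coordDim;            // number of valid coordinates: 2 or 3
  std::vector<int> coordRegs;  // virtual registers holding u, v (and w)
};

// Build the instruction, keeping only as many coordinate registers
// as the declared dimension requires.
inline TypedWriteSketch makeTypedWrite(uint8_t coordDim,
                                       const std::vector<int> &coords) {
  assert(coordDim == 2 || coordDim == 3);
  assert(coords.size() >= coordDim);
  return TypedWriteSketch{
      coordDim, std::vector<int>(coords.begin(), coords.begin() + coordDim)};
}

// Selection-layer sketch: emit one header mov per existing coordinate,
// so a 2D write never produces a mov for a w coord.
inline std::vector<std::string> selectHeaderMovs(const TypedWriteSketch &insn) {
  static const char *names[] = {"u", "v", "w"};
  std::vector<std::string> movs;
  for (uint8_t i = 0; i < insn.coordDim; ++i)
    movs.push_back(std::string("mov header.") + names[i] + ", %" +
                   std::to_string(insn.coordRegs[i]));
  return movs;
}
```

With this shape, a single instruction type covers both cases: a 2D write
built with coordDim 2 yields two header movs, and a 3D write yields three.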
>
> On the other hand, the current solution does not hurt performance. The
[Gong, Zhigang] It generates one useless instruction, so it does hurt
performance, although only slightly. And it's very easy to fix, so why don't
you want to do that?
> wasted register exists only in the compile phase, not in the Gen7 assembly code.
[Gong, Zhigang] I rechecked the code, and you are right on this point: as you
use a hard-coded virtual register, it will not allocate a new register. It
just borrows one from another place (actually, the special register for the
local id). Besides generating one useless instruction, this also makes the
output instructions more confusing. Please see the following example:
C source code:
__kernel void
test_copy_image(__read_only image2d_t src, __write_only image2d_t dst,
                sampler_t sampler)
{
  int2 coord;
  int4 color;
  coord.x = (int)get_global_id(0);
  coord.y = (int)get_global_id(1);
  color = read_imagei(src, sampler, coord);
  write_imagei(dst, coord, color);
}
Register allocation:
%0 g4.0D
%1 g6.0D
%2 g8.0D
%3 g0.1D
%4 g0.6D
Assembly output:
cmp.le(16) null g1<8,8,1>UW 0x0000UW { align1 WE_normal 1H };
(+f0) mov(16) g1<1>UW 0x0001UW { align1 WE_normal 1H };
(+f0) mul(16) g126<1>D g0.6<0,1,0>D g2.4<0,1,0>UD { align1 WE_normal 1H };
(+f0) mul(16) g122<1>D g0.1<0,1,0>D g2.3<0,1,0>UD { align1 WE_normal 1H };
(+f0) add(16) g124<1>D g126<8,8,1>D g6<8,8,1>D { align1 WE_normal 1H };
(+f0) add(16) g120<1>D g122<8,8,1>D g4<8,8,1>D { align1 WE_normal 1H }; # g4 is allocated for the local id 0
(+f0) mov(1) a0<1>UD 0x08840000UD { align1 WE_normal };
(+f0) shl(1) g112.4<1>UD g2.2<0,1,0>UD 0x00000008UD { align1 WE_normal };
(+f0) or(1) a0<1>UD a0<0,1,0>UD g112.4<0,1,0>UD { align1 WE_normal };
(+f0) or(1) a0<1>UD a0<0,1,0>UD g2<0,1,0>UD { align1 WE_normal };
(+f0) mov(16) g112<1>F g120<8,8,1>UD { align1 WE_normal 1H };
(+f0) mov(16) g114<1>F g124<8,8,1>UD { align1 WE_normal 1H };
(+f0) mov(16) g116<1>F g4<8,8,1>UD { align1 WE_normal 1H }; # you borrow g4 here, and use it as the w coord and move it to the message header, this instruction should be removed.
(+f0) send(16) g104<1>UD g112<8,8,1>F
Any thoughts?
>
> Considering performance and reuse, I think the current solution is good.
>
> Homer
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet