[Beignet] [PATCH 3/3] GBE: Optimize byte/short load/store using untyped read/write
Song, Ruiling
ruiling.song at intel.com
Thu Mar 6 20:52:08 PST 2014
BTW: This patch handles vector load/store of byte/short. Can we also use untyped read/write to optimize scalar char/short load/store?
[ruiling]: Since this needs very careful and tedious handling of address alignment, I need to consider it further.
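To illustrate the alignment concern for the scalar case, here is a minimal host-side sketch, not code from the patch. It assumes a little-endian byte order within a DWORD; the helper name loadByteViaDword and the plain byte array standing in for memory are hypothetical. It reads one byte using only a 4-byte-aligned DWORD access:

```cpp
#include <cstdint>

// Hypothetical helper (not part of the patch): fetch the byte at byte
// address `addr` from `mem` by reading only the enclosing aligned DWORD.
static inline uint8_t loadByteViaDword(const uint8_t *mem, uint32_t addr) {
  const uint32_t aligned = addr & ~3u;      // round down to a DWORD boundary
  const uint32_t shift = (addr & 3u) * 8u;  // bit offset of the byte in the DWORD

  // Stand-in for the DWORD (untyped) read: assemble the value in
  // little-endian order so the sketch does not depend on host endianness.
  const uint32_t dword = static_cast<uint32_t>(mem[aligned]) |
                         static_cast<uint32_t>(mem[aligned + 1]) << 8 |
                         static_cast<uint32_t>(mem[aligned + 2]) << 16 |
                         static_cast<uint32_t>(mem[aligned + 3]) << 24;

  return static_cast<uint8_t>(dword >> shift);
}
```

A scalar store done this way would additionally have to merge the new byte into the neighbouring bytes of the same DWORD, which is part of what makes the scalar case awkward.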
+ // split a DWORD register into unpacked Byte or Short register
+ static INLINE GenRegister splitReg(GenRegister reg, uint32_t count, uint32_t sub_part) {
+ GenRegister r = reg;
+ GBE_ASSERT(count == 4 || count == 2);
+ if(reg.hstride != GEN_HORIZONTAL_STRIDE_0) {
+ r.hstride = count == 4 ? GEN_HORIZONTAL_STRIDE_4 :
+ GEN_HORIZONTAL_STRIDE_2;
>>>>>>>>>Do you assume reg.hstride is GEN_HORIZONTAL_STRIDE_1 here? What about the cases where reg.hstride is GEN_HORIZONTAL_STRIDE_2 or GEN_HORIZONTAL_STRIDE_4?
[ruiling]: You are right. As splitReg does not handle all combinations of register settings, I will add some asserts to prevent misuse.
+ }
+ if(count == 4) {
+ r.type = reg.type == GEN_TYPE_UD ? GEN_TYPE_UB : GEN_TYPE_B;
+ r.vstride = GEN_VERTICAL_STRIDE_32;
+ } else {
+ r.type = reg.type == GEN_TYPE_UD ? GEN_TYPE_UW : GEN_TYPE_W;
+ r.vstride = GEN_VERTICAL_STRIDE_16;
+ }
+
+ r.subnr += sub_part*typeSize(r.type);
+ r.nr += r.subnr / 32;
+ r.subnr %= 32;
+
>>>>>>>>>>>>If reg.hstride is GEN_HORIZONTAL_STRIDE_0, r.nr and r.subnr should not be changed here.
[ruiling]: Here I want to get the byte sub-register. For example, one dword register is composed of [B0 B1 B2 B3], and sub_part in [0-3] means I want B[0-3], so subnr needs to change according to sub_part even if the stride is GEN_HORIZONTAL_STRIDE_0.
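The offset arithmetic described above can be sketched in isolation as follows. This is a simplified illustration, not the patch itself: SubRegPos is a hypothetical stand-in for the nr/subnr fields of GenRegister, subPartPos is a hypothetical name, and the 32-byte register size is taken from the "/ 32" in the quoted code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the nr/subnr fields of GenRegister.
struct SubRegPos {
  uint32_t nr;     // register number
  uint32_t subnr;  // byte offset inside the 32-byte register
};

// Position of sub-part `subPart` of a DWORD element that starts at `base`.
// `elemSize` is 1 when splitting into bytes and 2 when splitting into shorts.
static inline SubRegPos subPartPos(SubRegPos base, uint32_t subPart,
                                   uint32_t elemSize) {
  assert(elemSize == 1 || elemSize == 2);
  assert(subPart < 4 / elemSize);  // a DWORD holds 4 bytes or 2 shorts

  SubRegPos r = base;
  r.subnr += subPart * elemSize;   // step to the requested byte/short
  r.nr += r.subnr / 32;            // carry into the next register if needed
  r.subnr %= 32;
  return r;
}
```

For example, starting at byte offset 0 of register 10, byte sub-part 3 lands at nr 10, subnr 3. The shift is applied regardless of the horizontal stride, since it selects which byte of the DWORD is addressed, not how elements are stepped.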
_______________________________________________
Beignet mailing list
Beignet at lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet