[Beignet] [PATCH 3/3] GBE: Optimize byte/short load/store using untyped read/write

Song, Ruiling ruiling.song at intel.com
Thu Mar 6 20:52:08 PST 2014


BTW: This patch covers vector load/store of byte/short. Can we also use untyped read/write to optimize scalar char/short load/store?
[ruiling]: As this needs very careful and annoying address alignment handling, I need to consider it further.

+    // split a DWORD register into unpacked Byte or Short register
+    static INLINE GenRegister splitReg(GenRegister reg, uint32_t count, uint32_t sub_part) {
+      GenRegister r = reg;
+      GBE_ASSERT(count == 4 || count == 2);
+      if(reg.hstride != GEN_HORIZONTAL_STRIDE_0) {
+        r.hstride = count == 4 ? GEN_HORIZONTAL_STRIDE_4 : GEN_HORIZONTAL_STRIDE_2;

>>>>>>>>>Do you assume reg.hstride is GEN_HORIZONTAL_STRIDE_1 here? What about the cases where reg.hstride is GEN_HORIZONTAL_STRIDE_2 or GEN_HORIZONTAL_STRIDE_4?
[ruiling]: You are right. As splitReg does not consider all combinations of register settings, I will add some asserts to prevent misuse.
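
One way such a guard could look, as a standalone sketch rather than the actual follow-up patch. The HStride enum and the splitStride helper are hypothetical stand-ins for the GEN_HORIZONTAL_STRIDE_* values and the stride logic in splitReg; the assumption is that only stride 0 and stride 1 inputs are meant to be supported.

```cpp
#include <cassert>

// Hypothetical encoding of horizontal strides; the real
// GEN_HORIZONTAL_STRIDE_* constants are defined in Beignet's headers.
enum HStride { HSTRIDE_0, HSTRIDE_1, HSTRIDE_2, HSTRIDE_4 };

// Returns the horizontal stride to use after splitting a DWORD register
// into `count` byte (count == 4) or short (count == 2) parts.
static HStride splitStride(HStride in, unsigned count) {
  assert(count == 4 || count == 2);
  // Assumed guard: reject inputs the split logic was not written for.
  assert(in == HSTRIDE_0 || in == HSTRIDE_1);
  if (in == HSTRIDE_0)
    return HSTRIDE_0;  // a uniform (scalar) region stays uniform
  return count == 4 ? HSTRIDE_4 : HSTRIDE_2;
}
```

With this guard, a caller passing a stride-2 or stride-4 register fails in a debug build instead of silently getting a wrong region.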


+      }
+      if(count == 4) {
+        r.type = reg.type == GEN_TYPE_UD ? GEN_TYPE_UB : GEN_TYPE_B;
+        r.vstride = GEN_VERTICAL_STRIDE_32;
+      } else {
+        r.type = reg.type == GEN_TYPE_UD ? GEN_TYPE_UW : GEN_TYPE_W;
+        r.vstride = GEN_VERTICAL_STRIDE_16;
+      }
+

+      r.subnr += sub_part*typeSize(r.type);
+      r.nr += r.subnr / 32;
+      r.subnr %= 32;
+
>>>>>>>>>>>>If reg.hstride is GEN_HORIZONTAL_STRIDE_0, r.nr and r.subnr should not be changed here.
[ruiling]: Here I want to select a sub-part of the register. For example, one dword register is composed of [B0 B1 B2 B3], and sub_part in [0-3] selects B0 to B3, so subnr needs to change according to sub_part even if the stride is GEN_HORIZONTAL_STRIDE_0.
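
The sub-part addressing described above can be illustrated with a small standalone sketch. It does not use Beignet's GenRegister: MiniReg and subPart are hypothetical names, and the 32-byte register size is taken from the `/ 32` and `% 32` in the patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for the nr/subnr fields of a register;
// this is not Beignet's GenRegister.
struct MiniReg {
  uint32_t nr;     // register number
  uint32_t subnr;  // byte offset inside the register
};

// Mirrors the three address lines of the patch: advance by sub_part
// elements of elemSize bytes, then carry any overflow into nr.
static MiniReg subPart(MiniReg reg, uint32_t elemSize, uint32_t sub_part) {
  const uint32_t kRegBytes = 32;  // register size assumed from the patch
  MiniReg r = reg;
  r.subnr += sub_part * elemSize;
  r.nr += r.subnr / kRegBytes;
  r.subnr %= kRegBytes;
  return r;
}
```

For a dword at r10.28 split into bytes (elemSize 1), sub_part 0 to 3 gives r10.28, r10.29, r10.30 and r10.31. The offset depends only on sub_part and the element size, which is why it is applied regardless of the horizontal stride.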


_______________________________________________
Beignet mailing list
Beignet at lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet