[Beignet] [V2 PATCH 0/8] Implement double division on BDW

Fri Sep 18 02:58:11 PDT 2015

From: Junyan He <junyan.he at linux.intel.com>

We use the macro:
r0 = 0, r6 = a, r7 = b, r1 = 1

math.eo.f0.0 (4) r8.acc2 r6.noacc r7.noacc 0xE
(-f0.0) if
madm (4) r9.acc3 r0.noacc r6.noacc r8.acc2       // Step(1), q0=a*y0
madm (4) r10.acc4 r1.noacc -r7.noacc r8.acc2     // Step(2), e0=(1-b*y0)
madm (4) r11.acc5 r6.noacc -r7.noacc r9.acc3     // Step(3), r0=a-b*q0
madm (4) r12.acc6 r8.acc2 r10.acc4 r8.acc2       // Step(4), y1=y0+e0*y0
madm (4) r13.acc7 r1.noacc -r7.noacc r12.acc6    // Step(5), e1=(1-b*y1)
madm (4) r8.acc8 r8.acc2 r10.acc4 r12.acc6       // Step(6), y2=y0+e0*y1
madm (4) r9.acc9 r9.acc3 r11.acc5 r12.acc6       // Step(7), q1=q0+r0*y1
madm (4) r12.acc2 r12.acc6 r8.acc8 r13.acc7      // Step(8), y3=y1+e1*y2
madm (4) r11.acc3 r6.noacc -r7.noacc r9.acc9     // Step(9), r1=a-b*q1

madm (4) r8.noacc r9.acc9 r11.acc3 r12.acc2      // Step(10), q=q1+r1*y3
endif

to implement high-precision double division on BDW.

V2:
1. Correct spelling mistakes.
2. Fix some bugs in the double register format.
3. Rework the double-handling logic and remove double support on pre-gen7.
4. Declare fp64 extension support on BDW.
5. Handle the uniform case for F64DIV.

With this patch set, double +, -, * and / basically work on the BDW platform.
Pre-gen7 platforms no longer support double.
Conversion and bitcast between double and other types do not work yet.

Signed-off-by: Junyan He <junyan.he at linux.intel.com>
---
backend/src/backend/gen/gen_mesa_disasm.c           | 134 ++++++++++++++++++++----
backend/src/backend/gen75_encoder.hpp               |   4 -
backend/src/backend/gen7_encoder.hpp                |   4 -
backend/src/backend/gen8_context.cpp                | 145 ++++++++++++++++++++++++++
backend/src/backend/gen8_context.hpp                |   2 +
backend/src/backend/gen8_encoder.cpp                | 164 +++++++++++++++++++++++++++++-
backend/src/backend/gen8_encoder.hpp                |  12 ++-
backend/src/backend/gen8_instruction.hpp            |  86 ++++++++++++++++
backend/src/backend/gen_context.cpp                 |   4 +
backend/src/backend/gen_context.hpp                 |   1 +
backend/src/backend/gen_defs.hpp                    |  13 +++
backend/src/backend/gen_encoder.cpp                 |  52 ++--------
backend/src/backend/gen_encoder.hpp                 |   3 +-
backend/src/backend/gen_insn_gen7_schedule_info.hxx |   1 +
backend/src/backend/gen_insn_selection.cpp          |  54 +++++++++-
backend/src/backend/gen_insn_selection.hxx          |   1 +
backend/src/backend/gen_register.hpp                |   6 +-
kernels/compiler_double_4.cl                        |   5 -
kernels/compiler_double_div.cl                      |  11 ++
src/cl_device_id.c                                  |   3 +
src/cl_extensions.c                                 |  21 ++++
src/cl_extensions.h                                 |   2 +
utests/CMakeLists.txt                               |   2 +
utests/compiler_double.cpp                          |   5 +-
utests/compiler_double_4.cpp                        |  40 --------
utests/compiler_double_div.cpp                      |  80 +++++++++++++++
utests/utest_helper.cpp                             |  19 ++++
utests/utest_helper.hpp                             |   3 +