[Beignet] [PATCH 00/18] Enable profiling by line number.
He Junyan
junyan.he at inbox.com
Wed Jan 27 21:18:56 PST 2016
Ping for review
On Thu, Dec 24, 2015 at 07:01:52PM +0800, junyan.he at inbox.com wrote:
> Date: Thu, 24 Dec 2015 19:01:52 +0800
> From: junyan.he at inbox.com
> To: beignet at lists.freedesktop.org
> Subject: [PATCH 00/18] Enable profiling by line number.
> X-Mailer: git-send-email 1.7.9.5
>
> From: Junyan He <junyan.he at linux.intel.com>
>
> This patch set lets the user specify the source-code line numbers at which
> profiling watch points are inserted.
> As a first step, the env var OCL_PROFILING_LINES controls the kernel name,
> the profiling mode and the line numbers. The format is:
> KERNEL_NAME:PROFILING_MODE:LINE_NUMBER0,LINE_NUMBER1,LINE_NUMBER2,...
> for example:
> export OCL_PROFILING_LINES="builtin_atanpi_float8:2:2,6,7,8,15"
> will insert watch points at lines 2, 6, 7, 8 and 15 of the kernel named builtin_atanpi_float8.
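
To make the format above concrete, here is a minimal Python sketch that splits such a value into its three fields. It only illustrates the string layout described in the cover letter. It is not the parser from the patch set (which lives in the C++ backend and may behave differently), and the names ProfilingSpec and parse_profiling_lines are made up for this example.

```python
from typing import List, NamedTuple


class ProfilingSpec(NamedTuple):
    """Hypothetical container for one OCL_PROFILING_LINES value."""
    kernel_name: str
    mode: int
    lines: List[int]


def parse_profiling_lines(value: str) -> ProfilingSpec:
    # Layout from the cover letter: KERNEL_NAME:PROFILING_MODE:LINE0,LINE1,...
    # Splitting from the right keeps only the last two colons as separators.
    kernel_name, mode_text, lines_text = value.rsplit(":", 2)

    mode = int(mode_text)
    if mode not in (1, 2, 3):
        # The cover letter describes exactly three levels.
        raise ValueError("PROFILING_MODE must be 1, 2 or 3")

    lines = [int(tok) for tok in lines_text.split(",") if tok.strip()]
    if not kernel_name or not lines:
        raise ValueError("kernel name and at least one line number are required")

    return ProfilingSpec(kernel_name, mode, lines)


spec = parse_profiling_lines("builtin_atanpi_float8:2:2,6,7,8,15")
print(spec.kernel_name, spec.mode, spec.lines)
```

How the real implementation treats malformed values is not stated in the cover letter; raising ValueError here is purely a choice of this sketch.
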
> There are 3 PROFILING_MODE levels:
> level 1: just a brief timestamp for each line number.
> Total log number is 6
> Line 2: Timestamp: 190 Thread Exec:6
> Line 6: Timestamp: 1174 Thread Exec:6
> Line 7: Timestamp: 3092 Thread Exec:6
> Line 8: Timestamp: 3105 Thread Exec:6
> Line 15: Timestamp: 3241 Thread Exec:6
>
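
One way to read the level 1 output is to look at the difference between consecutive watch points. The Python sketch below does that for the sample log quoted above. It assumes the log lines look exactly like the example; the timestamp unit is not stated in the cover letter, so the deltas are in whatever unit the timestamps use. The name timestamp_deltas is made up for this illustration.

```python
import re

# Sample copied from the level 1 output quoted above.
LEVEL1_LOG = """\
Line 2: Timestamp: 190 Thread Exec:6
Line 6: Timestamp: 1174 Thread Exec:6
Line 7: Timestamp: 3092 Thread Exec:6
Line 8: Timestamp: 3105 Thread Exec:6
Line 15: Timestamp: 3241 Thread Exec:6
"""

# Assumed line shape, taken from the example only.
LINE_RE = re.compile(r"Line\s+(\d+):\s*Timestamp:\s*(\d+)\s+Thread Exec:\s*(\d+)")


def timestamp_deltas(log):
    """Return (from_line, to_line, delta) for each pair of consecutive watch points."""
    rows = [(int(line), int(ts)) for line, ts, _ in LINE_RE.findall(log)]
    return [
        (prev_line, line, ts - prev_ts)
        for (prev_line, prev_ts), (line, ts) in zip(rows, rows[1:])
    ]


deltas = timestamp_deltas(LEVEL1_LOG)
for from_line, to_line, delta in deltas:
    print("line %d -> line %d: %d" % (from_line, to_line, delta))
```

For this sample the largest gap is between lines 6 and 7, i.e. the span that follows the atanpi() call on line 6.
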
> level 2: timestamps shown next to the source:
> Format: Average Timestamp Exec number Source
> | __kernel void builtin_atanpi_float8(__global float *dst, __global float *src1, __global int *vector) {
> TS: 190 Num: 6 ----> | int i = get_global_id(0);
> | float8 x1 = (float8) (src1[i * (*vector) + 0],src1[i * (*vector) + 1],src1[i * (*vector) + 2],src1[i * (*vector) + 3],src1[i * (*vector) + 4],src1[i * (*vector) + 5],src1[i * (*vector) + 6],src1[i * (*vector) + 7]);
> |
> | float8 ret;
> TS: 1174 Num: 6 ----> | ret = atanpi(x1);
> TS: 3092 Num: 6 ----> | dst[i * (*vector) + 0] = ret[0];
> TS: 3105 Num: 6 ----> | dst[i * (*vector) + 1] = ret[1];
> | dst[i * (*vector) + 2] = ret[2];
> | dst[i * (*vector) + 3] = ret[3];
> | dst[i * (*vector) + 4] = ret[4];
> | dst[i * (*vector) + 5] = ret[5];
> | dst[i * (*vector) + 6] = ret[6];
> | dst[i * (*vector) + 7] = ret[7];
> TS: 3241 Num: 6 ----> | };
>
>
> level 3: output the detailed logs, printing every log as:
> ------------------------ Log 0 -----------------------
> | fix functions id: 7 simd: 16 kernel id: 0 |
> | thread id: 0 EU id: 8 sub slice id: 0 slice id 0 |
> | dispatch Mask: 1 prolog: 6860 epilog: 19548 |
> | globalX: 3~ 3 globalY: 0~ 0 globalZ: 0~ 0 |
> | ts0 : 201 | ts1 : 1180 | ts2 : 12417 |
> | ts3 : 12430 | ts4 : 12637 | ts5 : 0 |
> | ts6 : 0 | ts7 : 0 | ts8 : 0 |
> | ts9 : 0 | ts10: 0 | ts11: 0 |
> | ts12: 0 | ts13: 0 | ts14: 0 |
> | ts15: 0 | ts16: 0 | ts17: 0 |
> | ts18: 0 | ts19: 0 | |
> ------------------------ Log 1 -----------------------
> | fix functions id: 7 simd: 16 kernel id: 0 |
> | thread id: 0 EU id: 8 sub slice id: 1 slice id 0 |
> | dispatch Mask: 1 prolog: 6877 epilog: 19569 |
> | globalX: 4~ 4 globalY: 0~ 0 globalZ: 0~ 0 |
> | ts0 : 209 | ts1 : 1190 | ts2 : 12423 |
> | ts3 : 12436 | ts4 : 12643 | ts5 : 0 |
> | ts6 : 0 | ts7 : 0 | ts8 : 0 |
> | ts9 : 0 | ts10: 0 | ts11: 0 |
> | ts12: 0 | ts13: 0 | ts14: 0 |
> | ts15: 0 | ts16: 0 | ts17: 0 |
> | ts18: 0 | ts19: 0 | |
> .....
> .....
>
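
The level 2 header says "Average Timestamp", so a rough Python sketch of averaging the per-thread ts slots from level 3 logs may be useful. Two assumptions that the cover letter does not confirm: the average is a plain mean over the logs, and a slot holding 0 is unused. The sample keeps only the ts rows of the two logs quoted above, and average_timestamps is a made-up name.

```python
import re
from collections import defaultdict

# Only the non-empty ts rows of "Log 0" and "Log 1" quoted above.
LEVEL3_LOG = """\
------------------------ Log 0 -----------------------
| ts0 : 201 | ts1 : 1180 | ts2 : 12417 |
| ts3 : 12430 | ts4 : 12637 | ts5 : 0 |
------------------------ Log 1 -----------------------
| ts0 : 209 | ts1 : 1190 | ts2 : 12423 |
| ts3 : 12436 | ts4 : 12643 | ts5 : 0 |
"""

# Matches "ts<N> : <value>" with any spacing around the colon.
TS_RE = re.compile(r"\bts(\d+)\s*:\s*(\d+)")


def average_timestamps(log):
    """Mean of each ts slot over all logs, skipping zero (assumed unused) slots."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for index_text, ts_text in TS_RE.findall(log):
        index, ts = int(index_text), int(ts_text)
        if ts == 0:
            continue  # assumption: 0 means this watch point slot was not used
        sums[index] += ts
        counts[index] += 1
    return {index: sums[index] / counts[index] for index in sorted(sums)}


averages = average_timestamps(LEVEL3_LOG)
for index, value in averages.items():
    print("ts%d: %.1f" % (index, value))
```

The values in the level 3 sample differ from those in the level 1 and level 2 samples, so the averages computed here are not expected to reproduce the level 2 numbers.
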
>
>
> Some problems:
> 1. On BDW, the timestamp sometimes has an invalid, huge value.
> It may be a HW issue or feature; we need to check it further.
> 2. Sometimes the line number of an instruction differs from the
> source code. This is caused by optimization, and we can notice
> and analyse it via Gen IR or ASM. I will send a patch to set
> the optimization level later.
> 3. Some line numbers are missing when there are lots of inline
> function calls and macros. I will investigate llvm and clang to
> find a better solution.
>
> I will write a detailed doc about the profiling later.
>
> With this patch set, please just ignore the previous two patch sets
> about debug info.
>
>
> Signed-off-by: Junyan He <junyan.he at linux.intel.com>
> ---
> backend/src/backend/gen_context.cpp | 20 +-
> backend/src/backend/gen_encoder.cpp | 9 +-
> backend/src/backend/gen_insn_selection.cpp | 10 +-
> backend/src/backend/gen_insn_selection_output.cpp | 3 +
> backend/src/backend/gen_program.cpp | 11 +-
> backend/src/backend/gen_register.hpp | 4 +-
> backend/src/backend/program.cpp | 34 ++--
> backend/src/backend/program.h | 3 +-
> backend/src/backend/program.hpp | 2 +-
> backend/src/ir/context.cpp | 4 +-
> backend/src/ir/function.cpp | 6 +-
> backend/src/ir/function.hpp | 10 +
> backend/src/ir/instruction.cpp | 11 +-
> backend/src/ir/instruction.hpp | 3 +
> backend/src/ir/lowering.cpp | 15 +-
> backend/src/ir/profiling.cpp | 72 +++++++-
> backend/src/ir/profiling.hpp | 35 +++-
> backend/src/ir/structurizer.cpp | 55 ++++--
> backend/src/ir/structurizer.hpp | 6 +-
> backend/src/ir/unit.cpp | 9 +-
> backend/src/ir/unit.hpp | 12 +-
> backend/src/llvm/ExpandConstantExpr.cpp | 6 +-
> backend/src/llvm/ExpandLargeIntegers.cpp | 173 +++++++++--------
> backend/src/llvm/llvm_gen_backend.cpp | 11 +-
> backend/src/llvm/llvm_gen_backend.hpp | 11 +-
> backend/src/llvm/llvm_intrinsic_lowering.cpp | 8 +-
> backend/src/llvm/llvm_passes.cpp | 13 +-
> backend/src/llvm/llvm_profiling.cpp | 215 +++++++++++++++++-----
> backend/src/llvm/llvm_sampler_fix.cpp | 7 +
> backend/src/llvm/llvm_scalarize.cpp | 5 +-
> backend/src/llvm/llvm_to_gen.cpp | 4 +-
> backend/src/llvm/llvm_to_gen.hpp | 2 +-
> src/cl_program.c | 3 +-
> 33 files changed, 535 insertions(+), 257 deletions(-)