[Mesa-dev] [PATCH 5/8] i965: Add script to gen code for OA counter queries

Tue Feb 28 22:57:05 UTC 2017

On Mon, Feb 27, 2017 at 5:38 PM, Lionel Landwerlin <
lionel.g.landwerlin at intel.com> wrote:
> Hey Rob,
>
> Your series looks pretty good. Just a tiny nit on this patch below.
>
> As we've discussed in the office, I think it would be nice to have part of
> this work factored out in src/intel.
> But that can be done later on.
>
> Also, I found it a bit disturbing to have the equations in reverse
> Polish notation.
> I've had a try at using a more C-like style for them. The result is here:
>
> https://github.com/djdeath/mesa/commit/ebfaf7aee8efe79dc814bc7bfdcb9d74cec09d3c
>
> It's a tiny bit less python (~20 lines) but adds a dependency on
> python-parsley.
> I also have a script to convert expressions from your format to the new
> one.

Thanks for prototyping this; it's interesting to compare. We talked at
length offline, but I'm putting my thoughts down here too:

Tbh I'm not sure whether this would be a good direction, considering pros
and cons...

Currently Mesa isn't the only project doing codegen based on these
expressions, and the codegen in gputop is probably a bit more involved than
Mesa's, since it also generates MathML for the expressions to have something
readable in the UI. We also have an internal 'MDAPI' library which generates
a class/structure based description of these configs. At least considering
that we have fairly decent support for rendering equations in gputop, it
could be good to compare implementation details, e.g. with regard to
avoiding redundant braces given the precedence of operators, if we wanted
to go in this direction (details can be found in
gputop/scripts/gputop-oa-codegen.py if curious).
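
To make the bracketing point concrete, here is a minimal sketch of turning an
RPN equation into infix text while only adding brackets where precedence
requires them. It is not taken from gputop-oa-codegen.py; the INFIX table, the
rpn_to_infix name and the $A/$B/$C variables are assumptions made up for the
illustration, and only a handful of the arithmetic operators are covered.

```python
# Hypothetical mapping from typed RPN operators to (infix symbol, precedence).
INFIX = {
    "UADD": ("+", 1), "FADD": ("+", 1),
    "USUB": ("-", 1), "FSUB": ("-", 1),
    "UMUL": ("*", 2), "FMUL": ("*", 2),
    "UDIV": ("/", 2), "FDIV": ("/", 2),
}
ATOM = 3  # precedence given to a bare operand, which never needs brackets


def rpn_to_infix(equation):
    """Convert a space-separated RPN equation to infix with minimal brackets."""
    stack = []  # entries are (text, precedence of the outermost operator)
    for token in equation.split():
        if token not in INFIX:
            stack.append((token, ATOM))
            continue
        if len(stack) < 2:
            raise ValueError("too few operands for " + token)
        symbol, prec = INFIX[token]
        right, right_prec = stack.pop()  # last pushed is the right operand
        left, left_prec = stack.pop()
        if left_prec < prec:
            left = "(" + left + ")"
        # Bracket an equal-precedence right operand too, so the grouping of
        # the RPN is preserved (integer division is not associative).
        if right_prec <= prec:
            right = "(" + right + ")"
        stack.append((left + " " + symbol + " " + right, prec))
    if len(stack) != 1:
        raise ValueError("malformed RPN equation: " + equation)
    return stack[0][0]


print(rpn_to_infix("$A $B UADD $C UMUL"))
print(rpn_to_infix("$A $B $C UMUL UADD"))
```

Note that the infix text drops the U/F distinction, so a rendering like this
is only suitable for display, not as a replacement source format.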

The use of RPN wasn't really a choice by me, given that the internal MDAPI
XML files that these XML files are generated from themselves have RPN based
equations. There are some minor differences in how raw A/B/C counters are
read, but generally I see some value in minimizing divergence from what VPG
maintains internally, considering the effort involved in maintaining the
conversion script itself and the risk that it becomes hard to compare the
two when we're seeing problems with particular counters and want to cross
reference what we're doing with what VPG's driver does on Windows.

For reference, probably the best place to maintain any code to convert the
expression format would be as a patch to
gputop/scripts/mdapi-xml-convert.py, but bear in mind that these files are
shared, so I'd have to update these other generators if we made a
backwards-incompatible change:
- kernel config descriptors
- gputop normalization code (almost identical to Mesa)
- gputop MathML equation descriptions
- MDAPI class based description of counters

To my mind RPN is so simple to 'parse' for codegen that I can barely bring
myself to call it parsing: you just tokenize based on spaces and then
traverse the tokens, pushing operands to a stack; when you hit an operator
you pop its operands and push the result. With RPN there's no precedence
ambiguity to consider, and with the explicitly typed operations like UDIV
vs FDIV there's no operand type ambiguity either. I'd certainly be nervous
of squashing these both into a single '/' operator and relying on C type
conversion rules. The risk that there are some corner cases where the
results won't match the original equations without more explicit type
casting seems quite high with the current prototype.
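
As a rough sketch of that push/pop traversal (a simplified stand-in, not the
brw_oa.py code itself): the TYPED_OPS table and the rpn_to_c name below are
made up for the example, only six operators are handled, and the explicit
casts are an assumption added to show where the U/F typing ends up in the
generated C.

```python
# Illustrative subset of typed operators: name -> (C type, C operator).
TYPED_OPS = {
    "UADD": ("uint64_t", "+"),
    "UMUL": ("uint64_t", "*"),
    "UDIV": ("uint64_t", "/"),
    "FADD": ("double", "+"),
    "FMUL": ("double", "*"),
    "FDIV": ("double", "/"),
}


def rpn_to_c(equation):
    """Return C statements for an RPN equation, one temporary per operator."""
    stack = []
    lines = []
    for token in equation.split():
        if token not in TYPED_OPS:
            stack.append(token)  # operand: just push it
            continue
        if len(stack) < 2:
            raise ValueError("too few operands for " + token)
        c_type, symbol = TYPED_OPS[token]
        # The operator's type is explicit, so each operand gets an explicit
        # cast instead of relying on C's implicit conversion rules.
        right = "(%s)%s" % (c_type, stack.pop())
        left = "(%s)%s" % (c_type, stack.pop())
        expr = "%s %s %s" % (left, symbol, right)
        if symbol == "/":
            expr = "%s ? %s : 0" % (right, expr)  # guard divide by zero
        tmp = "tmp%d" % len(lines)
        lines.append("%s %s = %s;" % (c_type, tmp, expr))
        stack.append(tmp)  # the result becomes an operand for later ops
    if len(stack) != 1:
        raise ValueError("malformed RPN equation: " + equation)
    lines.append("return " + stack[0] + ";")
    return lines


print("\n".join(rpn_to_c("a b UDIV")))
print("\n".join(rpn_to_c("a b FDIV 100 FMUL")))
```

The same two operands produce an integer division for UDIV and a floating
point division for FDIV, which is exactly the distinction a single '/'
operator would leave to the C compiler.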

The primary use for these expressions is codegen, and so I think machine
readability and lack of ambiguity are much more important than human
readability within the XML files.

RPN seems very well suited here once you're familiar with evaluating RPN as
a series of operand pushes and pops for operations.

For human readability, I think what gputop does is a pretty decent
starting point: being careful with bracketing, supporting mouse-over
descriptions of the different variables, and nice typography and layout.

It's subjective to say which is more readable to handle in code, but
personally I find embedding a formal grammar for expressions, and
introducing the need for operator precedence and type conversion rules, to
be more of a mental load. The comparison of line count imho doesn't account
for the density of the grammar being embedded, which needs to be understood
and maintained across the different codegen scripts. The previous code
effectively duplicated the traversal of the RPN expressions since it barely
seemed worth factoring out the push and pop of tokens that form the parsing
loop. There was also lots of white space between the operator methods.

For now I think it's nice to see this alternative to compare, but I don't
think it's the best trade-off to prioritize human readability of the
equations in the xml files over unambiguous machine readability.
Considering the complexity of some of the equations, there's just no way
they will ever be practically readable within the xml files, and I don't
really see much alternative to using a typesetting library like mathjax to
render the equations, e.g. for a UI.

Definitely thanks for doing this experiment though!

Br,
- Robert

>
> Regardless, this series is :
>
> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>
> Thanks!
>
> -
> Lionel
>
>
> On 24/02/17 13:58, Robert Bragg wrote:
>>
>> Avoiding lots of error prone boilerplate and easing our ability to add +
>> maintain support for multiple OA performance counter queries for each
>> generation:
>>
>> This adds a python script to generate code for building up
>> performance_queries from the metric sets and counters described in
>> brw_oa_hsw.xml as well as functions to normalize each counter based on
>> the RPN expressions given.
>>
>> Although the XML file currently only includes a single metric set, the
>> code generated assumes there could be many sets.
>>
>> The metrics as described in XML get translated into C structures
>> which are registered in a brw->perfquery.oa_metrics_table hash table
>> keyed by the GUID of the metric set in XML.
>>
>> Signed-off-by: Robert Bragg <robert at sixbynine.org>
>> ---
>> src/mesa/drivers/dri/i965/Makefile.am | 15 +-
>> src/mesa/drivers/dri/i965/Makefile.sources | 2 +
>> src/mesa/drivers/dri/i965/brw_oa.py | 543
>> +++++++++++++++++++++++++++++
>> 3 files changed, 559 insertions(+), 1 deletion(-)
>> create mode 100644 src/mesa/drivers/dri/i965/brw_oa.py
>>
>> diff --git a/src/mesa/drivers/dri/i965/Makefile.am
>> b/src/mesa/drivers/dri/i965/Makefile.am
>> index f87fa67ef8..0130afff5f 100644
>> --- a/src/mesa/drivers/dri/i965/Makefile.am
>> +++ b/src/mesa/drivers/dri/i965/Makefile.am
>> @@ -93,7 +93,9 @@ BUILT_SOURCES = $(i965_compiler_GENERATED_FILES)
>> CLEANFILES = $(BUILT_SOURCES)
>> EXTRA_DIST = \
>> - brw_nir_trig_workarounds.py
>> + brw_nir_trig_workarounds.py \
>> + brw_oa_hsw.xml \
>> + brw_oa.py
>> TEST_LIBS = \
>> libi965_compiler.la \
>> @@ -169,3 +171,14 @@ test_eu_validate_SOURCES = \
>> test_eu_validate_LDADD = \
>> $(top_builddir)/src/gtest/libgtest.la \
>> $(TEST_LIBS)
>> +
>> +BUILT_SOURCES = \
>> + brw_oa_hsw.h \
>> + brw_oa_hsw.c
>> +
>> +brw_oa_hsw.h brw_oa_hsw.c: brw_oa_hsw.xml brw_oa.py Makefile
>> + $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_oa.py \
>> + --header=$(builddir)/brw_oa_hsw.h \
>> + --code=$(builddir)/brw_oa_hsw.c \
>> + --chipset="hsw" \
>> + $(srcdir)/brw_oa_hsw.xml
>> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources
>> b/src/mesa/drivers/dri/i965/Makefile.sources
>> index 5278e86339..60acd15d41 100644
>> --- a/src/mesa/drivers/dri/i965/Makefile.sources
>> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
>> @@ -135,6 +135,8 @@ i965_FILES = \
>> brw_nir_uniforms.cpp \
>> brw_object_purgeable.c \
>> brw_pipe_control.c \
>> + brw_oa_hsw.h \
>> + brw_oa_hsw.c \
>> brw_performance_query.h \
>> brw_performance_query.c \
>> brw_program.c \
>> diff --git a/src/mesa/drivers/dri/i965/brw_oa.py
>> b/src/mesa/drivers/dri/i965/brw_oa.py
>> new file mode 100644
>> index 0000000000..2c622531af
>> --- /dev/null
>> +++ b/src/mesa/drivers/dri/i965/brw_oa.py
>> @@ -0,0 +1,543 @@
>> +#!/usr/bin/env python2
>> +#
>> +# Copyright (c) 2015 Intel Corporation
>> +#
>> +# Permission is hereby granted, free of charge, to any person obtaining a
>> +# copy of this software and associated documentation files (the "Software"),
>> +# to deal in the Software without restriction, including without limitation
>> +# the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> +# and/or sell copies of the Software, and to permit persons to whom the
>> +# Software is furnished to do so, subject to the following conditions:
>> +#
>> +# The above copyright notice and this permission notice (including the next
>> +# paragraph) shall be included in all copies or substantial portions of the
>> +# Software.
>> +#
>> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> +# IN THE SOFTWARE.
>> +
>> +import xml.etree.ElementTree as ET
>> +import argparse
>> +import sys
>> +
>> +def print_err(*args):
>> + sys.stderr.write(' '.join(map(str,args)) + '\n')
>> +
>> +c_file = None
>> +_c_indent = 0
>> +
>> +def c(*args):
>> + if c_file:
>> + code = ' '.join(map(str,args))
>> + for line in code.splitlines():
>> + text = ''.rjust(_c_indent) + line
>> + c_file.write(text.rstrip() + "\n")
>> +
>> +# indented, but no trailing newline...
>> +def c_line_start(code):
>> + if c_file:
>> + c_file.write(''.rjust(_c_indent) + code)
>> +def c_raw(code):
>> + if c_file:
>> + c_file.write(code)
>> +
>> +def c_indent(n):
>> + global _c_indent
>> + _c_indent = _c_indent + n
>> +def c_outdent(n):
>> + global _c_indent
>> + _c_indent = _c_indent - n
>> +
>> +header_file = None
>> +_h_indent = 0
>> +
>> +def h(*args):
>> + if header_file:
>> + code = ' '.join(map(str,args))
>> + for line in code.splitlines():
>> + text = ''.rjust(_h_indent) + line
>> + header_file.write(text.rstrip() + "\n")
>> +
>> +def h_indent(n):
>> + global _h_indent
>> + _h_indent = _h_indent + n
>> +def h_outdent(n):
>> + global _h_indent
>> + _h_indent = _h_indent - n
>> +
>> +
>> +def emit_fadd(tmp_id, args):
>> + c("double tmp" + str(tmp_id) +" = " + args[1] + " + " + args[0] +
>> ";")
>> + return tmp_id + 1
>> +
>> +# Be careful to check for divide by zero...
>> +def emit_fdiv(tmp_id, args):
>> + c("double tmp" + str(tmp_id) +" = " + args[1] + ";")
>> + c("double tmp" + str(tmp_id + 1) +" = " + args[0] + ";")
>> + c("double tmp" + str(tmp_id + 2) +" = tmp" + str(tmp_id + 1) + " ?
>> tmp" + str(tmp_id) + " / tmp" + str(tmp_id + 1) + " : 0;")
>> + return tmp_id + 3
>> +
>> +def emit_fmax(tmp_id, args):
>> + c("double tmp" + str(tmp_id) +" = " + args[1] + ";")
>> + c("double tmp" + str(tmp_id + 1) +" = " + args[0] + ";")
>> + c("double tmp" + str(tmp_id + 2) +" = MAX(tmp" + str(tmp_id) + ",
>> tmp" + str(tmp_id + 1) + ");")
>> + return tmp_id + 3
>> +
>> +def emit_fmul(tmp_id, args):
>> + c("double tmp" + str(tmp_id) +" = " + args[1] + " * " + args[0] +
>> ";")
>> + return tmp_id + 1
>> +
>> +def emit_fsub(tmp_id, args):
>> + c("double tmp" + str(tmp_id) +" = " + args[1] + " - " + args[0] +
>> ";")
>> + return tmp_id + 1
>> +
>> +def emit_read(tmp_id, args):
>> + type = args[1].lower()
>> + c("uint64_t tmp" + str(tmp_id) + " = accumulator[query->" + type +
>> "_offset + " + args[0] + "];")
>> + return tmp_id + 1
>> +
>> +def emit_uadd(tmp_id, args):
>> + c("uint64_t tmp" + str(tmp_id) +" = " + args[1] + " + " + args[0] +
>> ";")
>> + return tmp_id + 1
>> +
>> +# Be careful to check for divide by zero...
>> +def emit_udiv(tmp_id, args):
>> + c("uint64_t tmp" + str(tmp_id) +" = " + args[1] + ";")
>> + c("uint64_t tmp" + str(tmp_id + 1) +" = " + args[0] + ";")
>> + c("uint64_t tmp" + str(tmp_id + 2) +" = tmp" + str(tmp_id + 1) + " ?
>> tmp" + str(tmp_id) + " / tmp" + str(tmp_id + 1) + " : 0;")
>> + return tmp_id + 3
>> +
>> +def emit_umul(tmp_id, args):
>> + c("uint64_t tmp" + str(tmp_id) +" = " + args[1] + " * " + args[0] +
>> ";")
>> + return tmp_id + 1
>> +
>> +def emit_usub(tmp_id, args):
>> + c("uint64_t tmp" + str(tmp_id) +" = " + args[1] + " - " + args[0] +
>> ";")
>> + return tmp_id + 1
>> +
>> +def emit_umin(tmp_id, args):
>> + c("uint64_t tmp" + str(tmp_id) +" = MIN(" + args[1] + ", " + args[0]
>> + ");")
>> + return tmp_id + 1
>> +
>> +ops = {}
>> +# (n operands, emitter)
>> +ops["FADD"] = (2, emit_fadd)
>> +ops["FDIV"] = (2, emit_fdiv)
>> +ops["FMAX"] = (2, emit_fmax)
>> +ops["FMUL"] = (2, emit_fmul)
>> +ops["FSUB"] = (2, emit_fsub)
>> +ops["READ"] = (2, emit_read)
>> +ops["UADD"] = (2, emit_uadd)
>> +ops["UDIV"] = (2, emit_udiv)
>> +ops["UMUL"] = (2, emit_umul)
>> +ops["USUB"] = (2, emit_usub)
>> +ops["UMIN"] = (2, emit_umin)
>> +
>> +def brkt(subexp):
>> + if " " in subexp:
>> + return "(" + subexp + ")"
>> + else:
>> + return subexp
>> +
>> +def splice_bitwise_and(args):
>> + return brkt(args[1]) + " & " + brkt(args[0])
>> +
>> +def splice_logical_and(args):
>> + return brkt(args[1]) + " && " + brkt(args[0])
>> +
>> +def splice_ult(args):
>> + return brkt(args[1]) + " < " + brkt(args[0])
>> +
>> +def splice_ugte(args):
>> + return brkt(args[1]) + " >= " + brkt(args[0])
>> +
>> +exp_ops = {}
>> +# (n operands, splicer)
>> +exp_ops["AND"] = (2, splice_bitwise_and)
>> +exp_ops["UGTE"] = (2, splice_ugte)
>> +exp_ops["ULT"] = (2, splice_ult)
>> +exp_ops["&&"] = (2, splice_logical_and)
>> +
>> +
>> +hw_vars = {}
>> +hw_vars["$EuCoresTotalCount"] = "brw->perfquery.sys_vars.n_eus"
>> +hw_vars["$EuSlicesTotalCount"] = "brw->perfquery.sys_vars.n_eu_slices"
>> +hw_vars["$EuSubslicesTotalCount"] =
>> "brw->perfquery.sys_vars.n_eu_sub_slices"
>> +hw_vars["$EuThreadsCount"] = "brw->perfquery.sys_vars.eu_threads_count"
>> +hw_vars["$SliceMask"] = "brw->perfquery.sys_vars.slice_mask"
>> +hw_vars["$SubsliceMask"] = "brw->perfquery.sys_vars.subslice_mask"
>> +hw_vars["$GpuTimestampFrequency"] =
>> "brw->perfquery.sys_vars.timestamp_frequency"
>> +hw_vars["$GpuMinFrequency"] = "brw->perfquery.sys_vars.gt_min_freq"
>> +hw_vars["$GpuMaxFrequency"] = "brw->perfquery.sys_vars.gt_max_freq"
>> +
>> +counter_vars = {}
>
>
> I guess you can remove this counter_vars as it's given as an argument to
> the following functions.
>
>
>> +
>> +def output_rpn_equation_code(set, counter, equation, counter_vars):
>> + c("/* RPN equation: " + equation + " */")
>> + tokens = equation.split()
>> + stack = []
>> + tmp_id = 0
>> + tmp = None
>> +
>> + for token in tokens:
>> + stack.append(token)
>> + while stack and stack[-1] in ops:
>> + op = stack.pop()
>> + argc, callback = ops[op]
>> + args = []
>> + for i in range(0, argc):
>> + operand = stack.pop()
>> + if operand[0] == "$":
>> + if operand in hw_vars:
>> + operand = hw_vars[operand]
>> + elif operand in counter_vars:
>> + reference = counter_vars[operand]
>> + operand = read_funcs[operand[1:]] + "(brw, query,
>> accumulator)"
>> + else:
>> + raise Exception("Failed to resolve variable " + operand + " in equation " + equation + " for " + set.get('name') + " :: " + counter.get('name'));
>> + args.append(operand)
>> +
>> + tmp_id = callback(tmp_id, args)
>> +
>> + tmp = "tmp" + str(tmp_id - 1)
>> + stack.append(tmp)
>> +
>> + if len(stack) != 1:
>> + raise Exception("Spurious empty rpn code for " + set.get('name')
>> + " :: " +
>> + counter.get('name') + ".\nThis is probably due to some
>> unhandled RPN function, in the equation \"" +
>> + equation + "\"")
>> +
>> + value = stack.pop()
>> +
>> + if value in hw_vars:
>> + value = hw_vars[value];
>> +
>> + c("\nreturn " + value + ";")
>> +
>> +def splice_rpn_expression(set, counter, expression):
>> + tokens = expression.split()
>> + stack = []
>> +
>> + for token in tokens:
>> + stack.append(token)
>> + while stack and stack[-1] in exp_ops:
>> + op = stack.pop()
>> + argc, callback = exp_ops[op]
>> + args = []
>> + for i in range(0, argc):
>> + operand = stack.pop()
>> + if operand[0] == "$":
>> + if operand in hw_vars:
>> + operand = hw_vars[operand]
>> + else:
>> + raise Exception("Failed to resolve variable " + operand + " in expression " + expression + " for " + set.get('name') + " :: " + counter.get('name'));
>> + args.append(operand)
>> +
>> + subexp = callback(args)
>> +
>> + stack.append(subexp)
>> +
>> + if len(stack) != 1:
>> + raise Exception("Spurious empty rpn expression for " +
>> set.get('name') + " :: " +
>> + counter.get('name') + ".\nThis is probably due to some
>> unhandled RPN operation, in the expression \"" +
>> + expression + "\"")
>> +
>> + return stack.pop()
>> +
>> +def output_counter_read(set, counter, counter_vars):
>> + c("\n")
>> + c("/* " + set.get('name') + " :: " + counter.get('name') + " */")
>> + ret_type = counter.get('data_type')
>> + if ret_type == "uint64":
>> + ret_type = "uint64_t"
>> +
>> + c("static " + ret_type)
>> + read_sym = set.get('chipset').lower() + "__" +
>> set.get('underscore_name') + "__" + counter.get('underscore_name') +
>> "__read"
>> + c(read_sym + "(struct brw_context *brw,\n")
>> + c_indent(len(read_sym) + 1)
>> + c("const struct brw_perf_query_info *query,\n")
>> + c("uint64_t *accumulator)\n")
>> + c_outdent(len(read_sym) + 1)
>> +
>> + c("{")
>> + c_indent(3)
>> +
>> + output_rpn_equation_code(set, counter, counter.get('equation'),
>> counter_vars)
>> +
>> + c_outdent(3)
>> + c("}")
>> +
>> + return read_sym
>> +
>> +def output_counter_max(set, counter, counter_vars):
>> + max_eq = counter.get('max_equation')
>> +
>> + if not max_eq:
>> + return "0; /* undefined */"
>> +
>> + try:
>> + val = float(max_eq)
>> + return max_eq + ";"
>> + except:
>> + pass
>> +
>> + # We can only report constant maximum values via
>> INTEL_performance_query
>> + for token in max_eq.split():
>> + if token[0] == '$' and token not in hw_vars:
>> + return "0; /* unsupported (varies over time) */"
>> +
>> + c("\n")
>> + c("/* " + set.get('name') + " :: " + counter.get('name') + " */")
>> + ret_type = counter.get('data_type')
>> + if ret_type == "uint64":
>> + ret_type = "uint64_t"
>> +
>> + c("static " + ret_type)
>> + max_sym = set.get('chipset').lower() + "__" + set.get('underscore_name') + "__" + counter.get('underscore_name') + "__max"
>> + c(max_sym + "(struct brw_context *brw)\n")
>> +
>> + c("{")
>> + c_indent(3)
>> +
>> + output_rpn_equation_code(set, counter, max_eq, counter_vars)
>> +
>> + c_outdent(3)
>> + c("}")
>> +
>> + return max_sym + "(brw);"
>> +
>> +c_type_sizes = { "uint32_t": 4, "uint64_t": 8, "float": 4, "double": 8,
>> "bool": 4 }
>> +def sizeof(c_type):
>> + return c_type_sizes[c_type]
>> +
>> +def pot_align(base, pot_alignment):
>> + return (base + pot_alignment - 1) & ~(pot_alignment - 1);
>> +
>> +semantic_type_map = {
>> + "duration": "raw",
>> + "ratio": "event"
>> + }
>> +
>> +def output_counter_report(set, counter, current_offset):
>> + data_type = counter.get('data_type')
>> + data_type_uc = data_type.upper()
>> + c_type = data_type
>> +
>> + if "uint" in c_type:
>> + c_type = c_type + "_t"
>> +
>> + semantic_type = counter.get('semantic_type')
>> + if semantic_type in semantic_type_map:
>> + semantic_type = semantic_type_map[semantic_type]
>> +
>> + semantic_type_uc = semantic_type.upper()
>> +
>> + c("\n")
>> +
>> + availability = counter.get('availability')
>> + if availability:
>> + expression = splice_rpn_expression(set, counter, availability)
>> + lines = expression.split(' && ')
>> + n_lines = len(lines)
>> + if n_lines == 1:
>> + c("if (" + lines[0] + ") {")
>> + else:
>> + c("if (" + lines[0] + " &&")
>> + c_indent(4)
>> + for i in range(1, (n_lines - 1)):
>> + c(lines[i] + " &&")
>> + c(lines[(n_lines - 1)] + ") {")
>> + c_outdent(4)
>> + c_indent(3)
>> +
>> + c("counter = &query->counters[query->n_counters++];\n")
>> + c("counter->oa_counter_read_" + data_type + " = " +
>> read_funcs[counter.get('symbol_name')] + ";\n")
>> + c("counter->name = \"" + counter.get('name') + "\";\n")
>> + c("counter->desc = \"" + counter.get('description') + "\";\n")
>> + c("counter->type = GL_PERFQUERY_COUNTER_" + semantic_type_uc +
>> "_INTEL;\n")
>> + c("counter->data_type = GL_PERFQUERY_COUNTER_DATA_" + data_type_uc +
>> "_INTEL;\n")
>> + c("counter->raw_max = " + max_values[counter.get('symbol_name')] +
>> "\n")
>> +
>> + current_offset = pot_align(current_offset, sizeof(c_type))
>> + c("counter->offset = " + str(current_offset) + ";\n")
>> + c("counter->size = sizeof(" + c_type + ");\n")
>> +
>> + if availability:
>> + c_outdent(3);
>> + c("}")
>> +
>> + return current_offset + sizeof(c_type)
>> +
>> +parser = argparse.ArgumentParser()
>> +parser.add_argument("xml", help="XML description of metrics")
>> +parser.add_argument("--header", help="Header file to write")
>> +parser.add_argument("--code", help="C file to write")
>> +parser.add_argument("--chipset", help="Chipset to generate code for")
>> +
>> +args = parser.parse_args()
>> +
>> +chipset = args.chipset.lower()
>> +
>> +if args.header:
>> + header_file = open(args.header, 'w')
>> +
>> +if args.code:
>> + c_file = open(args.code, 'w')
>> +
>> +tree = ET.parse(args.xml)
>> +
>> +
>> +copyright = """/* Autogenerated file, DO NOT EDIT manually!
>> + *
>> + * Copyright (c) 2015 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> + * DEALINGS IN THE SOFTWARE.
>> + */
>> +
>> +"""
>> +
>> +h(copyright)
>> +h("""#pragma once
>> +
>> +struct brw_context;
>> +
>> +""")
>> +
>> +c(copyright)
>> +c(
>> +"""
>> +#include <stdint.h>
>> +#include <stdbool.h>
>> +
>> +#include "util/hash_table.h"
>> +
>> +""")
>> +
>> +c("#include \"brw_oa_" + chipset + ".h\"")
>> +
>> +c(
>> +"""
>> +#include "brw_context.h"
>> +#include "brw_performance_query.h"
>> +
>> +
>> +#define MIN(a, b) (((a) < (b)) ? (a) : (b))
>> +#define MAX(a, b) (((a) > (b)) ? (a) : (b))
>> +
>> +""")
>> +
>> +for set in tree.findall(".//set"):
>> + max_values = {}
>> + read_funcs = {}
>> + counter_vars = {}
>> + counters = set.findall("counter")
>> +
>> + assert set.get('chipset').lower() == chipset
>> +
>> + for counter in counters:
>> + empty_vars = {}
>> + read_funcs[counter.get('symbol_name')] = output_counter_read(set,
>> counter, counter_vars)
>> + max_values[counter.get('symbol_name')] = output_counter_max(set,
>> counter, empty_vars)
>> + counter_vars["$" + counter.get('symbol_name')] = counter
>> +
>> +
>> + c("\nstatic struct brw_perf_query_counter " + chipset + "_" +
>> set.get('underscore_name') + "_query_counters[" + str(len(counters)) +
>> "];\n")
>> + c("static struct brw_perf_query_info " + chipset + "_" +
>> set.get('underscore_name') + "_query = {\n")
>> + c_indent(3)
>> +
>> + c(".kind = OA_COUNTERS,\n")
>> + c(".name = \"" + set.get('name') + "\",\n")
>> + c(".guid = \"" + set.get('hw_config_guid') + "\",\n")
>> +
>> + c(".counters = " + chipset + "_" + set.get('underscore_name') +
>> "_query_counters,")
>> + c(".n_counters = 0,")
>> + c(".oa_metrics_set_id = 0, /* determined at runtime, via sysfs */")
>> +
>> + if chipset == "hsw":
>> + c(""".oa_format = I915_OA_FORMAT_A45_B8_C8,
>> +
>> +/* Accumulation buffer offsets... */
>> +.gpu_time_offset = 0,
>> +.a_offset = 1,
>> +.b_offset = 46,
>> +.c_offset = 54,
>> +""")
>> + else:
>> + c(""".oa_format = I915_OA_FORMAT_A32u40_A4u32_B8_C8,
>> +
>> +/* Accumulation buffer offsets... */
>> +.gpu_time_offset = 0,
>> +.gpu_clock_offset = 1,
>> +.a_offset = 2,
>> +.b_offset = 38,
>> +.c_offset = 46,
>> +""")
>> +
>> + c_outdent(3)
>> + c("};\n")
>> +
>> + c("\nstatic void\n")
>> + c("register_" + set.get('underscore_name') + "_counter_query(struct
>> brw_context *brw)\n")
>> + c("{\n")
>> + c_indent(3)
>> +
>> + c("static struct brw_perf_query_info *query = &" + chipset + "_" +
>> set.get('underscore_name') + "_query;\n")
>> + c("struct brw_perf_query_counter *counter;\n")
>> +
>> + c("\n")
>> + c("/* Note: we're assuming there can't be any variation in the
>> definition ")
>> + c(" * of a query between contexts so it's ok to describe a query
>> within a ")
>> + c(" * global variable which only needs to be initialized once... */")
>> + c("\nif (!query->data_size) {")
>> + c_indent(3)
>> +
>> + offset = 0
>> + for counter in counters:
>> + offset = output_counter_report(set, counter, offset)
>> +
>> +
>> + c("\nquery->data_size = counter->offset + counter->size;\n")
>> +
>> + c_outdent(3)
>> + c("}");
>> +
>> + c("\n_mesa_hash_table_insert(brw->perfquery.oa_metrics_table,
>> query->guid, query);")
>> +
>> + c_outdent(3)
>> + c("}\n")
>> +
>> +h("void brw_oa_register_queries_" + chipset + "(struct brw_context
>> *brw);\n")
>> +
>> +c("\nvoid")
>> +c("brw_oa_register_queries_" + chipset + "(struct brw_context *brw)")
>> +c("{")
>> +c_indent(3)
>> +
>> +for set in tree.findall(".//set"):
>> + c("register_" + set.get('underscore_name') + "_counter_query(brw);")
>> +
>> +c_outdent(3)
>> +c("}")
>> +
>
>
>