[Mesa-dev] [PATCH v5 4/6] mesa/st: glsl_to_tgsi: add register renamame mapping evaluator
Nicolai Hähnle
nhaehnle at gmail.com
Mon Jun 26 13:24:26 UTC 2017
On 25.06.2017 09:22, Gert Wollny wrote:
> The remapping evaluator first sorts the temporary registers ascending
> based on their first life time instruction, and then uses a binary search
> to find merge canidates.
> For the initial sorting it uses std::sort because qsort is quite slow in
> comparison. By removing the define USE_STL_SORT in
> src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
> one can enable the alternative code path that uses qsort.
> ---
> .../state_tracker/st_glsl_to_tgsi_temprename.cpp | 124 +++++++++++++++++++++
> .../state_tracker/st_glsl_to_tgsi_temprename.h | 3 +
> 2 files changed, 127 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
> index 729d77130e..d52d912951 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
> @@ -27,6 +27,12 @@
> #include <mesa/program/prog_instruction.h>
> #include <limits>
>
> +/* std::sort is significanter than qsort */
Something is missing in this comment :)
> +#define USE_STL_SORT
> +#ifdef USE_STL_SORT
> +#include <algorithm>
> +#endif
> +
> /* Without c++11 define the nullptr for forward-compatibility
> * and better readibility */
> #if __cplusplus < 201103L
> @@ -660,3 +666,121 @@ prog_scope_storage::create(prog_scope *p, e_scope_type type, int id,
> storage[current_slot] = prog_scope(p, type, id, lvl, s_begin);
> return &storage[current_slot++];
> }
> +
> +/* helper class for sorting and searching the registers based
> + * on life times. */
Closing */ on its own line, and proper capitalization, please.
> +struct access_record {
> + int begin;
> + int end;
> + int reg;
> + bool erase;
> +
> + bool operator < (const access_record& rhs) {
const?
> + return begin < rhs.begin;
> + }
> +};
> +
> +/* Find the next register between [start, end) that has a life time starting
> + * at or after bound by using a binary search.
> + * start points at the beginning of the search range,
> + * end points at the element past the end of the search range, and
> + * the array comprising [start, end) must be sorted in ascending order.
> + */
> +access_record*
> +find_next_rename(access_record* start, access_record* end, int bound)
Function should be static.
> +{
> + int delta = (end - start);
> +
> + while (delta > 0) {
> +
Again with the whitespace issue. Please also fix other occurrences.
> + int half = delta >> 1;
> + access_record* middle = start + half;
> +
> + if (bound <= middle->begin) {
> + delta = half;
> + } else {
> + start = middle;
> + ++start;
> + delta -= half + 1;
> + }
> + }
> +
> + return start;
> +}
> +
> +#ifndef USE_STL_SORT
Function should be static.
> +int access_record_compare (const void *a, const void *b) { > + const access_record *aa = static_cast<const access_record*>(a);
> + const access_record *bb = static_cast<const access_record*>(b);
> + return aa->begin < bb->begin ? -1 : (aa->begin > bb->begin ? 1 : 0);
> +}
> +#endif
> +
> +/* This functions evaluates the register merges by using an O(n log n)
> + * algorithm to find suitable merge candidates. */
> +void get_temp_registers_remapping(void *mem_ctx, int ntemps,
> + const struct lifetime* lifetimes,
> + struct rename_reg_pair *result)
> +{
> + access_record *m = ralloc_array(mem_ctx, access_record, ntemps - 1);
m is not a very descriptive name.
Why ntemps - 1?
> +
> + for (int i = 1; i < ntemps; ++i) {
> + m[i-1].begin = lifetimes[i].begin;
> + m[i-1].end = lifetimes[i].end;
> + m[i-1].reg = i;
> + m[i-1].erase = false;
> + }
> +
> +#ifdef USE_STL_SORT
> + std::sort(m, m + ntemps - 1);
> +#else
> + std::qsort(m, ntemps - 1, sizeof(access_record), access_record_compare);
> +#endif
> +
> + access_record *trgt = m;
> + access_record *mend = m + ntemps - 1;
> + access_record *first_erase = mend;
> + access_record *search_start = trgt + 1;
> +
> + while (trgt != mend) {
> +
> + access_record *src = find_next_rename(search_start, mend, trgt->end);
> +
> + if (src != mend) {
> + result[src->reg].new_reg = trgt->reg;
> + result[src->reg].valid = true;
> + trgt->end = src->end;
> +
> + /* Since we only search forward, don't remove the renamed
> + * register just now, only mark it. */
> + src->erase = true;
> +
> + if (first_erase == mend)
> + first_erase = src;
> +
> + search_start = src + 1;
> + } else {
> + /* Moving to the next target register it is time to remove
> + * the already merged registers from the search range */
> + if (first_erase != mend) {
> +
> + access_record *out = first_erase;
> + access_record *in_start = first_erase + 1;
Why in_start? Better just in. Or maybe even dst and src, that's more
idiomatic.
On a more high-level note, this algorithm isn't actually O(n log n) as
you claimed somewhere. It's true that you improved the search part, but
now the compacting is the asymptotic bottleneck, and it's like still
O(n^2), unless I'm missing something.
Cheers,
Nicolai
> +
> + while (in_start != mend) {
> +
> + if (!in_start->erase)
> + *out++ = *in_start;
> +
> + ++in_start;
> + }
> + mend = out;
> + first_erase = mend;
> + }
> +
> + ++trgt;
> + search_start = trgt + 1;
> + }
> + }
> + ralloc_free(m);
> +}
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
> index a4124b4659..f6a89ed0d3 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
> @@ -31,3 +31,6 @@ struct lifetime {
> void
> get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
> int ntemps, struct lifetime *lifetimes);
> +void get_temp_registers_remapping(void *mem_ctx, int ntemps,
> + const struct lifetime* lifetimes,
> + struct rename_reg_pair *result);
>
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
More information about the mesa-dev
mailing list