[Mesa-dev] [PATCH] squash: Fix up VPM read optimization.

Rhys Kidd rhyskidd at gmail.com
Fri Mar 11 02:32:38 UTC 2016


On 10 March 2016 at 15:21, Eric Anholt <eric at anholt.net> wrote:

> - There's no reason there would be only 64 operations that read from the
>   output of a mov from VPM, so we might smash the stack (fixes etqw trace)
>
> - Fixes segfault where we assumed that a single-use temp had a def (fixes
>   2 piglit tests)
>
> - We need to only mark progress when we actually did the optimization, or
>   we'll infinite loop (0ad trace).
>
> - Misc style fixes.
>
> - No reordering sampler instructions (fixes a glean test)
>
> shader-db results:
> total instructions in shared programs: 78513 -> 78071 (-0.56%)
> instructions in affected programs:     10406 -> 9964 (-4.25%)
> total estimated cycles in shared programs: 234674 -> 234274 (-0.17%)
> estimated cycles in affected programs:     35188 -> 34788 (-1.14%)
> ---
>
> Varad, here's what I came up with trying to test your patch.  If these
> changes look good to you, I can squash them in and push.
>
>  src/gallium/drivers/vc4/vc4_opt_vpm.c | 45
> +++++++++++++++++------------------
>  1 file changed, 22 insertions(+), 23 deletions(-)
>
> diff --git a/src/gallium/drivers/vc4/vc4_opt_vpm.c
> b/src/gallium/drivers/vc4/vc4_opt_vpm.c
> index 277b345..a4ee6af 100644
> --- a/src/gallium/drivers/vc4/vc4_opt_vpm.c
> +++ b/src/gallium/drivers/vc4/vc4_opt_vpm.c
> @@ -40,10 +40,8 @@ qir_opt_vpm(struct vc4_compile *c)
>
>          bool progress = false;
>          struct qinst *vpm_writes[64] = { 0 };
> -        struct qinst *vpm_reads[64] = { 0 };
>          uint32_t use_count[c->num_temps];
>          uint32_t vpm_write_count = 0;
> -        uint32_t vpm_read_count = 0;
>          memset(&use_count, 0, sizeof(use_count));
>
>          list_for_each_entry(struct qinst, inst, &c->instructions, link) {
> @@ -59,24 +57,14 @@ qir_opt_vpm(struct vc4_compile *c)
>                          if (inst->src[i].file == QFILE_TEMP) {
>                                  uint32_t temp = inst->src[i].index;
>                                  use_count[temp]++;
> -
> -                                struct qinst *mov = c->defs[temp];
> -                                if (!mov ||
> -                                    (mov->op != QOP_MOV &&
> -                                    mov->op != QOP_FMOV &&
> -                                    mov->op != QOP_MMOV)) {
> -                                        continue;
> -                                }
> -
> -                                if (mov->src[0].file == QFILE_VPM)
> -                                        vpm_reads[vpm_read_count++] =
> inst;
>                          }
>                  }
>          }
>
> -        for (int i = 0; i < vpm_read_count; i++) {
> -                struct qinst *inst = vpm_reads[i];
> -
> +        /* For instructions reading from a temporary that contains a VPM
> read
> +         * result, try to move the instruction up in place of the VPM
> read.
> +         */
> +        list_for_each_entry(struct qinst, inst, &c->instructions, link) {
>                  if (!inst || qir_is_multi_instruction(inst))
>                          continue;
>
> @@ -84,21 +72,32 @@ qir_opt_vpm(struct vc4_compile *c)
>                          continue;
>
>                  if (qir_has_side_effects(c, inst) ||
> -                    qir_has_side_effect_reads(c, inst))
> +                    qir_has_side_effect_reads(c, inst) ||
> +                    qir_is_tex(inst))
>                          continue;
>
>                  for (int j = 0; j < qir_get_op_nsrc(inst->op); j++) {
> -                        if(inst->src[j].file != QFILE_TEMP)
> +                        if (inst->src[j].file != QFILE_TEMP)
>                                  continue;
>
>                          uint32_t temp = inst->src[j].index;
> +
> +                        /* Since VPM reads pull from a FIFO, we only get
> to
> +                         * read each VPM entry once (unless we reset the
> read
> +                         * pointer).  That means we can't copy-propagate
> a VPM
> +                         * read to multiple locations.
> +                         */
>                          if (use_count[temp] != 1)
>                                  continue;
>
>                          struct qinst *mov = c->defs[temp];
> -
> -                        if (mov->src[0].file != QFILE_VPM)
> +                        if (!mov ||
> +                            (mov->op != QOP_MOV &&
> +                             mov->op != QOP_FMOV &&
> +                             mov->op != QOP_MMOV) ||
> +                            mov->src[0].file != QFILE_VPM) {
>                                  continue;
> +                        }
>
>                          uint32_t temps = 0;
>                          for (int k = 0; k < qir_get_op_nsrc(inst->op);
> k++) {
> @@ -109,15 +108,15 @@ qir_opt_vpm(struct vc4_compile *c)
>                          /* The instruction is safe to reorder if its other
>                           * sources are independent of previous
> instructions
>                           */
> -                        if (temps == 1 ) {
> +                        if (temps == 1) {
>                                  list_del(&inst->link);
>                                  inst->src[j] = mov->src[0];
>                                  list_replace(&mov->link, &inst->link);
>                                  c->defs[temp] = NULL;
>                                  free(mov);
> +                                progress = true;
> +                                break;
>                          }
> -
> -                        progress = true;
>                  }
>          }
>
> --
> 2.7.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

Varad, Eric,

I tested the set of patches [0], [1] plus this one for piglit regressions
on RPi2 hw.

Two regressions caused by [0] and [1] remain:
- spec@!opengl 1.5 at draw-vertices
- spec@!opengl 1.5 at draw-vertices-user

[0] https://patchwork.freedesktop.org/patch/76069/
[1] https://patchwork.freedesktop.org/patch/76070/

Regards,
Rhys
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160310/e1886f1d/attachment-0001.html>


More information about the mesa-dev mailing list