[Mesa-dev] [PATCH v2] nir: Handle variables dependent on the local work group size.

Manolova, Plamena plamena.manolova at intel.com
Mon Nov 12 14:25:13 UTC 2018


Hi Jason,
Thank you so much for reviewing! In my initial series for
ARB_compute_variable_group_size
(https://patchwork.freedesktop.org/patch/228130) from which this is
extracted, I moved
lowering these variables to brw_nir_lower_cs_intrinsics and did what you're
suggesting i.e.
I used the load_local_group_size intrinsic. My reasoning was that this
might give other drivers
some flexibility on how they can handle this, but if it makes more sense to
keep this in
nir_lower_system_values I'm happy to rework my patch.

Thank you,
Pam

On Mon, Nov 12, 2018 at 5:07 AM Jason Ekstrand <jason at jlekstrand.net> wrote:

> On Sun, Nov 11, 2018 at 8:46 PM Karol Herbst <kherbst at redhat.com> wrote:
>
>>  an)
>> On Mon, Nov 12, 2018 at 12:37 AM Jason Ekstrand <jason at jlekstrand.net>
>> wrote:
>> >
>> > On November 11, 2018 16:36:16 Karol Herbst <kherbst at redhat.com> wrote:
>> >
>> > > On Sun, Nov 11, 2018 at 10:48 PM Jason Ekstrand <jason at jlekstrand.net>
>> wrote:
>> > >>
>> > >> On Sun, Nov 11, 2018 at 3:35 PM Plamena Manolova
>> > >> <plamena.n.manolova at gmail.com> wrote:
>> > >>>
>> > >>>
>> > >>> Lowering shader variables which depend on the local work group
>> > >>> size being available in nir_lower_system_values is only possible
>> > >>> if the local work group size isn't variable, otherwise this should
>> > >>> be handled by the native driver (if it supports
>> > >>> ARB_compute_variable_group_size).
>> > >>>
>> > >>>
>> > >>> Signed-off-by: Plamena Manolova <plamena.n.manolova at gmail.com>
>> > >>> ---
>> > >>> src/compiler/nir/nir_lower_system_values.c | 22
>> ++++++++++++++++++++++
>> > >>> 1 file changed, 22 insertions(+)
>> > >>>
>> > >>>
>> > >>> diff --git a/src/compiler/nir/nir_lower_system_values.c
>> > >>> b/src/compiler/nir/nir_lower_system_values.c
>> > >>> index bde7eb1180..6fdf9f9c51 100644
>> > >>> --- a/src/compiler/nir/nir_lower_system_values.c
>> > >>> +++ b/src/compiler/nir/nir_lower_system_values.c
>> > >>> @@ -77,6 +77,15 @@ convert_block(nir_block *block, nir_builder *b)
>> > >>>    *    "The value of gl_GlobalInvocationID is equal to
>> > >>>    *    gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID"
>> > >>>    */
>> > >>> +
>> > >>> +         /*
>> > >>> +          * If the local work group size is variable we can't
>> lower the global
>> > >>> +          * invocation id here.
>> > >>
>> > >>
>> > >> Why not?
>> > >>
>> > >>>
>> > >>>
>> > >>> +          */
>> > >>> +         if (b->shader->info.cs.local_size_variable) {
>> > >>
>> > >>
>> > >> I didn't realize we'd added a bit for this.  At one point in time,
>> Karol
>> > >> had patches which had it keyed off of the size being zero.  Having a
>> > >> separate bit is probably fine, it just surpised me.
>> > >
>> > >
>> > > yeah.. I guess I choose that, because I had nothing better. But I
>> > > guess having the size being 0 is good enough as long as we sure it is
>> > > 0 in relevant cases.
>> >
>> > It's fine. I just thought we'd chosen a different convention bit there's
>> > nothing wrong with it. I was just surprised. That's all.
>> >
>> > >
>> > >>>
>> > >>>
>> > >>> +            break;
>> > >>> +         }
>> > >>> +
>> > >>>   nir_ssa_def *group_size = build_local_group_size(b);
>> > >>>   nir_ssa_def *group_id = nir_load_work_group_id(b);
>> > >>>   nir_ssa_def *local_id = nir_load_local_invocation_id(b);
>> > >>> @@ -115,6 +124,11 @@ convert_block(nir_block *block, nir_builder *b)
>> > >>> }
>> > >>>
>> > >>>
>> > >>> case SYSTEM_VALUE_LOCAL_GROUP_SIZE: {
>> > >>> +         /* If the local work group size is variable we can't
>> lower it here */
>> > >>> +         if (b->shader->info.cs.local_size_variable) {
>> > >>> +            break;
>> > >>> +         }
>> > >>> +
>> > >>>   sysval = build_local_group_size(b);
>> > >>>   break;
>> > >>> }
>> > >>> @@ -189,6 +203,14 @@ convert_block(nir_block *block, nir_builder *b)
>> > >>>   break;
>> > >>>
>> > >>>
>> > >>> case SYSTEM_VALUE_GLOBAL_GROUP_SIZE: {
>> > >>> +         /*
>> > >>> +          * If the local work group size is variable we can't
>> lower the global
>> > >>> +          * group size here.
>> > >>> +          */
>> > >>> +         if (b->shader->info.cs.local_size_variable) {
>> > >>> +            break;
>> > >>> +         }
>> > >>
>> > >>
>> > >> Why can't we lower the global size?  It seems like we would want the
>> below
>> > >> multiplication regardless of whether the local size is constant.
>> > >
>> > > well I am not sure about ARB_compute_variable_group_size, but at least
>> > > in CL you know nothing about it at compile time as you specify
>> > > everything when you enqueue the kernel. Could be that the number of
>> > > work_groups is fixed with ARB_compute_variable_group_size though
>> >
>> > Why can't you multiply two non-constant things together? Maybe the
>> driver
>> > can do something now efficient but it's not clear to me that this isn't
>> > what we want.
>> >
>>
>> oh, you meant it that way. Mhh, I actually wrote the patch adding
>> build_local_group_size to handle that inside that function (lower to
>> constants if known, intrinsics otherwise) as we need the local_size
>> for three different system values. I have a patch for that for CL, but
>> the patch itself is a bit hacky right now (as in, it doesn't handle it
>> inside build_local_group_size).
>>
>> And I think that patch misses handling for
>> SYSTEM_VALUE_GLOBAL_GROUP_SIZE anyhow.
>>
>> For CL we even want to add a global_invocation_id_offset system value
>> and add it when handling SYSTEM_VALUE_GLOBAL_INVOCATION_ID, but that
>> we can ignore for now.
>>
>> Anyhow, I think it would make sense to move the check into
>> build_local_group_sized and either return a constant value or an
>> intrinsic loading the value from the driver. I just remember vaguely
>> that I think somebody brought that up months ago and there was a good
>> reason not to do that for every driver?
>>
>
> That's what I was thinking.  We don't want to lower load_local_group_size
> but then we want build_local_group_size to either build a constant or a
> load_local_group_size intrinsic as you said above.
>
> --Jason
>
>
>> > >
>> > >>>
>> > >>>
>> > >>> +
>> > >>>   nir_ssa_def *group_size = build_local_group_size(b);
>> > >>>   nir_ssa_def *num_work_groups = nir_load_num_work_groups(b);
>> > >>>   sysval = nir_imul(b, group_size, num_work_groups);
>> > >>> --
>> > >>> 2.11.0
>> > >>>
>> > >>>
>> > >>> _______________________________________________
>> > >>> mesa-dev mailing list
>> > >>> mesa-dev at lists.freedesktop.org
>> > >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> >
>> >
>> >
>>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20181112/b66fbefc/attachment.html>


More information about the mesa-dev mailing list