[Mesa-dev] RFC uniform packing for gallium V2

Timothy Arceri tarceri at itsqueeze.com
Sun Jun 25 08:01:21 UTC 2017


On Sun, Jun 25, 2017, at 12:18 AM, Nicolai Hähnle wrote:
> On 25.06.2017 03:31, Timothy Arceri wrote:
> > There are still a handful of piglit tests failing and I'm yet to test
> > that there are no regressions in the non-packed path, but I'd really
> > like some feedback on the approach as Dave has flagged it as a possibly
> > controversial TGSI change.
> > 
> > In order to avoid complicated swizzling and array element adjustments
> > when dealing with arrays, this series simply adjusts the constant buffer
> > index to point to the right location. There are some small changes to
> > deal with indirect indexing but these also remain very simple and easy
> > to follow.
> >
> > Dave has raised concerns that others might not like this as it doesn't
> > strictly follow the TGSI approach that everything is a vec4. I would
> > argue, however, that this is by far the simplest approach.
> > Doing this with swizzles and array adjustments would require
> > something like lower_packed_varyings.cpp, which would be unnecessarily
> > complicated IMO. I started off down that track and soon changed
> > direction.
> 
> Yeah, I don't like the approach either. All register files are vec4-based
> in TGSI, and changing that feels pretty wrong.

Sorry, can I ask if you looked at the patches? The change is fairly
limited: we just change the index so that it points to the exact buffer
location rather than something that needs to be multiplied by 4 later
on.
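
To make the difference concrete, here is a minimal sketch of the two
addressing schemes. The helper names are hypothetical and are not taken
from the patches, and it assumes the packed index is counted in 32-bit
components:

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; neither exists in Mesa. */

/* vec4-slot addressing: the index names a 4-component slot, so it has
 * to be scaled by 4 to reach a 32-bit component in the buffer. */
static size_t vec4_index_to_dword(size_t vec4_index)
{
   return vec4_index * 4;
}

/* Packed addressing (assumed unit: one 32-bit component): the index
 * already is the buffer location, so no scaling is needed. */
static size_t packed_index_to_dword(size_t packed_index)
{
   return packed_index;
}
```

With three consecutive float uniforms, the vec4 scheme would place them
at slots 0, 1 and 2 (components 0, 4 and 8), while the packed scheme
would place them at components 0, 1 and 2.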

> 
> I would suggest lowering loads from CONST[0] to LOAD instructions, in 
> the same way that is used for SSBOs. This has the additional advantage 
> that we could then use the same code paths to support std430 packing for 
> UBOs (via a GL extension, I suppose).

It's not really that simple. For example, if you have an array you could
end up having to create something like CONST[idx - 2].yz to get what you
want; you also need to handle structs etc. If we must go this way I'd
say this task will move to the very end of my TODO list :P
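
As a rough sketch of the bookkeeping involved (hypothetical names, not
code from Mesa, and assuming tightly packed 32-bit components), mapping
a packed location back onto a vec4 register plus swizzle looks
something like this:

```c
#include <stdbool.h>

/* Hypothetical description of where a packed value lands in a
 * vec4-based register file; not a Mesa or TGSI structure. */
struct vec4_ref {
   unsigned reg;        /* vec4 register index */
   unsigned first_comp; /* 0 = x, 1 = y, 2 = z, 3 = w */
   unsigned num_comps;  /* number of components in the value */
   bool straddles;      /* true if the value spills into reg + 1 */
};

/* Translate a packed component offset into a register and swizzle start. */
static struct vec4_ref packed_to_vec4(unsigned dword_offset, unsigned num_comps)
{
   struct vec4_ref r;
   r.reg = dword_offset / 4;
   r.first_comp = dword_offset % 4;
   r.num_comps = num_comps;
   r.straddles = r.first_comp + num_comps > 4;
   return r;
}
```

A vec2 at packed offset 5 comes out as register 1 with swizzle .yz,
while a vec2 at offset 3 straddles registers 0 and 1 and would need two
reads. Indirect array indexing and structs each add further cases on
top of this.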

> 
> 
> > The main goal of this series is to reduce the CPU overhead caused by
> > _mesa_propagate_uniforms_to_driver_storage(). The function is slow since we
> > need to deal with strides etc. because we are copying packed data to an
> > unpacked destination. It's also copying data that we have only just copied
> > to another duplicate uniform storage that gets created by the linker.
> 
> The duplicate copy is necessary unless we start using the same constant 
> buffer for all shaders in a program, which actually might not be such a 
> bad idea.

I think we are talking about different things. I'm not talking about the
copies in the driver, I'm talking about [1] followed by [2]. There is no
need to do both. 

[1]
https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/uniform_query.cpp#n1081
[2]
https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/uniform_query.cpp#n1107
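
For anyone unfamiliar with why the strides matter, here is a simplified
sketch of copying packed data to an unpacked destination versus a plain
packed copy. This is not the code at [1] or [2]; the function names and
signatures are hypothetical:

```c
#include <string.h>

/* Hypothetical: copy `count` elements of `comps` floats each from a
 * packed source into a destination whose elements are `dst_stride`
 * floats apart (e.g. 4 for vec4 slots). One small copy per element. */
static void copy_packed_to_strided(float *dst, const float *src,
                                   unsigned count, unsigned comps,
                                   unsigned dst_stride)
{
   for (unsigned i = 0; i < count; i++)
      memcpy(dst + i * dst_stride, src + i * comps, comps * sizeof(float));
}

/* Hypothetical: when the destination is packed too, the same data
 * moves with a single copy and no per-element stride handling. */
static void copy_packed_to_packed(float *dst, const float *src,
                                  unsigned count, unsigned comps)
{
   memcpy(dst, src, (size_t)count * comps * sizeof(float));
}
```

For a two-element vec3 array the strided version writes components
0-2 and 4-6 of the destination and leaves the padding untouched, while
the packed version writes components 0-5 in one go.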

> 
> Cheers,
> Nicolai
> 
> 
> > This series fixes both of these issues and also reduces the size of the
> > driver's const buffer as a side effect.
> > 
> > Patches 2-3 just rework the way we use the param list.
> > 
> > The remaining patches add the packing support, which is enabled by the
> > PackedDriverUniformStorage const.
> > 
> > You can get the series in my test4 branch [1].
> > 
> > [1] https://github.com/tarceri/Mesa.git
> > 
> > 
> 
> 
> -- 
> Learn how the world really is,
> but never forget how it should be.

