[Mesa-dev] [RFC] i965/dbg: Expose cases hitting a presumably dead optimization

Jason Ekstrand jason at jlekstrand.net
Sat Mar 12 16:44:54 UTC 2016


On Mar 11, 2016 11:47 PM, "Pohjolainen, Topi" <topi.pohjolainen at intel.com>
wrote:
>
> On Fri, Mar 11, 2016 at 05:59:37PM -0800, Jason Ekstrand wrote:
> >    On Fri, Mar 11, 2016 at 4:40 AM, Topi Pohjolainen
> >    <[1]topi.pohjolainen at intel.com> wrote:
> >
> >      The logic iterates over param[] which contains pointers to
> >      uniform storage set during linking (see
> >      link_assign_uniform_locations()).
> >      The pointers are unique and it should be impossible to find
> >      matching entries.
> >      I couldn't find any regressions with test system. In addition
> >      I tried several benchmarks on HSW and none hit this.
> >      I'm hoping to remove this optimization attempt. This is the only
> >      bit that depends on knowing about the actual storage during
> >      compilation. All the rest deal with just relative push and pull
> >      locations once the actual filling of pull_param[] is moved
> >      outside the compiler just as param[]. (Filling pull_param is
> >      based on the pull locations and doesn't need to be inside the
> >      compiler).
> >      Any thoughts?
> >
> >    I'm not 100% sure what you're trying to do, but I have a branch that
> >    may be of interest:
> >    [2]
https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-uniform
> >    s
> >    The branch enables support for pushing small uniform arrays.  Among
> >    other things, it redoes the way we do push constants and gets rid of
> >    some of the data tracking in the backend compiler.  The big reason
why
> >    I haven't tried too hard to get it merged is because it regresses
Sandy
> >    Bridge just a bit.  I know I've seen and fixed the bug before in an
> >    alternate attempt, but I don't remember how.
> >    I'm going to be refreshing it soon because we need indirect push
> >    constants for the Vulkan driver.  (The branch is already merged into
> >    the Vulkan branch.)
>
> I'd like to stop filling param[] before compilation. This is really not
> needed by the compiler as it deals with pull and push constant locations,
> i.e., positions in the push and pull files. Actual uniform values and
their
> location in the uniform storage are not needed until actual pipeline
upload.
>
> My plan is to move the iteration over the core uniform storage to pipeline
> upload time. We can fill push and pull buffers directly without the need
of
> storing pointers to param[] in the middle. Not only makes this things
simpler
> and more flexible in my mind, does it give us the possibility to upload
> floats with 16-bit precision instead of 32-bits. Current upload logic only
> gets pointers to 32-bit values without knowing if they should be
represented
> with 16 bits let alone whether the values are floats or integers to begin
> with.

Right. Kristian and I have talked about some related things that we need
for pipeline caching and the Vulkan driver.  In Vulkan, they aren't actual
pointers at all but are, instead, offsets into a push constant block.
Fortunately, the back-end compiler never dereferences them so you can shove
whatever you want in there and it's OK. We've talked about turning the pull
and push params into just a set of integers that means whatever the api and
state setup code want.  One of the problems with pointers is that you can't
easily put them into an on-disk shader cache (which we have for Vulkan).

When you talk about 16 or 64-bit values, what is your intention?  Are
64-bit values still going to take up two slots or are they now one 64-bit
slot?  Are there two 16-bit values per slot or just one?  Are 16-bit
uniforms converted before they get uploaded or consumed directly by the
shader?  I'm still a little confused as to what problem you're trying to
solve.

One thing to think about as you work on this is that Vulkan doesn't have
individual uniforms but instead has a block of explicit push constants.  In
the shader, the push constants are specified with explicit offsets into
that block similar to a UBO.  The result is that it's very difficult for
the state setup code to know what size anything is.  Just chopping the push
constant space into 32-bit hunks that the compiler is free to rearrange is
terribly convenient.  We could use 16 or 8-bit chunks just as easily but
having some be 32-bit, others 64-bit, and others 16-bit has the potential
to get very painful.

Food for thought.  Maybe I'm completely missing what your trying to do.
--Jason
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160312/3aee3427/attachment.html>


More information about the mesa-dev mailing list