<p dir="ltr"><br>
On Mar 11, 2016 11:47 PM, "Pohjolainen, Topi" <<a href="mailto:topi.pohjolainen@intel.com">topi.pohjolainen@intel.com</a>> wrote:<br>
><br>
> On Fri, Mar 11, 2016 at 05:59:37PM -0800, Jason Ekstrand wrote:<br>
> > On Fri, Mar 11, 2016 at 4:40 AM, Topi Pohjolainen<br>
> > <[1]<a href="mailto:topi.pohjolainen@intel.com">topi.pohjolainen@intel.com</a>> wrote:<br>
> ><br>
> > The logic iterates over param[] which contains pointers to<br>
> > uniform storage set during linking (see<br>
> > link_assign_uniform_locations()).<br>
> > The pointers are unique and it should be impossible to find<br>
> > matching entries.<br>
> > I couldn't find any regressions with test system. In addition<br>
> > I tried several benchmarks on HSW and none hit this.<br>
> > I'm hoping to remove this optimization attempt. This is the only<br>
> > bit that depends on knowing about the actual storage during<br>
> > compilation. All the rest deal with just relative push and pull<br>
> > locations once the actual filling of pull_param[] is moved<br>
> > outside the compiler just as param[]. (Filling pull_param is<br>
> > based on the pull locations and doesn't need to be inside the<br>
> > compiler).<br>
> > Any thoughts?<br>
> ><br>
> > I'm not 100% sure what you're trying to do, but I have a branch that<br>
> > may be of interest:<br>
> > [2]<a href="https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-uniform">https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-uniform</a><br>
> > s<br>
> > The branch enables support for pushing small uniform arrays. Among<br>
> > other things, it redoes the way we do push constants and gets rid of<br>
> > some of the data tracking in the backend compiler. The big reason why<br>
> > I haven't tried too hard to get it merged is because it regresses Sandy<br>
> > Bridge just a bit. I know I've seen and fixed the bug before in an<br>
> > alternate attempt, but I don't remember how.<br>
> > I'm going to be refreshing it soon because we need indirect push<br>
> > constants for the Vulkan driver. (The branch is already merged into<br>
> > the Vulkan branch.)<br>
><br>
> I'd like to stop filling param[] before compilation. This is really not<br>
> needed by the compiler as it deals with pull and push constant locations,<br>
> i.e., positions in the push and pull files. Actual uniform values and their<br>
> location in the uniform storage are not needed until actual pipeline upload.<br>
><br>
> My plan is to move the iteration over the core uniform storage to pipeline<br>
> upload time. We can fill push and pull buffers directly without the need of<br>
> storing pointers to param[] in the middle. Not only makes this things simpler<br>
> and more flexible in my mind, does it give us the possibility to upload<br>
> floats with 16-bit precision instead of 32-bits. Current upload logic only<br>
> gets pointers to 32-bit values without knowing if they should be represented<br>
> with 16 bits let alone whether the values are floats or integers to begin<br>
> with.</p>
<p dir="ltr">Right. Kristian and I have talked about some related things that we need for pipeline caching and the Vulkan driver. In Vulkan, they aren't actual pointers at all but are, instead, offsets into a push constant block. Fortunately, the back-end compiler never dereferences them so you can shove whatever you want in there and it's OK. We've talked about turning the pull and push params into just a set of integers that means whatever the api and state setup code want. One of the problems with pointers is that you can't easily put them into an on-disk shader cache (which we have for Vulkan).</p>
<p dir="ltr">When you talk about 16 or 64-bit values, what is your intention? Are 64-bit values still going to take up two slots or are they now one 64-bit slot? Are there two 16-bit values per slot or just one? Are 16-bit uniforms converted before they get uploaded or consumed directly by the shader? I'm still a little confused as to what problem you're trying to solve.</p>
<p dir="ltr">One thing to think about as you work on this is that Vulkan doesn't have individual uniforms but instead has a block of explicit push constants. In the shader, the push constants are specified with explicit offsets into that block similar to a UBO. The result is that it's very difficult for the state setup code to know what size anything is. Just chopping the push constant space into 32-bit hunks that the compiler is free to rearrange is terribly convenient. We could use 16 or 8-bit chunks just as easily but having some be 32-bit, others 64-bit, and others 16-bit has the potential to get very painful.</p>
<p dir="ltr">Food for thought. Maybe I'm completely missing what your trying to do.<br>
--Jason</p>