[Mesa-dev] [PATCH] r600g: order atom emission v2

Dave Airlie airlied at gmail.com
Thu Sep 6 17:32:18 PDT 2012


On Fri, Sep 7, 2012 at 10:03 AM, Marek Olšák <maraeo at gmail.com> wrote:
> On Fri, Sep 7, 2012 at 12:05 AM, Jerome Glisse <j.glisse at gmail.com> wrote:
>> On Thu, Sep 6, 2012 at 4:10 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>> On Thu, Sep 6, 2012 at 8:34 PM, Jerome Glisse <j.glisse at gmail.com> wrote:
>>>> On Thu, Sep 6, 2012 at 2:29 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>> This looks good to me. It's funny to see the r300g architecture being
>>>>> re-implemented in r600g. :)
>>>>>
>>>>> There's one optimization that r300g has that this patch doesn't. r300g
>>>>> keeps the index of the first and the last dirty atom and the loops
>>>>> over the list of atoms look like this:
>>>>> for (i = first_dirty; i <= last_dirty; i++)
>>>>>
>>>>> And after emission:
>>>>> first_dirty = some large number;
>>>>> last_dirty = 0;
>>>>>
>>>>> The atoms should be ordered according to how frequently they are
>>>>> updated (except when the ordering is required by the hw). But most
>>>>> importantly, if there are no state changes, the loops are trivially
>>>>> skipped.
>>>>>
>>>>> Marek
>>>>
>>>> I don't think this optimization is worth it. There won't be many more
>>>> than 32 atoms in the end, and they definitely can't be ordered from most
>>>> frequent to least frequent, because some of the state needs to be
>>>> emitted last even though it changes frequently (primitive type, for
>>>> instance).
>>>
>>> I didn't say all atoms *must* be sorted. I meant that some (most?)
>>> atoms can be sorted, i.e. you can have some atoms at fixed positions
>>> (like the primitive type or the seamless cubemap state), but you
>>> always have at least *some* freedom in where you put the rest. The ordering I
>>> had in mind was actually from the least frequent to the most frequent,
>>> in other words, from the framebuffer (least frequent) to shaders to
>>> textures to constant buffers to vertex buffers (most frequent).
>>>
>>> Of course, the code should document which atoms must have fixed
>>> positions along with an explanation. The comment that all atom
>>> positions must not be changed isn't enough, because it's not true.
>>>
>>> Marek
>>
>> I won't try to find which atoms can have a completely floating position. I
>> am just grouping together registers that are always emitted together in
>> fglrx, and then I position these groups relative to each other according
>> to their position in fglrx. That means all atoms are always emitted in a
>> specific order, so there won't be any freedom. The only freedom I can
>> think of is between two position-forced atoms, and that makes the sorting
>> completely useless and complicated.
>
> I'll add the optimization anyway (without sorting). Draw operations
> without state changes or with only one state update are quite common.
>
> Anyway, it was said in the v1 thread that the hardware doesn't need
> any specific ordering for proper functioning. While it may be
> beneficial to emit one or two registers earlier than the others,
> insisting on fixed ordering of all of them is not only limiting, it
> seems useless and a waste of time as well. What I don't understand: Why
> do you blindly copy everything fglrx *seems* to be doing without any
> real reason? It does not fix any bug, it does not improve performance,
> it does not clean up the code... so why? I am all ears.

At the very least, please document a list of lockups this avoids. Less
magic, more text.

Dave.
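
For reference, the first/last dirty index optimization described in the
quoted text could look roughly like the sketch below. This is not the r300g
or r600g code: the struct layout, NUM_ATOMS, and the names context_init,
mark_dirty and emit_dirty are all hypothetical, and "emitting" an atom is
reduced to bumping a counter. It only illustrates why the loop is trivially
skipped when nothing is dirty.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NUM_ATOMS 32 /* hypothetical; the thread only says "not much more than 32" */

struct atom {
    bool dirty;
    unsigned emitted; /* stand-in for writing the atom to the command stream */
};

struct context {
    struct atom atoms[NUM_ATOMS];
    unsigned first_dirty; /* lowest dirty index, UINT_MAX when nothing is dirty */
    unsigned last_dirty;  /* highest dirty index, 0 when nothing is dirty */
};

static void context_init(struct context *ctx)
{
    for (unsigned i = 0; i < NUM_ATOMS; i++) {
        ctx->atoms[i].dirty = false;
        ctx->atoms[i].emitted = 0;
    }
    /* Empty range: first > last, so the emit loop body never runs. */
    ctx->first_dirty = UINT_MAX;
    ctx->last_dirty = 0;
}

static void mark_dirty(struct context *ctx, unsigned i)
{
    assert(i < NUM_ATOMS);
    ctx->atoms[i].dirty = true;
    if (i < ctx->first_dirty)
        ctx->first_dirty = i;
    if (i > ctx->last_dirty)
        ctx->last_dirty = i;
}

/* Emits dirty atoms in index order; returns how many slots were visited. */
static unsigned emit_dirty(struct context *ctx)
{
    unsigned visited = 0;

    for (unsigned i = ctx->first_dirty; i <= ctx->last_dirty; i++) {
        visited++;
        if (ctx->atoms[i].dirty) {
            ctx->atoms[i].emitted++;
            ctx->atoms[i].dirty = false;
        }
    }
    /* Reset to the empty range, as in the quoted "after emission" step. */
    ctx->first_dirty = UINT_MAX;
    ctx->last_dirty = 0;
    return visited;
}
```

Note that the atoms are still emitted in their fixed array order, so this
does not depend on sorting them by update frequency; sorting would only
narrow the range that gets walked.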

