[Mesa-dev] [PATCH] r600g: order atom emission v2
Jerome Glisse
j.glisse at gmail.com
Thu Sep 6 18:39:34 PDT 2012
On Thu, Sep 6, 2012 at 8:32 PM, Dave Airlie <airlied at gmail.com> wrote:
> On Fri, Sep 7, 2012 at 10:03 AM, Marek Olšák <maraeo at gmail.com> wrote:
>> On Fri, Sep 7, 2012 at 12:05 AM, Jerome Glisse <j.glisse at gmail.com> wrote:
>>> On Thu, Sep 6, 2012 at 4:10 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>> On Thu, Sep 6, 2012 at 8:34 PM, Jerome Glisse <j.glisse at gmail.com> wrote:
>>>>> On Thu, Sep 6, 2012 at 2:29 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>> This looks good to me. It's funny to see the r300g architecture being
>>>>>> re-implemented in r600g. :)
>>>>>>
>>>>>> There's one optimization that r300g has that this patch doesn't. r300g
>>>>>> keeps the index of the first and the last dirty atom and the loops
>>>>>> over the list of atoms look like this:
>>>>>> for (i = first_dirty; i <= last_dirty; i++)
>>>>>>
>>>>>> And after emission:
>>>>>> first_dirty = some large number;
>>>>>> last_dirty = 0;
>>>>>>
>>>>>> The atoms should be ordered according to how frequently they are
>>>>>> updated (except when the ordering is required by the hw). But most
>>>>>> importantly, if there are no state changes, the loops are trivially
>>>>>> skipped.
>>>>>>
>>>>>> Marek
>>>>>
>>>>> I don't think this optimization is worth it: there won't be many more
>>>>> than 32 atoms in the end, and they definitely can't be ordered from most
>>>>> frequent to least frequent, as some of them need to be emitted last
>>>>> and are frequently updated (primitive type, for instance).
>>>>
>>>> I didn't say all atoms *must* be sorted. I meant that some (most?)
>>>> atoms can be sorted, i.e. you can have some atoms at fixed positions
>>>> (like the primitive type or the seamless cubemap state), but you have
>>>> always at least *some* freedom where you put the rest. The ordering I
>>>> had in mind was actually from the least frequent to the most frequent,
>>>> in other words, from the framebuffer (least frequent) to shaders to
>>>> textures to constant buffers to vertex buffers (most frequent).
>>>>
>>>> Of course, the code should document which atoms must have fixed
>>>> positions along with an explanation. The comment that all atom
>>>> positions must not be changed isn't enough, because it's not true.
>>>>
>>>> Marek
>>>
>>> I won't try to find which atoms can have a completely floating position. I
>>> am just grouping together registers that are always emitted together in
>>> fglrx, and then I position these groups relative to each other according
>>> to their position in fglrx. That means all atoms are always emitted in a
>>> specific order, so there won't be any freedom. The only freedom I can
>>> think of is between two position-forced atoms, and that makes the sorting
>>> completely useless and complicated.
>>
>> I'll add the optimization anyway (without sorting). Draw operations
>> without state changes or with only one state update are quite common.
>>
>> Anyway, it was said in the v1 thread that the hardware doesn't need
>> any specific ordering for proper functioning. While it may be
>> beneficial to emit one or two registers earlier than the others,
>> insisting on fixed ordering of all of them is not only limiting, it
>> seems useless and a waste of time as well. What I don't understand: Why
>> do you blindly copy everything fglrx *seems* to be doing without any
>> real reason? It does not fix any bug, it does not improve performance,
>> it does not clean up the code... so why? I am all ears.
>
> At the very least, please document a list of lockups this avoids. Less
> magic more text.
>
> Dave.
I am doing all this for hyperz. So if it only fixes hyperz, that makes me
happy enough.
Cheers,
Jerome
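
For readers following the thread, the first/last dirty index scheme Marek
describes for r300g could be sketched in C roughly as below. This is only an
illustration under stated assumptions, not the actual r300g or r600g code:
the names struct atom, struct context, mark_dirty and emit_dirty are
hypothetical, NUM_ATOMS is taken from the "not much more than 32" estimate
in the thread, and emit_count stands in for writing registers to the command
stream.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed bound, based on the "not much more than 32 atoms" estimate. */
#define NUM_ATOMS 32

/* Hypothetical atom: a group of registers that is emitted together. */
struct atom {
    bool dirty;
    unsigned emit_count; /* stand-in for writing to the command stream */
};

/* Hypothetical context holding the atoms in their emission order. */
struct context {
    struct atom atoms[NUM_ATOMS];
    unsigned first_dirty; /* lowest dirty index; UINT_MAX if none */
    unsigned last_dirty;  /* highest dirty index; 0 if none */
};

static void context_init(struct context *ctx)
{
    for (unsigned i = 0; i < NUM_ATOMS; i++) {
        ctx->atoms[i].dirty = false;
        ctx->atoms[i].emit_count = 0;
    }
    /* Empty range: first > last, so the emission loop body never runs. */
    ctx->first_dirty = UINT_MAX;
    ctx->last_dirty = 0;
}

/* Mark one atom dirty and widen the dirty range to include it. */
static void mark_dirty(struct context *ctx, unsigned idx)
{
    assert(idx < NUM_ATOMS);
    ctx->atoms[idx].dirty = true;
    if (idx < ctx->first_dirty)
        ctx->first_dirty = idx;
    if (idx > ctx->last_dirty)
        ctx->last_dirty = idx;
}

/* Emit dirty atoms in index order, visiting only [first_dirty, last_dirty].
 * Returns the number of atoms the loop visited, to make the cost visible. */
static unsigned emit_dirty(struct context *ctx)
{
    unsigned visited = 0;

    for (unsigned i = ctx->first_dirty; i <= ctx->last_dirty; i++) {
        visited++;
        if (ctx->atoms[i].dirty) {
            ctx->atoms[i].emit_count++;
            ctx->atoms[i].dirty = false;
        }
    }

    /* Reset to the empty range, as in "first_dirty = some large number". */
    ctx->first_dirty = UINT_MAX;
    ctx->last_dirty = 0;
    return visited;
}
```

With no state changes the loop is skipped entirely, and with atoms 5 and 7
dirty it visits only indices 5 through 7. The relative emission order of the
atoms is unchanged, so the scheme does not depend on sorting them by update
frequency.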